Find all indices of a substring within a string

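I want to be able to find the index of every occurrence of a substring in a larger string using Ruby. For example: all occurrences of "in" in "Einstein".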

    str = "Einstein"
    str.index("in") # returns only 1
    str.scan("in")  # returns ["in", "in"]
    # desired output would be [1, 6]

The standard hack is:

    "Einstein".enum_for(:scan, /(?=in)/).map { Regexp.last_match.offset(0).first } #=> [1, 6]

    def indices_of_matches(str, target)
      sz = target.size
      (0..str.size-sz).select { |i| str[i,sz] == target }
    end

    indices_of_matches('Einstein', 'in') #=> [1, 6]
    indices_of_matches('nnnn', 'nn')     #=> [0, 1, 2]

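The second example reflects my assumption about how overlapping matches should be handled. If overlapping matches are not to be counted (i.e., [0, 2] is the desired return value in the second example), this answer is obviously not suitable.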
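To make the overlap question concrete, here is a minimal sketch that supports both readings. The method name `substring_indices` and the `overlap:` keyword are hypothetical names chosen for illustration and do not appear in the answers; the sketch only relies on `String#index` with a start offset.

```ruby
# Hypothetical helper (not from the answers above): walks the string with
# String#index and collects the starting offset of every occurrence.
def substring_indices(str, target, overlap: true)
  return [] if target.empty? # an empty needle would never advance past the end

  # Advance by one character to allow overlaps, or by the whole match otherwise.
  step = overlap ? 1 : target.size
  result = []
  pos = 0
  while (i = str.index(target, pos))
    result << i
    pos = i + step
  end
  result
end

substring_indices('Einstein', 'in')             #=> [1, 6]
substring_indices('nnnn', 'nn')                 #=> [0, 1, 2]
substring_indices('nnnn', 'nn', overlap: false) #=> [0, 2]
```

Stepping by one character reproduces the overlapping result, while stepping by the length of the target skips past each match.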

Here is a more verbose solution, which has the advantage of not relying on a global value:

    def indices(string, regex)
      position = 0
      Enumerator.new do |yielder|
        while match = regex.match(string, position)
          yielder << match.begin(0)
          position = match.end(0)
        end
      end
    end

    p indices("Einstein", /in/).to_a # [1, 6]

It returns an Enumerator, so you can also use it lazily or take just the first n indices.
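As a rough illustration of that point, the sketch below repeats the `indices` method so that it runs on its own, then takes only the first few indices and chains a lazy filter. The sample string and the even-offset filter are made up for the example.

```ruby
# Same method as in the answer, repeated here so the snippet is self-contained.
def indices(string, regex)
  position = 0
  Enumerator.new do |yielder|
    while (match = regex.match(string, position))
      yielder << match.begin(0)
      position = match.end(0)
    end
  end
end

text = "in in in in" # "in" starts at offsets 0, 3, 6 and 9

# Enumerator#first stops pulling values once it has two of them.
indices(text, /in/).first(2)                      #=> [0, 3]

# Lazy chaining: filter the indices without building the full array first.
indices(text, /in/).lazy.select(&:even?).first(2) #=> [0, 6]
```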

Also, if you might need more information than just the indices, you can return an Enumerator of MatchData objects and extract the indices from them:

    def matches(string, regex)
      position = 0
      Enumerator.new do |yielder|
        while match = regex.match(string, position)
          yielder << match
          position = match.end(0)
        end
      end
    end

    p matches("Einstein", /in/).map{ |match| match.begin(0) } # [1, 6]

To get the behavior described by @Cary (overlapping matches), you can replace the last line in the block with position = match.begin(0) + 1.
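Applied to the `matches` method, that change might look like the following sketch. The name `overlapping_matches` is a hypothetical label used here to keep it distinct from the original method.

```ruby
# Variant of `matches` that also reports overlapping occurrences.
# `overlapping_matches` is an illustrative name, not from the answer.
def overlapping_matches(string, regex)
  position = 0
  Enumerator.new do |yielder|
    while (match = regex.match(string, position))
      yielder << match
      # Resume one character after the start of the match, not after its end,
      # so that a later match may overlap the current one.
      position = match.begin(0) + 1
    end
  end
end

overlapping_matches("nnnn", /nn/).map { |m| m.begin(0) }     #=> [0, 1, 2]
overlapping_matches("Einstein", /in/).map { |m| m.begin(0) } #=> [1, 6]
```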
