How to get possibly overlapping matches in a string

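I'm looking for a way, in either Ruby or JavaScript, to get all matches of a regular expression in a string, including overlapping ones.

Say I have str = "abcadc" and I want to find every a followed by any number of characters, followed by c. The result I'm looking for is ["abc", "adc", "abcadc"]. Any ideas on how to achieve this?

str.scan(/a.*c/) gives me ["abcadc"], and str.scan(/(?=(a.*c))/).flatten gives me ["abcadc", "adc"].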

    def matching_substrings(string, regex)
      string.size.times.each_with_object([]) do |start_index, matches|
        # Try every substring that starts at start_index.
        start_index.upto(string.size.pred) do |end_index|
          substring = string[start_index..end_index]
          # Keep the substring only if the whole of it matches the regex.
          matches.push(substring) if substring =~ /^#{regex}$/
        end
      end
    end

    matching_substrings('abcadc', /a.*c/) # => ["abc", "abcadc", "adc"]

    matching_substrings('foobarfoo', /(\w+).*\1/)
    # => ["foobarf",
    #     "foobarfo",
    #     "foobarfoo",
    #     "oo",
    #     "oobarfo",
    #     "oobarfoo",
    #     "obarfo",
    #     "obarfoo",
    #     "oo"]

    matching_substrings('why is this downvoted?', /why.*/)
    # => ["why",
    #     "why ",
    #     "why i",
    #     "why is",
    #     "why is ",
    #     "why is t",
    #     "why is th",
    #     "why is thi",
    #     "why is this",
    #     "why is this ",
    #     "why is this d",
    #     "why is this do",
    #     "why is this dow",
    #     "why is this down",
    #     "why is this downv",
    #     "why is this downvo",
    #     "why is this downvot",
    #     "why is this downvote",
    #     "why is this downvoted",
    #     "why is this downvoted?"]

In Ruby, you can get the expected result with:

    str = "abcadc"
    [/(a[^c]*c)/, /(a.*c)/].flat_map{ |pattern| str.scan(pattern) }.reduce(:+)
    # => ["abc", "adc", "abcadc"]

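Whether this approach suits you depends heavily on what you are really trying to achieve.

I tried to put this into a single expression, but I couldn't get it to work. I would really like to know whether there is some theoretical reason this cannot be done with a single regular expression, or whether I just don't know Ruby's regex engine, Oniguruma, well enough to do it.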

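You want all possible matches, including overlapping ones. As you noted, the lookahead trick from "How to find overlapping matches with a regexp?" does not work for your case.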
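To make the limitation concrete, here is a small sketch (the helper name edge_matches is made up for illustration). A lookahead inside scan yields at most one match per starting position: the longest one with a greedy quantifier, the shortest one with a lazy quantifier. Combining both happens to cover "abcadc", but anything strictly between the shortest and longest match at the same start is still missed.

```ruby
# Hypothetical helper: collects the longest and the shortest match
# at each starting position by running a greedy and a lazy lookahead.
def edge_matches(str, greedy, lazy)
  longest  = str.scan(greedy).flatten # one capture per start position
  shortest = str.scan(lazy).flatten
  longest | shortest                  # union, duplicates removed
end

greedy = /(?=(a.*c))/
lazy   = /(?=(a.*?c))/

p edge_matches("abcadc", greedy, lazy)
# => ["abcadc", "adc", "abc"]

# "abcac" also matches a.*c here, but no lookahead variant reports it.
p edge_matches("abcacac", greedy, lazy).include?("abcac")
# => false
```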

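The only thing I can think of that works in the general case is to generate every possible substring of the string and check each one against an anchored version of the regex. It is brute force, but it works.

Ruby: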

    def all_matches(str, regex)
      (n = str.length).times.reduce([]) do |subs, i|
        # All substrings starting at i, with lengths 0 through n - i.
        subs += [*i..n].map { |j| str[i, j - i] }
      end.uniq.grep(/^#{regex}$/)
    end

    all_matches("abcadc", /a.*c/) #=> ["abc", "abcadc", "adc"]
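One caveat worth spelling out: in Ruby, ^ and $ match at line boundaries, not only at the start and end of the string, so the anchoring above can let extra substrings through when the input contains newlines. A minimal sketch of a stricter variant, assuming whole-string anchoring with \A and \z is what you want (the name all_matches_strict is made up for illustration):

```ruby
# Hypothetical variant of the brute-force approach that anchors with
# \A and \z so a substring must match the regex in its entirety.
def all_matches_strict(str, regex)
  # Interpolating a Regexp embeds it as a group with its own flags.
  anchored = /\A#{regex}\z/
  (0...str.length).flat_map do |i|
    # Every non-empty substring that starts at index i.
    (i + 1..str.length).map { |j| str[i...j] }
  end.select { |sub| anchored.match?(sub) }
end

p all_matches_strict("abcadc", /a.*c/) # => ["abc", "abcadc", "adc"]
p all_matches_strict("a\nb", /a/)      # => ["a"]
```

With ^ and $ instead, the second call would also accept "a\n" and "a\nb", because the first line of each is "a".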

JavaScript:

    function allMatches(str, regex) {
      var i, j, len = str.length, subs={};
      var anchored = new RegExp('^' + regex.source + '$');
      for (i=0; i



In JS:


    function doit(r, s) {
      var res = [], cur;
      // Anchor the pattern and drop the g/y flags, so that lastIndex
      // state cannot skew repeated test() calls.
      r = new RegExp('^(?:' + r.source + ')$', r.flags.replace(/[gy]/g, ''));
      // Test every substring s.substring(q, w).
      for (var q = 0; q < s.length; ++q)
        for (var w = q; w <= s.length; ++w)
          if (r.test(cur = s.substring(q, w))) res.push(cur);
      return res;
    }

    console.log(JSON.stringify(doit(/a.*c/g, 'abcadc'), null, 4));




    ▶ str = "abcadc"
    ▶ from = str.split(/(?=\p{L})/).map.with_index { |c, i| i if c == 'a' }.compact
    ▶ to = str.split(/(?=\p{L})/).map.with_index { |c, i| i if c == 'c' }.compact
    ▶ from.product(to).select { |f,t| f < t }.map { |f,t| str[f..t] }
    #⇒ [
    #  [0] "abc",
    #  [1] "abcadc",
    #  [2] "adc"
    # ]

I'm sure there is a fancy way to find all indices of a character in a string, but I couldn't find it :( Any ideas?

Splitting on the "unicode char boundary" makes this work with strings like 'ábĉ' or 'Üve Østergaard'.

For a more general solution that accepts any "from" and "to" sequences, one small modification is needed: find all indices of "from" and "to" in the string.
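A minimal sketch of that modification, assuming "from" and "to" are plain substrings rather than patterns. The helper names indices_of and between are made up for illustration; the index lookup relies on String#index with a start offset.

```ruby
# Hypothetical helper: every index at which sub occurs in str,
# including overlapping occurrences.
def indices_of(str, sub)
  result = []
  pos = str.index(sub)
  while pos
    result << pos
    pos = str.index(sub, pos + 1) # resume one character further on
  end
  result
end

# Hypothetical helper: every substring that starts with from and
# ends with a later, non-overlapping occurrence of to.
def between(str, from, to)
  indices_of(str, from).product(indices_of(str, to))
    .select { |f, t| f + from.length <= t }
    .map { |f, t| str[f...(t + to.length)] }
end

p between("abcadc", "a", "c") # => ["abc", "abcadc", "adc"]
p between("ab--cd", "ab", "cd") # => ["ab--cd"]
```

String#index works on character offsets, so this should also behave with strings like 'ábĉ'.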



Here is an approach similar to @ndn's and @Mark's that works for any string and regex. I've implemented it as a String method because that is where I would expect to find it. Wouldn't it be a nice complement to String#[] and String#scan?

    class String
      def all_matches(regex)
        return [] if empty?
        r = /^#{regex}$/
        # Walk the substring lengths from 1 up to size; each_cons(i)
        # yields every run of i consecutive characters.
        1.upto(size).with_object([]) { |i,a|
          a.concat(each_char.each_cons(i).map(&:join).select { |s| s =~ r }) }
      end
    end

    'abcadc'.all_matches /a.*c/
      # => ["abc", "adc", "abcadc"]
    'aaabaaa'.all_matches(/a.*a/)
      #=> ["aa", "aa", "aa", "aa", "aaa", "aba", "aaa", "aaba", "abaa", "aaaba",
      #    "aabaa", "abaaa", "aaabaa", "aabaaa", "aaabaaa"]



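The RegExp /(a.c)|(a.*c)/g combines two alternatives: a.c matches "a" followed by any single character followed by "c"; a.*c matches "a" followed by any number of characters followed by "c". Note that the (a.*c) part could probably be improved. The if condition checks whether the last character of the input string is "c"; if so, the full input string is pushed onto the res result array.

    var str = "abcadc",
        res = str.match(/(a.c)|(a.*c)/g);
    if (str[str.length - 1] === "c") res.push(str);
    document.body.textContent = res.join(" ")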


