Match position in gsub or scan

When using gsub or scan, what is the best way to get the match position (the index that =~ returns) for each match?

 "hello".gsub(/./) { Regexp.last_match.offset(0).first }  # => "01234"

See Regexp.last_match and MatchData.
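The same trick carries over to scan. The following is a minimal sketch, assuming you want the positions collected in an array rather than substituted into the string; the variable names (text, starts, offsets) are only illustrative.

```ruby
text = "hello"

# Inside a scan block, Regexp.last_match refers to the current match.
starts = []
text.scan(/l/) { starts << Regexp.last_match.begin(0) }

# The same idea without a side-effecting block: enumerate scan's matches
# and map each one to its [begin, end] offsets.
offsets = text.to_enum(:scan, /l/).map { Regexp.last_match.offset(0) }

p starts   # => [2, 3]
p offsets  # => [[2, 3], [3, 4]]
```

Regexp.last_match.offset(0) returns [begin, end] for the whole match, so the length is simply the difference of the two.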

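I came at this problem from a different direction and could not come up with a decent (that is, understandable and maintainable) solution using gsub or scan, both of which are built-in methods of the String class. So I asked "why do it that way at all?" and looked for a more natural alternative (thanks to Nash for pointing the way!):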

    #!/usr/bin/env ruby
    # -*- encoding: utf-8 -*-

    # capture_all_matches.rb
    #
    # Copyright © 2015 Lorin Ricker
    #
    # This program is free software, under the terms and conditions of the
    # GNU General Public License published by the Free Software Foundation.
    # See the file 'gpl' distributed within this project directory tree.

    # This sample code demonstrates three ways to capture *all* of the offsets
    # [begin,end,length] data for *all* matches scanned in a source string.
    # Of course, each/any of the below examples could be turned into a class method
    # of String &/or Regexp -- One wonders why these are not part of the built-in
    # classes/methods?...

    # An example source string (any will do):
    s = "The fox hides in the box full of sox eating lox."
    #       4^                  25^   31^

    # Use the literal pattern /f/ as an example --
    # there are three "f"s in the sample source string;
    # see indexes above...
    # (named re rather than p, so that it does not shadow Kernel#p)
    re = /f/

    # 1. Just report an array of the begin (start) position of each match:
    mpos = []
    m = i = 0
    m = re.match( s, i ) { |k| j = k.begin(0); i = j + 1; mpos << j } while m
    p mpos   # => [4, 25, 31]

    # 2. Make an array containing elements [begin,end] of matched substrings:
    mpos = []
    m = i = 0
    m = re.match( s, i ) { |k| j = k.offset(0); i = j[0] + 1; mpos << j } while m
    p mpos   # => [[4, 5], [25, 26], [31, 32]]

    # 3. Make an array containing elements [begin,end,length] of matched substrings:
    mpos = []
    m = i = 0
    m = re.match( s, i ) { |k| j = k.offset(0); i = j[0] + 1; j << j[1] - j[0]; mpos << j } while m
    p mpos   # => [[4, 5, 1], [25, 26, 1], [31, 32, 1]]

To try it, paste the code above into a Ruby source file (for example, capture_all_matches.rb) and run:

 $ ruby capture_all_matches.rb 

Note that Regexp#match can (re)start from any offset in the source string, so you only need to capture the offset of the last match and iterate from there.
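That restart-from-an-offset loop can be wrapped in a small helper. This is only a sketch: all_matches is a hypothetical name, not a method of Ruby's core library, and it assumes you want the search to resume one character past the start of the previous match.

```ruby
# Hypothetical helper (not part of Ruby's core library): returns the MatchData
# for every match of pattern in str, restarting the search one character past
# the start of the previous match.
def all_matches(str, pattern)
  matches = []
  pos = 0
  while pos <= str.length && (m = pattern.match(str, pos))
    matches << m
    pos = m.begin(0) + 1
  end
  matches
end

s = "The fox hides in the box full of sox eating lox."
found = all_matches(s, /f/)

p found.map { |m| m.begin(0) }           # => [4, 25, 31]
p found.map { |m| m.offset(0) }          # => [[4, 5], [25, 26], [31, 32]]
p found.map { |m| m.end(0) - m.begin(0) } # => [1, 1, 1]
```

Because the search resumes at begin + 1 rather than at the end of the match, this approach also reports overlapping matches: all_matches("aaa", /aa/) finds matches at 0 and 1, whereas scan would report only the first.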

Do you need only the start offset of each match, or [begin, end], or [begin, end, length]? Roll your own result array accordingly.

Hope this helps.