Globbing with braces on Ruby 1.9.3

Recent versions of Ruby support braces in globbing if you pass the File::FNM_EXTGLOB option.

From the 2.2.0 documentation:

    File.fnmatch('c{at,ub}s', 'cats', File::FNM_EXTGLOB) #=> true  # { } is supported on FNM_EXTGLOB

However, the 1.9.3 documentation says braces are not supported in 1.9.3:

    File.fnmatch('c{at,ub}s', 'cats') #=> false  # { } isn't supported

(Also, trying to use File::FNM_EXTGLOB raises a NameError.)

Is there any way to glob with braces in Ruby 1.9.3, for example with a third-party gem?

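The strings I want to match come from S3, not from a local file system, so as far as I know I can't just ask the operating system to do the globbing.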

I'm in the process of packaging up a Ruby Backport to support braces. Here are the essential parts of that solution:

    module File::Constants
      FNM_EXTGLOB = 0x10
    end

    class << File
      def fnmatch_with_braces_glob(pattern, path, flags = 0)
        regex = glob_convert(pattern, flags)

        return regex && path.match(regex).to_s == path
      end

      def fnmatch_with_braces_glob?(pattern, path, flags = 0)
        return fnmatch_with_braces_glob(pattern, path, flags)
      end

      private

      def glob_convert(pattern, flags)
        brace_exp = (flags & File::FNM_EXTGLOB) != 0
        pathnames = (flags & File::FNM_PATHNAME) != 0
        dot_match = (flags & File::FNM_DOTMATCH) != 0
        no_escape = (flags & File::FNM_NOESCAPE) != 0
        casefold = (flags & File::FNM_CASEFOLD) != 0
        syscase = (flags & File::FNM_SYSCASE) != 0
        special_chars = ".*?\\[\\]{},.+()|$^\\\\" + (pathnames ? "/" : "")
        special_chars_regex = Regexp.new("[#{special_chars}]")

        if pattern.length == 0 || !pattern.index(special_chars_regex)
          return Regexp.new(pattern, casefold || syscase ? Regexp::IGNORECASE : 0)
        end

        # Convert glob to regexp and escape regexp characters
        length = pattern.length
        start = 0
        brace_depth = 0
        new_pattern = ""
        char = "/"

        loop do
          path_start = !dot_match && char[-1] == "/"

          index = pattern.index(special_chars_regex, start)

          if index
            new_pattern += pattern[start...index] if index > start
            char = pattern[index]

            snippet = case char
            when "?" then path_start ? (pathnames ? "[^./]" : "[^.]") : (pathnames ? "[^/]" : ".")
            when "." then "\\."
            when "{" then (brace_exp && (brace_depth += 1) >= 1) ? "(?:" : "{"
            when "}" then (brace_exp && (brace_depth -= 1) >= 0) ? ")" : "}"
            when "," then (brace_exp && brace_depth >= 0) ? "|" : ","
            when "/" then "/"
            when "\\"
              if !no_escape && index < length
                next_char = pattern[index += 1]
                special_chars.include?(next_char) ? "\\#{next_char}" : next_char
              else
                "\\\\"
              end
            when "*"
              if index+1 < length && pattern[index+1] == "*"
                char += "*"
                if pathnames && index+2 < length && pattern[index+2] == "/"
                  char += "/"
                  index += 2
                  "(?:(?:#{path_start ? '[^.]' : ''}[^\/]*?\\#{File::SEPARATOR})(?:#{!dot_match ? '[^.]' : ''}[^\/]*?\\#{File::SEPARATOR})*?)?"
                else
                  index += 1
                  "(?:#{path_start ? '[^.]' : ''}(?:[^\\#{File::SEPARATOR}]*?\\#{File::SEPARATOR}?)*?)?"
                end
              else
                path_start ? (pathnames ? "(?:[^./][^/]*?)?" : "(?:[^.].*?)?") : (pathnames ? "[^/]*?" : ".*?")
              end
            when "["
              # Handle character set inclusion / exclusion
              start_index = index
              end_index = pattern.index(']', start_index+1)
              while end_index && pattern[end_index-1] == "\\"
                end_index = pattern.index(']', end_index+1)
              end
              if end_index
                index = end_index
                char_set = pattern[start_index..end_index]
                char_set.delete!('/') if pathnames
                char_set[1] = '^' if char_set[1] == '!'
                (char_set == "[]" || char_set == "[^]") ? "" : char_set
              else
                "\\["
              end
            else
              "\\#{char}"
            end

            new_pattern += snippet
          else
            if start < length
              snippet = pattern[start..-1]
              new_pattern += snippet
            end
          end

          break if !index
          start = index + 1
        end

        begin
          return Regexp.new("\\A#{new_pattern}\\z", casefold || syscase ? Regexp::IGNORECASE : 0)
        rescue
          return nil
        end
      end
    end

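This solution takes into account the various flags available to the File::fnmatch function, and builds a Regexp from the glob pattern that reproduces their behavior. With this solution, these tests run successfully: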
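To make the glob-to-Regexp idea easier to follow, here is a deliberately stripped-down sketch. It is not the backport code: the method name `simple_glob_to_regexp` is made up for illustration, it ignores every flag, and it handles only `*`, `?` and brace groups (no character sets, escapes or dotfile rules). It also assumes the braces in the pattern are balanced.

```ruby
# Illustrative only: a tiny glob-to-Regexp converter, assuming balanced braces.
# simple_glob_to_regexp is a hypothetical name, not part of Ruby or Backports.
def simple_glob_to_regexp(glob)
  depth = 0
  body = glob.each_char.map do |ch|
    case ch
    when '*' then '.*'                # any run of characters
    when '?' then '.'                 # exactly one character
    when '{'
      depth += 1
      '(?:'                           # open a non-capturing alternation
    when '}'
      if depth > 0
        depth -= 1
        ')'                           # close the alternation
      else
        '\}'                          # stray brace: match it literally
      end
    when ',' then depth > 0 ? '|' : ','
    else Regexp.escape(ch)            # everything else is literal
    end
  end.join
  Regexp.new("\\A#{body}\\z")         # anchor so the whole string must match
end

regex = simple_glob_to_regexp('c{at,ub}s')
puts regex.source                     # \Ac(?:at|ub)s\z
puts !!(regex =~ 'cats')              # true
puts !!(regex =~ 'cows')              # false
```

The full version above follows the same shape, but additionally tracks whether it is at the start of a path segment so it can apply the FNM_PATHNAME and FNM_DOTMATCH rules.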

    File.fnmatch('c{at,ub}s', 'cats', File::FNM_EXTGLOB) #=> true
    File.fnmatch('file{*.doc,*.pdf}', 'filename.doc') #=> false
    File.fnmatch('file{*.doc,*.pdf}', 'filename.doc', File::FNM_EXTGLOB) #=> true
    File.fnmatch('f*l?{[a-z].doc,[0-9].pdf}', 'filex.doc', File::FNM_EXTGLOB) #=> true
    File.fnmatch('**/.{pro,}f?l*', 'home/.profile', File::FNM_EXTGLOB | File::FNM_DOTMATCH) #=> true

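fnmatch_with_braces_glob (and the ? variant) will be patched in place of fnmatch, so that code written for Ruby 2.0.0 also works on earlier Ruby versions. For clarity, the code shown above leaves out some performance improvements, argument checking, and the Backports feature-detection and patch-in code; these will of course be included in the actual submission to the project.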

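I'm still testing some edge cases and heavily optimizing performance; it should be ready to submit very soon. Once it's available in an official Backports release, I'll update the status here.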

Note that Dir::glob support will arrive at the same time.

That was a fun Ruby exercise! I don't know whether this solution is robust enough for you, but here goes:

    class File
      class << self
        def fnmatch_extglob(pattern, path, flags=0)
          explode_extglob(pattern).any?{|exploded_pattern|
            fnmatch(exploded_pattern,path,flags)
          }
        end

        def explode_extglob(pattern)
          if match=pattern.match(/\{([^{}]+)}/) then
            subpatterns = match[1].split(',',-1)
            subpatterns.map{|subpattern| explode_extglob(match.pre_match+subpattern+match.post_match)}.flatten
          else
            [pattern]
          end
        end
      end
    end

It needs better testing, but it seems to work for simple cases:

    [2] pry(main)> File.explode_extglob('c{at,ub}s')
    => ["cats", "cubs"]
    [3] pry(main)> File.explode_extglob('c{at,ub}{s,}')
    => ["cats", "cat", "cubs", "cub"]
    [4] pry(main)> File.explode_extglob('{a,b,c}{d,e,f}{g,h,i}')
    => ["adg", "adh", "adi", "aeg", "aeh", "aei", "afg", "afh", "afi",
     "bdg", "bdh", "bdi", "beg", "beh", "bei", "bfg", "bfh", "bfi",
     "cdg", "cdh", "cdi", "ceg", "ceh", "cei", "cfg", "cfh", "cfi"]
    [5] pry(main)> File.explode_extglob('{a,b}c*')
    => ["ac*", "bc*"]
    [6] pry(main)> File.fnmatch('c{at,ub}s', 'cats')
    => false
    [7] pry(main)> File.fnmatch_extglob('c{at,ub}s', 'cats')
    => true
    [8] pry(main)> File.fnmatch_extglob('c{at,ub}s*', 'catsssss')
    => true

Tested with Ruby 1.9.3, 2.1.5 and 2.2.1.
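To connect this back to the question, here is a sketch of the same expand-then-match approach applied to a list of key strings held in memory, with no file system involved. The module name `BraceGlob`, its methods, and the key names are all made up for illustration; it avoids reopening File and does not need FNM_EXTGLOB, so it should behave the same on 1.9.3 and later versions.

```ruby
# Hypothetical helper; BraceGlob, expand and match? are illustrative names,
# not part of Ruby, Backports or any gem.
module BraceGlob
  # Expand one innermost {a,b,...} group, then recurse on each result
  # until no brace groups remain.
  def self.expand(pattern)
    m = pattern.match(/\{([^{}]+)\}/)
    return [pattern] unless m
    m[1].split(',', -1).flat_map do |alternative|
      expand(m.pre_match + alternative + m.post_match)
    end
  end

  # True when any expanded pattern matches the string via plain File.fnmatch.
  def self.match?(pattern, string, flags = 0)
    expand(pattern).any? { |expanded| File.fnmatch(expanded, string, flags) }
  end
end

# Made-up key names standing in for strings listed from S3.
keys = ['reports/2015/q1.pdf', 'reports/2015/q1.doc', 'reports/2015/q1.txt']
wanted = keys.select { |key| BraceGlob.match?('reports/*/q1.{pdf,doc}', key) }
puts wanted.inspect
```

One trade-off of this approach is that a pattern with many brace groups expands into the product of all alternatives, so each candidate string may be tested against a long list of patterns.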