Tag: Regular Expressions

File system crawler – iteration error: I am currently building a file system crawler with the following code:
    require 'find'
    require 'spreadsheet'
    Spreadsheet.client_encoding = 'UTF-8'
    count = 0
    Find.find('/Users/Anconia/crawler/') do |file|
      if file =~ /\b.xls$/ # check if filename ends in desired format
        contents = Spreadsheet.open(file).worksheets
        contents.each do |row|
          if row =~ /regex/
            puts file
            count += 1
          end
        end
      end
    end
    puts "#{count} files were found"
I get the following output: 0 files were found […]
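
One way to approach the crawler above is sketched here, under assumptions: `files_with_extension` and `any_cell_matches?` are hypothetical helper names, and the wiring to the spreadsheet gem is shown only in a comment because it has not been exercised. Two things the sketch tries to illustrate: the extension test uses `File.extname` instead of a regex with an unescaped dot, and the pattern is applied to individual cells rather than to a worksheet object.

```ruby
require "find"

# Returns every regular file under root whose extension equals the given
# one, ignoring case (so ".xls" also matches "REPORT.XLS").
def files_with_extension(root, extension)
  Find.find(root).select do |path|
    File.file?(path) && File.extname(path).casecmp?(extension)
  end
end

# True if any cell in any row matches the pattern. `rows` only needs to
# respond to #each and yield array-like rows.
def any_cell_matches?(rows, pattern)
  rows.each do |row|
    return true if Array(row).any? { |cell| cell.to_s.match?(pattern) }
  end
  false
end

# Assumed wiring with the spreadsheet gem (not exercised here):
#   book = Spreadsheet.open(path)
#   hit  = book.worksheets.any? { |sheet| any_cell_matches?(sheet, pattern) }
```

Counting the files for which `hit` is true would then give the number the original script was trying to print.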

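How do I select the first 280 words, up to the nearest period?: I need to extract a shorter passage containing a specified number of words from a longer text. I can do this:
    text = "There was a very big cat that was sitting on the ledge. It was overlooking the garden. The dog next door watched with curiosity."
    text.split[0..14].join(' ')
    # => "There was a very big cat that was sitting on the ledge. It was overlooking"
I want to extend the selection to the next period so that I don't end up with a partial sentence. Is there a way, perhaps using a regular expression, to take the text up to and including the nearest period after the 15th word?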
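
A minimal sketch of one possible regex approach, assuming words are separated by whitespace and that a sentence ends at a literal period; `words_up_to_period` is a hypothetical helper name, not something from the question.

```ruby
# Returns the first `word_count` words of `text`, extended to the next
# period. Falls back to the whole text when it is too short or when no
# period follows.
def words_up_to_period(text, word_count)
  # Consume (word_count - 1) whole words, then everything up to and
  # including the next period.
  pattern = /\A\s*(?:\S+\s+){#{word_count - 1}}[^.]*\./
  text[pattern] || text
end

text = "There was a very big cat that was sitting on the ledge. " \
       "It was overlooking the garden. The dog next door watched with curiosity."

puts words_up_to_period(text, 15)
# prints: There was a very big cat that was sitting on the ledge. It was overlooking the garden.
```

Abbreviations such as "e.g." would end the excerpt early, so real prose may need a stricter notion of a sentence boundary.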

Ruby regex: extract a list of URLs from a string: I have a string of image URLs that I need to convert into an array. http://rubular.com/r/E2a5v2hYnJ How can I do this?
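
The actual input is only available behind the Rubular link, so the following is a sketch under assumptions: the URLs are taken to start with http:// or https:// and to be separated by whitespace or commas. The sample string and the `extract_urls` name are made up for illustration.

```ruby
# Collects every http(s) URL in the text. A URL is assumed to run until
# the next whitespace character or comma.
def extract_urls(text)
  text.scan(%r{https?://[^\s,]+})
end

# Hypothetical input; not the string from the question.
raw = "http://example.com/a.jpg, http://example.com/b.png https://example.com/c.gif"
p extract_urls(raw)
```

`String#scan` returns all non-overlapping matches as an array, which is usually simpler than trying to describe the separators for `split`.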

Ruby .split() regex: I'm trying to split the string "[test| blah] [foo |bar][test|abc]" into the following array: [ ["test","blah"] ["foo","bar"] ["test","abc"] ] but I can't get my regex right. Ruby:
    @test = '[test| blah] [foo |bar][test|abc]'.split(%r{\s*\]\s*\[\s*})
    @test.each_with_index do |test, i|
      @test[i] = test.split(%r{\s*\|\s*})
    end
That gets me close, but it returns: [ [ "[test" , "blah" ] [ "foo" , "bar" ] [ "test" , "abc]" ] ] What is the correct regex to achieve this? It would be great if I could also account for newlines, for example: "[test| blah] \n [foo |bar]\n[test|abc]"
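
One alternative to splitting on the brackets is to match the bracketed groups themselves and split only their contents. This is a sketch, assuming groups never nest and never contain a literal `]`; `parse_groups` is a hypothetical name.

```ruby
# Extracts each [...] group, then splits its contents on "|" and trims
# surrounding whitespace. Anything between groups, including newlines,
# is ignored.
def parse_groups(text)
  text.scan(/\[([^\]]*)\]/).map do |(inner)|
    inner.split("|").map(&:strip)
  end
end

p parse_groups("[test| blah] [foo |bar][test|abc]")
# prints: [["test", "blah"], ["foo", "bar"], ["test", "abc"]]
```

Because the outer brackets are consumed by the match rather than used as separators, the first and last elements no longer keep a stray "[" or "]".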

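Removing duplicate digits and operators from a String: I'm trying to take a simple math expression string, remove all whitespace, remove all repeated operators, reduce every number to a single digit, and then evaluate it. For example, a string like "2 7 + * 3 * 95" should be converted to "2 + 3 * 9" and then evaluate to 29. Here is what I have so far:
    expression.slice!(/ /) # Remove whitespace
    expression.slice!(/\A([\+\-\*\/]+)/) # Remove operators from the beginning
    expression.squeeze!("0123456789") # Single digit numbers (doesn't work)
    expression.squeeze!("+-*/") # Removes duplicate operators (doesn't work)
    expression.slice!(/([\+\-\*\/]+)\Z/) # Removes operators from the end
    puts eval expression
Unfortunately, this doesn't produce single digits, nor does it remove the duplicate operators the way I expected. Any ideas?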
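
A sketch of one way to get the behaviour described, assuming "single digit" means keeping the first digit of each number and "duplicate operators" means keeping the first operator of each run. `slice!` removes only the first match and `squeeze` collapses only runs of the same character, so this version uses `gsub` instead. The helper names are hypothetical.

```ruby
# Normalizes an expression such as "2 7 + * 3 * 95" to "2+3*9".
def normalize_expression(expression)
  expression
    .gsub(/\s+/, "")                          # drop all whitespace
    .gsub(/(\d)\d+/, '\1')                    # keep the first digit of each number
    .gsub(%r{([+\-*/])[+\-*/]+}, '\1')        # keep the first operator of each run
    .gsub(%r{\A[+\-*/]+|[+\-*/]+\z}, "")      # trim leading and trailing operators
end

# Evaluates the normalized expression. eval is only reached after the
# string has been checked to contain nothing but digits and operators.
def evaluate_expression(expression)
  normalized = normalize_expression(expression)
  unless normalized.match?(%r{\A\d(?:[+\-*/]\d)*\z})
    raise ArgumentError, "not a simple expression: #{normalized.inspect}"
  end
  eval(normalized)
end

puts normalize_expression("2 7 + * 3 * 95")  # prints: 2+3*9
puts evaluate_expression("2 7 + * 3 * 95")   # prints: 29
```

Note that `/` here is Ruby integer division, so "7 / 2" would evaluate to 3.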

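What is the fastest way in Ruby to compare the beginning or end of a String with a substring?: Slicing a string, as in "Hello world!"[0, 5] == 'Hello', is a common Ruby idiom for comparing the first or last n characters of a string with another string. A regular expression can do this too. Then there are start_with? and end_with?, which can also do it. Which one should I use for the most speed?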
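
Rather than guessing, the three approaches can be timed side by side. The following is a rough sketch, not a rigorous benchmark: `time_it` and `PREFIX_CHECKS` are hypothetical names, the iteration count is arbitrary, and results will vary by Ruby version and machine.

```ruby
STR = "Hello world!"

# The three ways of checking the prefix that the question mentions.
PREFIX_CHECKS = {
  "slice"       => -> { STR[0, 5] == "Hello" },
  "regex"       => -> { STR.match?(/\AHello/) },
  "start_with?" => -> { STR.start_with?("Hello") },
}

# Runs the block `iterations` times and returns the elapsed seconds,
# measured with a monotonic clock.
def time_it(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

PREFIX_CHECKS.each do |label, check|
  seconds = time_it(200_000) { check.call }
  puts format("%-12s %.4fs", label, seconds)
end
```

The same harness works for suffix checks by swapping in `STR[-6, 6] == "world!"`, `STR.match?(/world!\z/)` and `STR.end_with?("world!")`.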

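Ruby: find the first N regex matches in a string (and stop scanning): I want to scan a very long string for regex matches and am wondering what the most efficient way is to find the first N matches. For example: 'abcabcabc'.scan /b/, limit: 2 If only scan supported a limit option, this would finish successfully after 5 characters. (The string is several MB – a memoized data structure in memory – and this runs inside a web request. Perf matters.)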

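Regex to capture colon-separated key-value pairs with multi-line values: I am currently working on a project with Ruby on Rails (in Eclipse), and my task is to use regular expressions to split a block of data into its relevant parts. I decided to break up the data based on 3 parameters: the line must begin with a capital letter (RegEx equivalent – /^[A-Z]/) and it must end with a : (RegEx equivalent – /$":"/). I would appreciate any help.... The code I am using in my controller is:
    @f = File.open("report.rtf")
    @fread = @f.read
    @chunk = @fread.split(/\n/)
where @chunk is the array that will be created by the split and @fread is the data being split (by newline). Any help would be appreciated, many thanks! I can't post the exact data, but it basically looks like this (it is medical in nature): Exam 1: CBW 8080 RESULT: This report is dictated with specific measurement. Please see the original report. COMPARISON: 1/30/12, 3/8/12, 4/9/12 RECIST 1.1: BLAH BLAH BLAH The ideal output would be an array that reads: ["Exam 1:", "CBW 8080", "RESULT", "This report is dictated with specific measurement. Please see the original report.", […]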
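
The exact layout of the report is not shown, so the following is only a sketch under assumptions: a key is taken to be text that starts a line with a capital letter and runs to the first colon on that line, and a value is everything up to the next key, possibly spanning several lines. `KEY_LINE`, `parse_report` and the sample text are hypothetical.

```ruby
# A key starts a line with a capital letter and ends at the first colon.
KEY_LINE = /^([A-Z][^:\n]*):[ \t]*/

# Returns [key, value] pairs. Line breaks inside a value are folded
# into single spaces.
def parse_report(text)
  parts = text.split(KEY_LINE) # the capture group keeps the keys in the result
  parts.shift                  # discard whatever precedes the first key
  parts.each_slice(2).map do |key, value|
    [key, value.to_s.strip.gsub(/\s*\n\s*/, " ")]
  end
end

# Made-up sample in the shape described by the question.
sample = <<~REPORT
  Exam 1: CBW 8080
  RESULT: This report is dictated with specific measurement.
  Please see the original report.
  RECIST 1.1: BLAH BLAH BLAH
REPORT

parse_report(sample).each { |key, value| puts "#{key} => #{value}" }
```

Calling `flatten` on the result gives a flat array closer to the one asked for. A continuation line that itself begins with a capital letter and contains a colon would be mistaken for a key, so real data may need a stricter key pattern.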