Ruby - Comparing two Enumerators elegantly

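I have two long streams of binary data in Ruby (1.9.2), coming from two different sources.

Each source is wrapped in an Enumerator.

I want to check whether the two streams are exactly equal.

I have come up with a couple of solutions, but both seem quite inelegant.

The first one simply converts both to arrays: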

    def equal_streams?(s1, s2)
      s1.to_a == s2.to_a
    end

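This works, but it is not very memory-efficient, especially when the streams carry a lot of data, because both streams are loaded into memory in full.

The other option is... ugh.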

    def equal_streams?(s1, s2)
      s1.each do |e1|
        begin
          e2 = s2.next
          return false unless e1 == e2 # Different element found
        rescue StopIteration
          return false # s2 has run out of items before s1
        end
      end

      begin
        s2.next
      rescue StopIteration
        # s1 and s2 have run out of elements at the same time; they are equal
        return true
      end

      return false
    end

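So, is there a simpler, more elegant way to do this?

Just a slight refactoring of your code, assuming that your streams never contain the element :eof.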

    def equal_streams?(s1, s2)
      loop do
        e1 = s1.next rescue :eof
        e2 = s2.next rescue :eof
        return false unless e1 == e2
        return true if e1 == :eof
      end
    end

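Using loop (strictly a Kernel method that takes a block, not a keyword) should be faster than using a method like each, though that is worth benchmarking on your own data.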
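If the streams might legitimately contain :eof, the same idea can be kept by swapping the symbol for a private marker object. The following is only a sketch: the names END_OF_STREAM and next_or_end are made up for illustration and are not part of the answer above.

```ruby
# Private marker object; no stream element can be identical to it.
# (END_OF_STREAM and next_or_end are hypothetical names for this sketch.)
END_OF_STREAM = Object.new

# Returns the next element, or END_OF_STREAM once the enumerator is exhausted.
def next_or_end(enum)
  enum.next
rescue StopIteration
  END_OF_STREAM
end

def equal_streams?(s1, s2)
  loop do
    e1 = next_or_end(s1)
    e2 = next_or_end(s2)
    done1 = e1.equal?(END_OF_STREAM) # identity check, not ==
    done2 = e2.equal?(END_OF_STREAM)
    return done1 && done2 if done1 || done2 # equal only if both ended together
    return false unless e1 == e2            # mismatching elements
  end
end

p equal_streams?([1, :eof, 3].each, [1, :eof, 3].each) #=> true
p equal_streams?([1, :eof].each, [1].each)             #=> false
```

The second call is the case where a literal :eof element would otherwise be mistaken for the end of the stream.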

Comparing them one element at a time is probably the best you can do, but you can do it more nicely than your "ugh" solution:

    def grab_next(h, k, s)
      begin
        h[k] = s.next
      rescue StopIteration
      end
    end

    def equal_streams?(s1, s2)
      loop do
        vals = { }
        grab_next(vals, :s1, s1)
        grab_next(vals, :s2, s2)
        return true  if(vals.keys.length == 0)  # Both of them ran out.
        return false if(vals.keys.length == 1)  # One of them ran out early.
        return false if(vals[:s1] != vals[:s2]) # Found a mismatch.
      end
    end

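The tricky part is telling apart the case where only one stream has run out from the case where both have. Pushing the StopIteration exception into a separate function and using the absence of a key in the hash is a fairly convenient way to do that. Checking only vals[:s1] would cause problems if your streams contain false or nil, but checking for the presence of the key avoids that.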
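To make the nil case concrete, here is a small sketch. grab_next has the same shape as in the answer; take_step is a hypothetical helper added only for this illustration.

```ruby
# Store the next element under key k, or leave the key absent
# when the stream is exhausted.
def grab_next(h, k, s)
  h[k] = s.next
rescue StopIteration
  # key k stays absent
end

# Hypothetical helper for this sketch: pull one element from each stream.
def take_step(s1, s2)
  vals = {}
  grab_next(vals, :s1, s1)
  grab_next(vals, :s2, s2)
  vals
end

# s1 yields a genuine nil element; s2 is already exhausted.
vals = take_step([nil].each, [].each)

p vals[:s1] == vals[:s2] #=> true  (nil == nil, so the values alone are ambiguous)
p vals.key?(:s1)         #=> true  (s1 produced an element)
p vals.key?(:s2)         #=> false (s2 ran out)
p vals.keys.length       #=> 1     (exactly one stream ended early)
```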

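Here is a shot at it that creates an alternative to Enumerable#zip which works lazily and does not build an entire array. It combines my implementation of Clojure's interleave with the other two answers here (using a sentinel value to indicate that the end of the Enumerable has been reached: the underlying problem is that next rewinds the Enumerable once it has reached the end).

This solution supports multiple arguments, so you can compare n structures at once.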

    module Enumerable
      # this should be just a unique sentinel value (any ideas for more elegant solution?)
      END_REACHED = Object.new

      def lazy_zip *others
        sources = ([self] + others).map(&:to_enum)
        Enumerator.new do |yielder|
          loop do
            sources, values = sources.map{|s|
              [s, s.next] rescue [nil, END_REACHED]
            }.transpose
            raise StopIteration if values.all?{|v| v == END_REACHED}
            yielder.yield values.map{|v| v == END_REACHED ? nil : v}
          end
        end
      end
    end

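So, once you have a variant of zip that works lazily and does not stop iterating when the first enumerable reaches its end, you can use all? or any? to actually check the corresponding elements for equality.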

    # zip would fail here, as it would return just [[1,1],[2,2],[3,3]]:
    p [1,2,3].lazy_zip([1,2,3,4]).all?{|l,r| l == r}
    #=> false

    # this is ok
    p [1,2,3,4].lazy_zip([1,2,3,4]).all?{|l,r| l == r}
    #=> true

    # comparing more than two input streams:
    p [1,2,3,4].lazy_zip([1,2,3,4],[1,2,3]).all?{|vals|
      # check for equality by checking length of the uniqued array
      vals.uniq.length == 1
    }
    #=> false

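Following the discussion in the comments, here is a zip-based solution: it first wraps the block form of zip in an Enumerator, then uses that to compare the corresponding elements.

It works, but it has the edge case already mentioned: if the first stream is shorter than the other, the remaining elements of the other stream are discarded (see the example below).

I have marked this answer as community wiki so that other members can improve on it.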

    def zip_lazy *enums
      Enumerator.new do |yielder|
        head, *tail = enums
        head.zip(*tail) do |values|
          yielder.yield values
        end
      end
    end

    p zip_lazy(1..3, 1..4).all?{|l,r| l == r}
    #=> true
    p zip_lazy(1..3, 1..3).all?{|l,r| l == r}
    #=> true
    p zip_lazy(1..4, 1..3).all?{|l,r| l == r}
    #=> false
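One way the shorter-first-stream edge case might be closed, offered as a sketch rather than a vetted fix: append a private end marker to each stream before zipping, so that a length difference shows up as an ordinary mismatch. The names STREAM_END, terminated and equal_streams_zip? are hypothetical, and the sketch assumes that the block form of zip pulls from non-array arguments one element at a time instead of building arrays first.

```ruby
# Private end marker (hypothetical name for this sketch).
STREAM_END = Object.new

# Wraps an enumerable so that it yields all its elements, then STREAM_END.
def terminated(enum)
  Enumerator.new do |y|
    enum.each { |v| y << v }
    y << STREAM_END
  end
end

def equal_streams_zip?(s1, s2)
  # Block form of zip: pairs are handed to the block one by one.
  terminated(s1).zip(terminated(s2)) do |l, r|
    return false unless l == r
  end
  true
end

p equal_streams_zip?(1..3, 1..4) #=> false
p equal_streams_zip?(1..3, 1..3) #=> true
p equal_streams_zip?(1..4, 1..3) #=> false
```

If the first stream is shorter, its marker is paired with a real element of the second stream; if it is longer, a real element is paired with the second stream's marker. Either way the comparison fails, and only streams of equal length reach the step where both markers line up.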

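Here is a two-source example using a fiber (coroutine). It is a bit long-winded, but it is very explicit about its behavior, which is nice.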

    def zip_verbose(enum1, enum2)
      e2_fiber = Fiber.new do
        enum2.each{|e2| Fiber.yield true, e2 }
        Fiber.yield false, nil
      end
      e2_has_value, e2_val = true, nil
      enum1.each do |e1_val|
        e2_has_value, e2_val = e2_fiber.resume if e2_has_value
        yield [true, e1_val], [e2_has_value, e2_val]
      end
      return unless e2_has_value
      loop do
        e2_has_value, e2_val = e2_fiber.resume
        break unless e2_has_value
        yield [false, nil], [e2_has_value, e2_val]
      end
    end

    def zip(enum1, enum2)
      zip_verbose(enum1, enum2) {|e1, e2| yield e1[1], e2[1] }
    end

    def self.equal?(enum1, enum2)
      zip_verbose(enum1, enum2) do |e1,e2|
        return false unless e1 == e2
      end
      return true
    end
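For readers less familiar with fibers, this is roughly what the inner fiber hands back on each resume. pair_fiber is a hypothetical helper that isolates that one piece; it is not part of the answer's code.

```ruby
# Wraps an enumerable in a Fiber that returns [has_value, value] pairs,
# mirroring the fiber built inside zip_verbose.
# (pair_fiber is a hypothetical name for this sketch.)
def pair_fiber(enum)
  Fiber.new do
    enum.each { |v| Fiber.yield true, v }
    Fiber.yield false, nil # signals that the source is exhausted
  end
end

f = pair_fiber([10, nil])
p f.resume #=> [true, 10]
p f.resume #=> [true, nil]
p f.resume #=> [false, nil]
```

The explicit has_value flag is what lets a genuine nil element (second line) be told apart from the end of the stream (third line).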
