Ruby: group hashes by the value of a key

I have an array, output by a map/reduce method run by MongoDB, that looks something like this:

[{"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>299.0}, {"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>244.0}, {"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>1.0, "count"=>204.0}, {"minute"=>45.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>510.0}, {"minute"=>45.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>437.0}, {"minute"=>0.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>469.0}, {"minute"=>0.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>477.0}, {"minute"=>15.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>481.0}, {"minute"=>15.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>401.0}, {"minute"=>30.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>468.0}, {"minute"=>30.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>448.0}, {"minute"=>45.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>485.0}, {"minute"=>45.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>518.0}] 

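You'll notice that type has three distinct values, in this case 0, 1, and 10. What I want to do is group this array of hashes by the value of its type key, so that, for example, this array would end up looking like: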

 { :type_0 => [ {"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>299.0}, {"minute"=>45.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>510.0}, {"minute"=>0.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>469.0}, {"minute"=>15.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>481.0}, {"minute"=>30.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>468.0}, {"minute"=>45.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>485.0} ], :type_1 => [ {"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>204.0} ], :type_10 => [ {"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>244.0}, {"minute"=>45.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>437.0}, {"minute"=>0.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>477.0}, {"minute"=>15.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>401.0}, {"minute"=>30.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>448.0}, {"minute"=>45.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>518.0} ] } 

I know these example arrays are really big, but I think this may be a simpler problem than I'm making it out to be.

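So basically each array of hashes would be grouped by the value of its type key and then returned as a hash with an array for each type. Any help at all would be great; even just some useful hints would be much appreciated.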

 array.group_by {|x| x['type']} 

Or, if you want the symbol keys, you could even do

 array.group_by {|x| "type_#{x['type']}".to_sym} 

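I think this best expresses "so basically each array of hashes would be grouped by the value of its type key and then returned as a hash with an array for each type", even though it leaves the type key in the output hashes.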
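One detail worth checking against the question's data: the type values are floats, so interpolating them directly puts the ".0" into the key. The sketch below uses a trimmed, made-up sample in the same shape (the names `rows`, `by_float`, and `by_int` are only for illustration) and shows how calling `to_i` first gets keys like `:type_0`.

```ruby
# Trimmed sample in the same shape as the question's data (values are floats).
rows = [
  { "minute" => 30.0, "type" => 0.0,  "count" => 299.0 },
  { "minute" => 30.0, "type" => 10.0, "count" => 244.0 },
  { "minute" => 45.0, "type" => 0.0,  "count" => 510.0 }
]

# Interpolating the float directly keeps the ".0" in the key.
by_float = rows.group_by { |x| "type_#{x['type']}".to_sym }

# Converting to an integer first gives :type_0, :type_10, ...
by_int = rows.group_by { |x| "type_#{x['type'].to_i}".to_sym }

p by_float.keys  # [:"type_0.0", :"type_10.0"]
p by_int.keys    # [:type_0, :type_10]
```

Either way the grouped hashes still contain their "type" entry; only the outer keys differ.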

Maybe something like this?

 mangled = a.group_by { |h| h['type'].to_i }.each_with_object({ }) do |(k,v), memo|
   tk = ('type_' + k.to_s).to_sym
   memo[tk] = v.map { |h| h = h.dup; h.delete('type'); h }
 end

Or, if you don't care about preserving the original data:

 mangled = a.group_by { |h| h['type'].to_i }.each_with_object({ }) do |(k,v), memo|
   tk = ('type_' + k.to_s).to_sym
   memo[tk] = v.map { |h| h.delete('type'); h } # Drop the h.dup in here
 end

 by_type = {}
 a.each do |h|
   type = h.delete("type").to_s
   # type = ("type_" + type ).to_sym
   by_type[ type ] ||= []
   by_type[ type ] << h # note: h is modified, without "type" key
 end

Note: the hash keys here are slightly different; I used the type value directly as the key.

If you need the hash keys exactly as in your example, you can add the line that is commented out.
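For reference, here is a rough sketch of that loop with the commented-out line enabled, run on a small made-up sample (`rows` is only an illustrative name). The `to_i` call is an addition, not part of the original loop: the values in the question are floats, so without it the key would come out as `:"type_0.0"` rather than `:type_0`.

```ruby
# Small sample in the same shape as the question's data.
rows = [
  { "minute" => 30.0, "type" => 0.0, "count" => 299.0 },
  { "minute" => 30.0, "type" => 1.0, "count" => 204.0 },
  { "minute" => 45.0, "type" => 0.0, "count" => 510.0 }
]

by_type = {}
rows.each do |h|
  # to_i is an addition here: without it, 0.0.to_s yields "0.0".
  type = h.delete("type").to_i.to_s
  type = ("type_" + type).to_sym
  by_type[type] ||= []
  by_type[type] << h   # h is modified: it no longer has its "type" key
end

p by_type.keys  # [:type_0, :type_1]
```

As in the original loop, the hashes in `rows` are mutated in place.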


PS: I just saw Tapio's solution. It is very nice and short! Note that it only works with Ruby >= 1.9.

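group_by collects an enumerable into sets, grouped by the result of a block. You are not limited to just reading the key's value in this block, so if you want to omit the 'type' from those sets, you can do it like this: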

 array.group_by {|x| "type_#{x.delete('type').to_i}".to_sym} 

This results in exactly what you asked for.

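Advanced: this goes a bit beyond the scope of the question, but if you want to preserve the original array, you must duplicate every object inside it. This does the trick: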

 array.map(&:dup).group_by {|x| "type_#{x.delete('type').to_i}".to_sym}
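A quick way to convince yourself that the originals survive is to run it on a couple of sample hashes (`rows` and `grouped` are just illustrative names). Note that `dup` makes a shallow copy, which is enough here only because the hashes hold plain numbers.

```ruby
# Two sample hashes in the same shape as the question's data.
rows = [
  { "minute" => 30.0, "type" => 0.0,  "count" => 299.0 },
  { "minute" => 30.0, "type" => 10.0, "count" => 244.0 }
]

# Each hash is copied first, so delete only touches the copies.
grouped = rows.map(&:dup).group_by { |x| "type_#{x.delete('type').to_i}".to_sym }

p grouped.keys                      # [:type_0, :type_10]
p grouped[:type_0].first.key?("type")  # false
p rows.first.key?("type")           # true
```

Without the `map(&:dup)` step, the last line would print `false`, because `delete` would have removed the key from the original hashes.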