How do I flatten a hash so that every key is unique?

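I want to take a hash containing nested hashes and arrays and flatten it into a single hash with unique keys. I keep approaching this from different angles, but each time I make it more complex than it needs to be and lose track of what is happening.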

Example source hash:

    {
      "Name" => "Kim Kones",
      "License Number" => "54321",
      "Details" => {
        "Name" => "Kones, Kim",
        "Licenses" => [
          { "License Type" => "PT", "License Number" => "54321" },
          { "License Type" => "Temp", "License Number" => "T123" },
          { "License Type" => "AP", "License Number" => "A666", "Expiration Date" => "12/31/2020" }
        ]
      }
    }

Example desired hash:

    {
      "Name" => "Kim Kones",
      "License Number" => "54321",
      "Details_Name" => "Kones, Kim",
      "Details_Licenses_1_License Type" => "PT",
      "Details_Licenses_1_License Number" => "54321",
      "Details_Licenses_2_License Type" => "Temp",
      "Details_Licenses_2_License Number" => "T123",
      "Details_Licenses_3_License Type" => "AP",
      "Details_Licenses_3_License Number" => "A666",
      "Details_Licenses_3_Expiration Date" => "12/31/2020"
    }

For what it's worth, here is my most recent attempt before giving up.

    def flattify(hashy)
      temp = {}
      hashy.each do |key, val|
        if val.is_a? String
          temp["#{key}"] = val
        elsif val.is_a? Hash
          temp.merge(rename val, key, "")
        elsif val.is_a? Array
          temp["#{key}"] = enumerate val, key
        else
        end
        print "=> #{temp}\n"
      end
      return temp
    end

    def rename (hashy, str, n)
      temp = {}
      hashy.each do |key, val|
        if val.is_a? String
          temp["#{key}#{n}"] = val
        elsif val.is_a? Hash
          val.each do |k, v|
            temp["#{key}_#{k}#{n}"] = v
          end
        elsif val.is_a? Array
          temp["#{key}"] = enumerate val, key
        else
        end
      end
      return flattify temp
    end

    def enumerate (ary, str)
      temp = {}
      i = 1
      ary.each do |x|
        temp["#{str}#{i}"] = x
        i += 1
      end
      return flattify temp
    end

Interesting question!

Theory

Here is a recursive method to parse your data.

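  • It keeps track of the keys and indices it has found so far.
  • It appends them to a tmp array.
  • Once it reaches a leaf object, it writes that object as the value of a one-entry hash, with the joined tmp as the key.
  • This small hash is then merged back into the main hash as the recursion unwinds.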

    def recursive_parsing(object, tmp = [])
      case object
      when Array
        object.each.with_index(1).with_object({}) do |(element, i), result|
          result.merge! recursive_parsing(element, tmp + [i])
        end
      when Hash
        object.each_with_object({}) do |(key, value), result|
          result.merge! recursive_parsing(value, tmp + [key])
        end
      else
        { tmp.join('_') => object }
      end
    end

As an example:

    require 'pp'
    # data is the example source hash from the question
    pp recursive_parsing(data)
    # {"Name"=>"Kim Kones",
    #  "License Number"=>"54321",
    #  "Details_Name"=>"Kones, Kim",
    #  "Details_Licenses_1_License Type"=>"PT",
    #  "Details_Licenses_1_License Number"=>"54321",
    #  "Details_Licenses_2_License Type"=>"Temp",
    #  "Details_Licenses_2_License Number"=>"T123",
    #  "Details_Licenses_3_License Type"=>"AP",
    #  "Details_Licenses_3_License Number"=>"A666",
    #  "Details_Licenses_3_Expiration Date"=>"12/31/2020"}
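One detail that may be worth spelling out: because leaves are handled by the `else` branch, values that are not strings (numbers, nil, booleans) are kept as well, and symbol keys are turned into strings by `join`. The following is only a small sketch; the method is repeated so the snippet runs on its own, and the `sample` input is made up for illustration.

```ruby
# Same method as above, repeated so this snippet is self-contained.
def recursive_parsing(object, tmp = [])
  case object
  when Array
    object.each.with_index(1).with_object({}) do |(element, i), result|
      result.merge! recursive_parsing(element, tmp + [i])
    end
  when Hash
    object.each_with_object({}) do |(key, value), result|
      result.merge! recursive_parsing(value, tmp + [key])
    end
  else
    # Any non-Array, non-Hash object counts as a leaf.
    { tmp.join('_') => object }
  end
end

# Hypothetical input mixing symbol keys and non-string leaves.
sample = { id: 7, tags: ["a", nil], active: true }
flat = recursive_parsing(sample)
# flat has the string keys "id", "tags_1", "tags_2" and "active"
```

The attempt in the question only handled String values, which is one reason other leaf types would have been dropped there.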

Debugging

Here is a modified version with old-school debugging output. It might help you understand what is going on:

    def recursive_parsing(object, tmp = [], indent="")
      puts "#{indent}Parsing #{object.inspect}, with tmp=#{tmp.inspect}"
      result = case object
      when Array
        puts "#{indent} It's an array! Let's parse every element:"
        object.each_with_object({}).with_index(1) do |(element, result), i|
          result.merge! recursive_parsing(element, tmp + [i], indent + " ")
        end
      when Hash
        puts "#{indent} It's a hash! Let's parse every key,value pair:"
        object.each_with_object({}) do |(key, value), result|
          result.merge! recursive_parsing(value, tmp + [key], indent + " ")
        end
      else
        puts "#{indent} It's a leaf! Let's return a hash"
        { tmp.join('_') => object }
      end
      puts "#{indent} Returning #{result.inspect}\n"
      result
    end

When called with recursive_parsing([{a: 'foo', b: 'bar'}, {c: 'baz'}]), it displays:

    Parsing [{:a=>"foo", :b=>"bar"}, {:c=>"baz"}], with tmp=[]
     It's an array! Let's parse every element:
     Parsing {:a=>"foo", :b=>"bar"}, with tmp=[1]
      It's a hash! Let's parse every key,value pair:
      Parsing "foo", with tmp=[1, :a]
       It's a leaf! Let's return a hash
       Returning {"1_a"=>"foo"}
      Parsing "bar", with tmp=[1, :b]
       It's a leaf! Let's return a hash
       Returning {"1_b"=>"bar"}
      Returning {"1_a"=>"foo", "1_b"=>"bar"}
     Parsing {:c=>"baz"}, with tmp=[2]
      It's a hash! Let's parse every key,value pair:
      Parsing "baz", with tmp=[2, :c]
       It's a leaf! Let's return a hash
       Returning {"2_c"=>"baz"}
      Returning {"2_c"=>"baz"}
     Returning {"1_a"=>"foo", "1_b"=>"bar", "2_c"=>"baz"}

Unlike the others, I'm not fond of each_with_object :-). But I do like passing a result hash around, so I don't have to merge and re-merge hashes again and again.

    def flattify(value, result = {}, path = [])
      case value
      when Array
        value.each.with_index(1) do |v, i|
          flattify(v, result, path + [i])
        end
      when Hash
        value.each do |k, v|
          flattify(v, result, path + [k])
        end
      else
        result[path.join("_")] = value
      end
      result
    end

(Some details taken from Eric, see the comments.)
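As a quick sanity check of the accumulator approach, here is a small sketch (the method is repeated so it runs standalone, and the inputs are invented for illustration). Ruby evaluates default arguments on every call, so each top-level call starts with a fresh result hash, while the nested calls all write into that same hash.

```ruby
def flattify(value, result = {}, path = [])
  case value
  when Array
    value.each.with_index(1) { |v, i| flattify(v, result, path + [i]) }
  when Hash
    value.each { |k, v| flattify(v, result, path + [k]) }
  else
    # Leaf: store it directly in the shared result hash.
    result[path.join("_")] = value
  end
  result
end

first  = flattify({ "a" => [10, 20] })
second = flattify({ "b" => { "c" => 30 } })
# first  is {"a_1"=>10, "a_2"=>20}
# second is {"b_c"=>30}; nothing leaks over from the first call

# Passing your own hash lets several structures accumulate into one result.
combined = {}
flattify({ "x" => 1 }, combined)
flattify({ "y" => [2] }, combined)
# combined is {"x"=>1, "y_1"=>2}
```

No intermediate hashes are built or merged here; every leaf is written exactly once.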

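A non-recursive approach, using BFS with an array as the queue. I keep the key-value pairs whose value is not an array or hash, and push the contents of arrays and hashes onto the queue (with combined keys). Arrays are first turned into hashes ( ["a", "b"] → {1=>"a", 2=>"b"} ), because that felt neat.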

    def flattify(hash)
      (q = hash.to_a).select { |key, value|
        value = (1..value.size).zip(value).to_h if value.is_a? Array
        !value.is_a?(Hash) || !value.each { |k, v| q << ["#{key}_#{k}", v] }
      }.to_h
    end

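One thing I like about it is how nicely the keys combine as "#{key}_#{k}". In my other solution I could also have used a string path = '' and extended it with path + "_" + k, but that would produce a leading underscore that I would have to avoid or trim with extra code.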
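To illustrate that trade-off, here is a hedged sketch of what the string-path variant might look like. The names flattify_with_string_path and extend_path are made up for this example and are not code from the answer; the conditional in the helper is the extra code needed to avoid the leading underscore.

```ruby
def flattify_with_string_path(value, result = {}, path = "")
  case value
  when Array
    value.each.with_index(1) do |v, i|
      flattify_with_string_path(v, result, extend_path(path, i))
    end
  when Hash
    value.each do |k, v|
      flattify_with_string_path(v, result, extend_path(path, k))
    end
  else
    result[path] = value
  end
  result
end

# Hypothetical helper: adds the "_" separator only when a prefix
# already exists, so top-level keys do not start with "_".
def extend_path(path, part)
  path.empty? ? part.to_s : "#{path}_#{part}"
end

flat = flattify_with_string_path({ "a" => { "b" => [1, 2] }, "c" => 3 })
# flat is {"a_b_1"=>1, "a_b_2"=>2, "c"=>3}
```

With an array path and join("_"), that special case disappears, which is presumably why the array version reads more cleanly.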