Compare arrays of hashes and print expected and actual results
I have two arrays of hashes:
    actual = [
      {"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
      {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"},
      {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
      {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}
    ]
    expected = [
      {"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
      {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
      {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"},
      {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}
    ]
I need to compare these two arrays and find the hashes whose column_data_type differs.
To compare them, we can simply use:
    diff = actual - expected
This gives the following output:
    {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"}
    {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"}
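In my expected output I want to print both the actual and the expected data type, that is, the data types for each `column_name` that does not match between the actual and expected arrays of hashes, like: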
    {"column_name"=>"NONINTERESTEXPENSE", "expected_column_data_type"=>"NUMBER", "actual_column_data_type"=>"VARCHAR"}
    {"column_name"=>"TRANSACTIONDATE", "expected_column_data_type"=>"NUMBER", "actual_column_data_type"=>"TIMESTAMP"}
    (expected - actual)
      .concat(actual - expected)
      .group_by { |column| column['column_name'] }
      .map do |name, (expected, actual)|
        {
          'column_name' => name,
          'expected_column_data_type' => expected['column_data_type'],
          'actual_column_data_type' => actual['column_data_type'],
        }
      end
This works regardless of the order of the hashes within the arrays.
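One way to check the order-independence claim is to run the same chain on reordered inputs. The sketch below wraps the snippet in a hypothetical method `type_diff` (the name and the shortened sample data are mine, not from the answer) and assumes every mismatching column appears in both arrays.

```ruby
# Hypothetical wrapper around the chain above; the method name is an assumption.
def type_diff(expected, actual)
  (expected - actual)
    .concat(actual - expected)
    .group_by { |column| column['column_name'] }
    .map do |name, (exp, act)|
      {
        'column_name' => name,
        'expected_column_data_type' => exp['column_data_type'],
        'actual_column_data_type' => act['column_data_type']
      }
    end
end

# Made-up sample data, deliberately listed in a different order in each array.
expected = [
  { 'column_name' => 'A', 'column_data_type' => 'NUMBER' },
  { 'column_name' => 'B', 'column_data_type' => 'NUMBER' },
  { 'column_name' => 'C', 'column_data_type' => 'TIMESTAMP' }
]
actual = [
  { 'column_name' => 'C', 'column_data_type' => 'TIMESTAMP' },
  { 'column_name' => 'B', 'column_data_type' => 'VARCHAR' },
  { 'column_name' => 'A', 'column_data_type' => 'NUMBER' }
]

result = type_diff(expected, actual)
reversed_result = type_diff(expected.reverse, actual.reverse)
p result == reversed_result
```

Note the assumption: if a column exists in only one of the arrays, its group has a single element, `act` is nil, and the block raises NoMethodError.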
    diff = []
    expected.each do |elem|
      column_name = elem['column_name']
      column_type = elem['column_data_type']
      # Find the hash in actual that describes the same column
      match = actual.detect { |elem2| elem2['column_name'] == column_name }
      if column_type != match['column_data_type']
        diff << {
          'column_name' => column_name,
          'expected_column_data_type' => column_type,
          'actual_column_data_type' => match['column_data_type']
        }
      end
    end
    p diff
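If a column in expected can be missing from actual, `match` is nil in the loop above and `match['column_data_type']` raises NoMethodError. A possible variant is sketched below. It indexes actual by name first, and it reports a missing column with a nil actual type; that reporting choice and the method name `type_mismatches` are my assumptions, not something the question specifies. `Array#to_h` with a block needs Ruby 2.6 or later.

```ruby
# Sketch: same comparison as the loop above, using a name => type lookup table.
# Reporting an absent column with a nil actual type is an assumption.
def type_mismatches(expected, actual)
  actual_types = actual.to_h { |col| [col['column_name'], col['column_data_type']] }

  expected.each_with_object([]) do |col, diff|
    name = col['column_name']
    expected_type = col['column_data_type']
    actual_type = actual_types[name] # nil when the column is absent from actual
    next if expected_type == actual_type

    diff << {
      'column_name' => name,
      'expected_column_data_type' => expected_type,
      'actual_column_data_type' => actual_type
    }
  end
end
```

Calling `type_mismatches(expected, actual)` with the arrays from the question should list the same two columns as the loop above, and the lookup table avoids scanning actual once per expected column.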
    [actual, expected]
      .map { |a| a.map(&:dup).map(&:values) }   # [[name, type], ...] for each array
      .map(&Hash.method(:[]))                   # name => type for each array
      .reduce do |actual, expected|
        actual.merge(expected) do |k, o, n|
          o == n ? nil : {name: k, actual: o, expected: n}
        end
      end.values.compact
    #⇒ [
    #     [0] {
    #           :name => "NONINTERESTEXPENSE",
    #         :actual => "VARCHAR",
    #       :expected => "NUMBER"
    #     },
    #     [1] {
    #           :name => "TRANSACTIONDATE",
    #         :actual => "TIMESTAMP",
    #       :expected => "NUMBER"
    #     }
    #   ]
The approach above can easily be extended to merge N arrays (use reduce.with_index and merge with the key "value_from_#{idx}").
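As one possible reading of that hint, here is a minimal sketch for N arrays. The method name `compare_many` is hypothetical, and the sketch swaps in `each_with_index` plus `each_with_object` rather than using `reduce.with_index` literally. It only compares the types that are present, so a column missing from some arrays is not flagged by itself.

```ruby
# Sketch: collect the type of every column across N arrays under the key
# "value_from_<idx>", then keep only the columns whose types disagree.
# compare_many is a hypothetical name, not part of the answer above.
def compare_many(*arrays)
  merged = arrays.each_with_index.each_with_object({}) do |(array, idx), acc|
    array.each do |col|
      name = col['column_name']
      entry = (acc[name] ||= { 'column_name' => name })
      entry["value_from_#{idx}"] = col['column_data_type']
    end
  end

  merged.values.select do |entry|
    types = entry.reject { |key, _| key == 'column_name' }.values
    types.uniq.size > 1 # more than one distinct type means a mismatch
  end
end
```

For two inputs, `compare_many(actual, expected)` labels the types "value_from_0" and "value_from_1" instead of actual and expected.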
How about this?
    # Return the first hash in hashes_array that describes column_name
    def select(hashes_array, column_name)
      hashes_array.select { |h| h["column_name"] == column_name }.first
    end

    diff = (expected - actual).map do |h|
      {
        "column_name" => h["column_name"],
        "expected_column_data_type" => select(expected, h["column_name"])["column_data_type"],
        "actual_column_data_type" => select(actual, h["column_name"])["column_data_type"],
      }
    end
PS: this code can certainly be improved to look more elegant.
Code
    def convert(actual, expected)
      hashify(actual - expected, "actual_data_type")
        .merge(hashify(expected - actual, "expected_data_type")) { |_, a, e| a.merge(e) }
        .values
    end

    def hashify(arr, key)
      arr.each_with_object({}) do |g, h|
        h[g["column_name"]] = { "column_name"=>g["column_name"], key=>g["column_data_type"] }
      end
    end
Example
    actual = [
      {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
      {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"},
      {"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
      {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}
    ]
    expected = [
      {"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
      {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
      {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"},
      {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}
    ]

    convert(actual, expected)
      #=> [{"column_name"=>"TRANSACTIONDATE",
      #     "actual_data_type"=>"TIMESTAMP", "expected_data_type"=>"NUMBER"},
      #    {"column_name"=>"NONINTERESTEXPENSE",
      #     "actual_data_type"=>"VARCHAR", "expected_data_type"=>"NUMBER"}]
Explanation
For the example above, the steps are as follows.
First, apply hashify to the two differences, actual - expected and expected - actual.
    f = actual - expected
      #=> [{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
      #    {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"}]
    g = hashify(f, "actual_data_type")
      #=> {"TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
      #                        "actual_data_type"=>"TIMESTAMP"},
      #    "NONINTERESTEXPENSE"=>{"column_name"=>"NONINTERESTEXPENSE",
      #                           "actual_data_type"=>"VARCHAR"}}
    h = expected - actual
      #=> [{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
      #    {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"}]
    i = hashify(h, "expected_data_type")
      #=> {"NONINTERESTEXPENSE"=>{"column_name"=>"NONINTERESTEXPENSE",
      #                           "expected_data_type"=>"NUMBER"},
      #    "TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
      #                        "expected_data_type"=>"NUMBER"}}
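Next, combine g and i using the form of Hash#merge that takes a block to determine the values of keys that are present in both hashes being merged. See the doc for the definitions of the three block variables (the first is the common key, which I've written as an underscore to signify that it is not used in the block calculation).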
    j = g.merge(i) { |_, a, e| a.merge(e) }
      #=> {"TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
      #      "actual_data_type"=>"TIMESTAMP", "expected_data_type"=>"NUMBER"},
      #    "NONINTERESTEXPENSE"=>{"column_name"=>"NONINTERESTEXPENSE",
      #      "actual_data_type"=>"VARCHAR", "expected_data_type"=>"NUMBER"}}
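To see the block form of Hash#merge in isolation, here is a small sketch with made-up hashes (the names `left`, `right` and `seen` are illustrative only). The block runs only for keys present in both hashes and receives the key, the receiver's value and the argument's value, in that order.

```ruby
# Made-up data: only "y" is present in both hashes.
left  = { 'x' => { 'actual' => 1 }, 'y' => { 'actual' => 2 } }
right = { 'y' => { 'expected' => 3 }, 'z' => { 'expected' => 4 } }

seen = [] # records the keys for which the block was called
merged = left.merge(right) do |key, left_value, right_value|
  seen << key
  left_value.merge(right_value) # combine the two inner hashes
end

p merged #=> {"x"=>{"actual"=>1}, "y"=>{"actual"=>2, "expected"=>3}, "z"=>{"expected"=>4}}
p seen   #=> ["y"]
```

Keys that occur in only one hash ("x" and "z" here) are copied over unchanged, which is why convert relies on every mismatching column appearing in both differences.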
Finally, drop the keys.
    k = j.values
      #=> [{"column_name"=>"TRANSACTIONDATE", "actual_data_type"=>"TIMESTAMP",
      #     "expected_data_type"=>"NUMBER"},
      #    {"column_name"=>"NONINTERESTEXPENSE", "actual_data_type"=>"VARCHAR",
      #     "expected_data_type"=>"NUMBER"}]