Convenient way to get the specific object used as a key in a Hash in Ruby?

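Here is an interesting one. I am writing a bucket sharding system in which I maintain index hashes and storage hashes. The two are related through a generated UUID: the system is distributed, and I want some confidence that new buckets get unique references.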

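Early in this exercise I started optimising the code to freeze all keys generated by SecureRandom.uuid (it produces strings), because when you use a string as a key in a hash, Ruby dups and freezes it automatically so that it cannot be changed (that is, if it is a string and not already frozen).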
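The dup-and-freeze behaviour described above can be checked with a small sketch. The variable names and sample strings are made up for illustration; only core `Hash` and `String` methods are used.

```ruby
# An unfrozen String key is copied and the copy is frozen;
# an already-frozen String key is stored as the very same object.
unfrozen = String.new("bucket-1")   # String.new returns a mutable string
hash_a = {}
hash_a[unfrozen] = 1
stored_a = hash_a.keys.first

puts stored_a.frozen?               # the key held by the hash is frozen
puts stored_a.equal?(unfrozen)      # the hash holds a copy, not our object

frozen = String.new("bucket-2").freeze
hash_b = {}
hash_b[frozen] = 1
stored_b = hash_b.keys.first

puts stored_b.equal?(frozen)        # a frozen key is used as-is
```

This is why freezing the UUID before its first use as a key lets the store and the index share one object.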

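In most cases it is easy to do this aggressively, especially for new UUIDs (in fact, many such values in my project need this treatment). In some cases, though, I have to approach a hash with a value that was passed over the network, and then use a rather obtuse lookup mechanism to make sure that any string already present as a key is used consistently.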

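Since I want this to maintain a huge dataset across multiple nodes, my goal is to reduce the overhead of key and index storage as much as possible. Because it is a bucketing system, the same UUID can be referenced many times, so it helps to use the same reference everywhere.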

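Here is some code that demonstrates the issue in a simple(ish) form. I am just asking whether there is a more optimal and convenient mechanism for obtaining any pre-existing object reference for a key that has the same string value (the key object itself, not the associated value).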

    # Demonstrate the issue..
    require 'securerandom'

    index = Hash.new
    store = Hash.new
    key = 'meh'
    value = 1
    uuid = SecureRandom.uuid

    puts "Ruby dups and freezes strings if used for keys in hashes"
    puts "This produces different IDs"
    store[uuid] = value
    index[key] = uuid
    store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}" }
    index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }
    puts

    puts "If inconsistencies in ID occur then Ruby attempts to preserve the use of the frozen key so if it happens in one area take care"
    puts "This produces different IDs"
    uuid = uuid.freeze
    store[uuid] = value
    index[key] = uuid
    store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}" }
    index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }
    puts

    puts "If you start with a clean slate and a frozen key you can overcome it if you freeze the string before use"
    puts "This is clean so far and produces the same object"
    index = Hash.new
    store = Hash.new
    store[uuid] = value
    index[key] = uuid
    store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}" }
    index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }
    puts

    puts "But if the same value for the key comes in (possibly remote) then it becomes awkward"
    puts "This produces different IDs"
    uuid = uuid.dup.freeze
    store[uuid] = value
    index[key] = uuid
    store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}" }
    index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }
    puts

    puts "So you get into oddities like this to ensure you standarise values put in to keys that already exist"
    puts "This cleans up and produces same IDs but is a little awkward"
    uuid = uuid.dup.freeze
    uuid_list = store.keys
    uuid = uuid_list[uuid_list.index(uuid)] if uuid_list.include?(uuid)
    store[uuid] = value
    index[key] = uuid
    store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}" }
    index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }
    puts

Example run:

    Ruby dups and freezes strings if used for keys in hashes
    This produces different IDs
    Store reference for value of bd48a581-95e9-452e-b8a3-602d92d47011 70209306325780
    Index reference for bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880

    If inconsistencies in ID occur then Ruby attempts to preserve the use of the frozen key so if it happens in one area take care
    This produces different IDs
    Store reference for value of bd48a581-95e9-452e-b8a3-602d92d47011 70209306325780
    Index reference for bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880

    If you start with a clean slate and a frozen key you can overcome it if you freeze the string before use
    This is clean so far and produces the same object
    Store reference for value of bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880
    Index reference for bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880

    But if the same value for the key comes in (possibly remote) then it becomes awkward
    This produces different IDs
    Store reference for value of bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880
    Index reference for bd48a581-95e9-452e-b8a3-602d92d47011 70209306325000

    So you get into oddities like this to ensure you standarise values put in to keys that already exist
    This cleans up and produces same IDs but is a little awkward
    Store reference for value of bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880
    Index reference for bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880

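It appears that, for a pure Ruby example, this can be avoided entirely thanks to the global nature of symbol object references. Converting the string to a symbol is enough to guarantee the same reference. It is not what I was hoping for, since I sometimes use Ruby to prototype for C developers, but it works reliably and is good enough to keep my prototype moving, with plenty of extra comments for the C development phase.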

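I would still be interested in other examples, but here is a thumbs up for Symbols. I tend to avoid them in many networking cases, though, because they are marshalled to String when sent as JSON (I like JSON because peers written in different languages can usually support it).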

    imac:Ruby andrews$ irb
    irb(main):001:0> a = :meh
    => :meh
    irb(main):002:0> b = 'meh'.to_sym
    => :meh
    irb(main):003:0> a.object_id == b.object_id
    => true
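The JSON caveat mentioned above can be sketched as follows. The `bucket` key and the `:meh` value are illustrative only; the calls are the standard library's `JSON.generate` and `JSON.parse`.

```ruby
require 'json'

# A Symbol key and a Symbol value both leave the process as JSON strings.
payload = JSON.generate({ bucket: :meh })
decoded = JSON.parse(payload)

puts payload            # the wire format contains plain strings
puts decoded.inspect    # keys and values come back as String objects

# Converting back with to_sym restores the single shared Symbol object.
restored = decoded["bucket"].to_sym
puts restored.equal?(:meh)
```

So a receiving peer has to call `to_sym` (or an equivalent step) on every incoming value before the shared-reference property holds again.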

Additional backing for this approach: "Why use symbols as hash keys in Ruby?"

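Also keep in mind that symbols, once created, are not garbage collected (this holds for Ruby versions before 2.2; from 2.2 on, dynamically created symbols can be collected).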

Maybe you are looking for Enumerable#find:

    uuid = store.find{|k,_| k == uuid_from_network }.first

Full example:

    require 'securerandom'

    index = Hash.new
    store = Hash.new
    key = 'meh'
    value = 1
    uuid = SecureRandom.uuid

    store[uuid] = value
    index[key] = uuid

    # obtained from elsewhere
    uuid = uuid.dup.freeze
    uuid = store.find{|k,_| k == uuid }.first

    store[uuid] = value
    index[key] = uuid

    store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}" }
    index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }

Output:

    Store reference for value of d94390c4-7cc7-4e94-92bc-a0dd862ac6a2 70190385847520
    Index reference for d94390c4-7cc7-4e94-92bc-a0dd862ac6a2 70190385847520
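One thing to watch with the `find` approach: `find` returns `nil` when no key matches, so calling `.first` on the result raises for a UUID the store has not seen yet. A minimal sketch of a nil-safe variant follows; the helper name `existing_key` is hypothetical and not part of Ruby.

```ruby
require 'securerandom'

store = {}
known = SecureRandom.uuid.freeze
store[known] = 1

# Hypothetical helper: return the key object already held by the hash,
# or the incoming object itself when the hash has no equal key.
def existing_key(hash, incoming)
  pair = hash.find { |k, _| k == incoming }  # [key, value] pair, or nil
  pair ? pair.first : incoming
end

copy_from_network = known.dup.freeze   # equal value, different object
unknown = SecureRandom.uuid.freeze     # not present in the store

puts existing_key(store, copy_from_network).equal?(known)  # reuses the stored key object
puts existing_key(store, unknown).equal?(unknown)          # falls back to the incoming object
```

The lookup is still a linear scan over the hash, so it is a convenience rather than an optimisation.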

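If you want to be efficient, you can build a lightweight wrapper around the C function st_get_key, which does exactly what you need. I took the implementation of Hash#has_key? as boilerplate. You can mix C code into your Ruby code, for example with RubyInline.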

    require 'inline'

    class Hash
      inline do |builder|
        builder.c <<-EOS
          VALUE fetch_key(VALUE key) {
            st_data_t result;
            if (!RHASH(self)->ntbl) return Qnil;
            if (st_get_key(RHASH(self)->ntbl, key, &result)) {
              return result;
            }
            return Qnil;
          }
        EOS
      end
    end

I could not find anything native in the Hash source, and symbols do not suit my purpose, so I adapted @p11y's answer. Thanks ^^

    class Hash
      def consistent_key_obj(key)
        self.keys.include?(key) ? self.find{|k,_| k == key }.first : key
      end
    end
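The helper above builds the key array and then scans the hash, so each call walks the data twice. As a different technique (not the one used in the answers here), a side table that maps each key string to itself gives a hash lookup instead of a scan, at the cost of one extra table. This is only a sketch; `KeyPool` and its `fetch` method are hypothetical names.

```ruby
require 'securerandom'

# Hypothetical helper class: remembers one canonical frozen String per value.
class KeyPool
  def initialize
    @canonical = {}   # maps each canonical key to itself
  end

  # Return the first object registered for this string value,
  # registering a frozen copy if the value has not been seen before.
  def fetch(key)
    @canonical.fetch(key) do
      frozen = key.frozen? ? key : key.dup.freeze
      @canonical[frozen] = frozen
    end
  end
end

pool = KeyPool.new
store = {}
index = {}

uuid = pool.fetch(SecureRandom.uuid)
store[uuid] = 1

incoming = pool.fetch(uuid.dup)   # equal value arriving as a different object
index['meh'] = incoming

puts store.keys.first.equal?(index['meh'])   # both hashes share one String object
```

Because the pooled string is already frozen, every hash that uses it as a key stores that same object rather than a copy.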