Data conversion in Ruby

I'm having trouble with these conversions:

 string = "test \\ud83d\\ude01" # into '1f601' and vice versa
 unicode_value = 'U+1F601'      # into the string '\\ud83d\\ude01'

I tried this to encode it:

 string.encode('utf-8') #output is "test \\ud83d\\ude01" 

I also tried this:

 string.force_encoding('utf-8') #output is "test \\ud83d\\ude01" 

Thanks.

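Hex to Unicode character

"\ud83d\ude01" to smiley

According to the Unicode tables, "\\ud83d\\ude01" looks like UTF-16 (hex): D83D and DE01 form a high/low surrogate pair. Note that the string itself is plain ASCII: ["\\", "u", "d", "8", "3", "d", "\\", "u", "d", "e", "0", "1"]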

 str = "\\ud83d\\ude01"
 hex = str.gsub("\\u", '')
 smiley = [hex].pack('H*').force_encoding('utf-16be').encode('utf-8')
 puts smiley #=> 😁
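To see why the two escapes collapse into a single character, here is a minimal sketch of the surrogate-pair arithmetic done by hand. The helper name `surrogate_pair_to_codepoint` is made up for illustration; `encode` does the same work internally.

```ruby
# Hypothetical helper: combine a UTF-16 high/low surrogate pair into a code point.
def surrogate_pair_to_codepoint(high, low)
  # High surrogates live in D800..DBFF, low surrogates in DC00..DFFF.
  valid = (0xD800..0xDBFF).cover?(high) && (0xDC00..0xDFFF).cover?(low)
  raise ArgumentError, 'not a surrogate pair' unless valid

  # Each surrogate carries 10 bits; the result is offset by 0x10000.
  0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
end

str = "\\ud83d\\ude01"
puts str.ascii_only?            #=> true (twelve plain ASCII characters)

# Pull the two 4-digit hex groups out of the escaped text.
high, low = str.scan(/\\u(\h{4})/).flatten.map(&:hex)
codepoint = surrogate_pair_to_codepoint(high, low)
puts codepoint.to_s(16)         #=> 1f601
puts [codepoint].pack('U')      #=> 😁
```

This also shows why `encode` and `force_encoding` leave the original string unchanged: there is nothing to transcode in a run of ASCII characters.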

'U+1F601' to smiley

This is the Unicode code point of the character, written in hex (not its UTF-8 bytes). Note that "U+1F601" is also a plain ASCII string: ["U", "+", "1", "F", "6", "0", "1"]

 unicode_value = 'U+1F601'
 hex = unicode_value.sub('U+', '')
 smiley = hex.to_i(16).chr('UTF-8')
 puts smiley #=> 😁
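Since the three notations are easy to mix up, here is a small sketch that prints the code point, the UTF-8 bytes and the UTF-16 code units of the same character side by side. The variable names are only illustrative.

```ruby
smiley = 'U+1F601'.sub('U+', '').hex.chr('UTF-8')

# The code point is a single number, independent of any encoding.
codepoint_hex = smiley.ord.to_s(16).upcase

# UTF-8 stores this character as four bytes.
utf8_bytes_hex = smiley.bytes.map { |b| format('%02X', b) }.join(' ')

# UTF-16 stores it as two 16-bit code units (a surrogate pair).
utf16_units_hex = smiley.encode('UTF-16BE').unpack('n*').map { |u| format('%04X', u) }.join(' ')

puts codepoint_hex   #=> 1F601
puts utf8_bytes_hex  #=> F0 9F 98 81
puts utf16_units_hex #=> D83D DE01
```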

Unicode code point hex ⟷ UTF-16 hex

Combining the two methods above:

"\ud83d\ude01" to "U+1F601"

 str = "\\ud83d\\ude01"
 utf16_hex = str.gsub("\\u", '')
 smiley = [utf16_hex].pack('H*').force_encoding('utf-16be').encode('utf-8')
 # String#ord returns the Unicode code point, not the UTF-8 bytes
 codepoint_hex = smiley.ord.to_s(16).upcase
 new_str = "U+#{codepoint_hex}"
 puts new_str #=> U+1F601

'U+1F601' to "\ud83d\ude01"

 unicode_value = 'U+1F601'
 hex = unicode_value.sub('U+', '')
 smiley = hex.to_i(16).chr('UTF-8')
 # Prefix every group of four hex digits (one UTF-16 code unit) with \u
 puts smiley.force_encoding('utf-8').encode('utf-16be').unpack('H*').first.gsub(/(....)/, '\u\1') #=> \ud83d\ude01

There may be an easier way to do this, but I couldn't find one.
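One possibly shorter route, sketched here as an assumption rather than a known idiom, is to work with 16-bit code units directly via the `n*` directive of `pack`/`unpack` instead of a hex string. Both helper names are hypothetical.

```ruby
# Hypothetical helper: "\\ud83d\\ude01" (escaped text) -> "U+1F601"
def utf16_escapes_to_codepoint(str)
  # Collect the 16-bit code units as integers.
  units = str.scan(/\\u(\h{4})/).flatten.map(&:hex)
  # 'n*' packs each integer as a big-endian 16-bit value, i.e. UTF-16BE.
  char = units.pack('n*').force_encoding('UTF-16BE').encode('UTF-8')
  format('U+%04X', char.ord)
end

# Hypothetical helper: "U+1F601" -> "\\ud83d\\ude01" (escaped text)
def codepoint_to_utf16_escapes(unicode_value)
  char = unicode_value.sub('U+', '').hex.chr('UTF-8')
  # Unpack the UTF-16BE bytes back into 16-bit code units.
  char.encode('UTF-16BE').unpack('n*').map { |unit| format('\\u%04x', unit) }.join
end

puts utf16_escapes_to_codepoint("\\ud83d\\ude01") #=> U+1F601
puts codepoint_to_utf16_escapes('U+1F601')        #=> \ud83d\ude01
```

The sketch assumes the input holds exactly one character; for a code point in the Basic Multilingual Plane the second helper yields a single escape.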

Using this code

 # Convert one escaped UTF-16 sequence (e.g. "\\ud83d\\ude01") to a UTF-8 character
 def utf16_hex_to_unicode_char(utf16_hex)
   hex = utf16_hex.gsub("\\u", '')
   [hex].pack('H*').force_encoding('utf-16be').encode('utf-8')
 end

 # Replace every pair of \uXXXX escapes in the string with the decoded character
 def replace_all_utf16_hex(string)
   string.gsub(/(\\u[0-9a-fA-F]{4}){2}/) { |hex| utf16_hex_to_unicode_char(hex) }
 end

 puts replace_all_utf16_hex("Hello \\ud83d\\ude01, I just bought a \\uD83D\\uDC39")
 #=> "Hello 😁, I just bought a 🐹"
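The regex above only matches escapes that come in pairs, so a lone escape such as "\\u00e9" stays as it is. A possible variation, offered as a sketch with a hypothetical method name, decodes any run of consecutive escapes and leaves a run untouched when it is not valid UTF-16.

```ruby
# Hypothetical variant: decode runs of one or more \uXXXX escapes.
def replace_utf16_escape_runs(string)
  string.gsub(/(?:\\u\h{4})+/) do |run|
    units = run.scan(/\\u(\h{4})/).flatten.map(&:hex)
    begin
      units.pack('n*').force_encoding('UTF-16BE').encode('UTF-8')
    rescue EncodingError
      # Assumption: keep the original text when the run cannot be decoded.
      run
    end
  end
end

puts replace_utf16_escape_runs("caf\\u00e9 \\ud83d\\ude01") #=> café 😁
```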