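Stripping specific control characters from a Ruby string
This should be simple: how do I remove a special character from a Ruby string? This is the character: http://www.fileformat.info/info/unicode/char/2028/index.htm
Here is the string; there are two of the special characters between the period and the closing quote: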
"Each of the levels requires logic, skill, and brute force to crush the enemy. "
I tried these without success:
string.gsub!(/[\x00-\x1F\x7F]/, '')
and gsub("/\n/", "")
I am using ruby 1.9.3p125
String#gsub works, but it is more general and less efficient than String#tr:
irb> s = "Hello,\u2028 World; here's some ctrl [\1\2\3\4\5\6] chars"
=> "Hello,\u2028 World; here's some ctrl [\u0001\u0002\u0003\u0004\u0005\u0006] chars"
irb> s.tr("\u0000-\u001f\u007f\u2028",'')
=> "Hello, World; here's some ctrl [] chars"

require 'benchmark'
Benchmark.bm {|x|
  x.report('tr') { 1_000_000.times{ s.tr("\u0000-\u001f\u007f\u2028",'') } }
  x.report('gsub') { 1_000_000.times{ s.gsub(/[\0-\x1f\x7f\u2028]/,'') } }
}

          user     system      total        real
tr    1.440000   0.000000   1.440000 (  1.448090)
gsub  4.110000   0.000000   4.110000 (  4.127100)
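The tr call above can be wrapped in a small helper. This is only a sketch: the method name strip_control_chars is hypothetical, and String#delete is swapped in for tr because it accepts the same character-set syntax and states the intent directly. The set covers the ASCII control characters plus U+2028, as in the answer.

```ruby
# Hypothetical helper (not from the original answer): removes the ASCII
# control characters U+0000-U+001F, U+007F (DEL) and U+2028 (LINE SEPARATOR).
# Returns a new string; the argument is left unchanged.
def strip_control_chars(str)
  str.delete("\u0000-\u001f\u007f\u2028")
end

s = "Hello,\u2028 World; here's some ctrl [\1\2\3\4\5\6] chars"
puts strip_control_chars(s)
```

String#delete! would be the in-place variant if mutating the original string is acceptable.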
I figured it out! .gsub(/\u2028/, '')
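For anyone wondering why the first two attempts had no effect, here is a short sketch. The sample string and constant names are made up for illustration; only the patterns come from the question.

```ruby
# Illustrative sample: two U+2028 (LINE SEPARATOR) characters after the period.
SAMPLE = "crush the enemy.\u2028\u2028"

# Attempt 1: the class only covers U+0000-U+001F and U+007F.
# U+2028 is outside that range, so nothing is removed.
ASCII_ONLY = SAMPLE.gsub(/[\x00-\x1F\x7F]/, '')

# Attempt 2: a String pattern is matched literally, so this searches for
# the three characters slash, newline, slash rather than for a newline.
STRING_PATTERN = SAMPLE.gsub("/\n/", "")

# Working version: match the code point itself.
FIXED = SAMPLE.gsub(/\u2028/, '')

p ASCII_ONLY == SAMPLE      # true, string unchanged
p STRING_PATTERN == SAMPLE  # true, string unchanged
p FIXED                     # "crush the enemy."
```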