Sort the rows of a CSV file by the first value

Stock data.csv:

    1425904377,22532.1309,22533.6992,22524.0703,22524.0703,0
    1425904438,22533.4395,22533.4395,22529.2793,22532.2207,0
    1425904499,22531.8809,22536.0801,22531.8809,22533.2793,0
    1425904559,22532.4297,22534.7207,22530.7598,22532.0996,0
    1425904618,22535.7695,22535.9297,22530.6094,22532.2500,0
    1425904679,22536.0703,22539.2598,22535.5605,22535.6094,0
    1425904738,22542.8809,22544.2305,22536.3594,22536.3594,0
    1425904797,22540.6504,22544.0391,22538.5000,22542.9707,0
    1425904857,22545.2891,22552.5098,22538.5898,22538.9004,0
    1425904860,22547.0703,22547.0703,22547.0703,22547.0703,0

I have to do two things:

  1. Sort the rows by timestamp (the first value)
  2. Remove duplicates (key: timestamp)

I tried this, but it doesn't work:

    my_csv = CSV.read("public/#{symbol}.csv")
    my_csv.sort_by!(&:first)
    open("public/#{symbol}.csv", 'w') { |f|
      my_csv.each do |row|
        f.puts "#{row.join(",")}"
      end
    }

    require 'csv'

    my_csv = CSV.read 'data.csv'
    my_csv.sort! { |a, b| a[0].to_i <=> b[0].to_i }
    my_csv.uniq!(&:first)
    my_csv.each { |line| p line }

Output:

    ["1425904377", "22532.1309", "22533.6992", "22524.0703", "22524.0703", "0"]
    ["1425904438", "22533.4395", "22533.4395", "22529.2793", "22532.2207", "0"]
    ["1425904499", "22531.8809", "22536.0801", "22531.8809", "22533.2793", "0"]
    ["1425904559", "22532.4297", "22534.7207", "22530.7598", "22532.0996", "0"]
    ["1425904618", "22535.7695", "22535.9297", "22530.6094", "22532.2500", "0"]
    ["1425904679", "22536.0703", "22539.2598", "22535.5605", "22535.6094", "0"]
    ["1425904738", "22542.8809", "22544.2305", "22536.3594", "22536.3594", "0"]
    ["1425904797", "22540.6504", "22544.0391", "22538.5000", "22542.9707", "0"]
    ["1425904857", "22545.2891", "22552.5098", "22538.5898", "22538.9004", "0"]
    ["1425904860", "22547.0703", "22547.0703", "22547.0703", "22547.0703", "0"]

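  • CSV.read reads the file into an array of rows, where each row is an array of strings.
  • We can use sort!; its block receives two items to compare.
  • We compare the first item of each row (the timestamp), converting it to an integer so that the comparison is numeric rather than string-based.
  • We use the "spaceship operator" (<=>) for the comparison; it returns -1, 0, or 1, which is what sort! requires its block to return.
  • We use uniq! to modify the array in place, keyed on the first entry of each row.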
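
The steps above leave the result in memory; the question also wants it written back to the file. Here is a minimal sketch of one way to do that, not a definitive implementation. The helper names sort_and_dedupe and rewrite_csv are hypothetical, and it assumes the file is small enough to load entirely and that the first column always parses as an integer. It removes duplicates before sorting, so the first row seen in the file for each timestamp is the one that is kept.

```ruby
require 'csv'

# Hypothetical helper: keep the first row seen for each timestamp
# (first column), then sort the remaining rows numerically by it.
def sort_and_dedupe(rows)
  rows.uniq { |row| row.first.to_i }
      .sort_by { |row| row.first.to_i }
end

# Hypothetical helper: read the CSV at `path`, clean it, and
# overwrite the same file with the cleaned rows.
def rewrite_csv(path)
  rows = sort_and_dedupe(CSV.read(path))
  CSV.open(path, 'w') do |csv|
    rows.each { |row| csv << row }
  end
end

# Small in-memory demonstration using two rows from the question,
# given out of order and with one duplicate.
sample = <<~DATA
  1425904438,22533.4395,22533.4395,22529.2793,22532.2207,0
  1425904377,22532.1309,22533.6992,22524.0703,22524.0703,0
  1425904438,22533.4395,22533.4395,22529.2793,22532.2207,0
DATA

result = sort_and_dedupe(CSV.parse(sample))
result.each { |row| puts row.join(',') }
```

Writing through CSV.open rather than joining the fields by hand means any field that needs quoting is escaped correctly. For the file in the question you would call something like rewrite_csv("public/#{symbol}.csv").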