Removing elements that have duplicates from an array

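What is the best way to remove every element that occurs more than once in an array? For example, from the array

    a = [4, 3, 3, 1, 6, 6]

I need to get

    a = [4, 1]

My approach is too slow when the array has a large number of elements.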

    arr = [4, 3, 3, 1, 6, 6]
    puts arr.join(" ")
    nouniq = []
    l = arr.length
    uniq = nil
    for i in 0..(l-1)
      for j in 0..(l-1)
        if (arr[j] == arr[i]) and (i != j)
          nouniq << arr[j]
        end
      end
    end
    arr = (arr - nouniq).compact
    puts arr.join(" ")

    a = [4, 3, 3, 1, 6, 6]
    a.select { |b| a.count(b) == 1 } #=> [4, 1]

A more complex but faster solution (O(n log n) because of the sort, I believe :))

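    a = [4, 3, 3, 1, 6, 6]
    ar = []
    # Keep the middle value of each sorted triple when all three values differ
    add = proc { |to, from| to << from[1] if from.uniq.size == from.size }
    a.sort!.each_cons(3) { |b| add.call(ar, b) }
    # The first and last elements have only one neighbour, so check them separately
    ar << a[0] if a[0] != a[1]
    ar << a[-1] if a[-1] != a[-2]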
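The same sort-then-compare-neighbours idea can be written more compactly. This is only a sketch, not the code from the answer above: it assumes Ruby 2.3 or later for `chunk_while`, and `singles_sorted` is a made-up name. Unlike `sort!`, it leaves the caller's array alone, and the result comes back in sorted order rather than in the order shown in the question.

```ruby
# Sketch only: sort a copy, group equal neighbours, keep groups of size 1.
# `singles_sorted` is a hypothetical helper name; requires Ruby 2.3+.
def singles_sorted(arr)
  # sort returns a new array; chunk_while splits it into runs of equal values
  runs = arr.sort.chunk_while { |x, y| x == y }
  # a run of length 1 means the value occurred exactly once
  runs.select { |run| run.size == 1 }.map(&:first)
end

a = [4, 3, 3, 1, 6, 6]
p singles_sorted(a)  # => [1, 4]
p a                  # => [4, 3, 3, 1, 6, 6]
```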

    arr = [4, 3, 3, 1, 6, 6]

    arr.
      group_by { |e| e }.
      map { |e, es| [e, es.length] }.
      reject { |e, count| count > 1 }.
      map(&:first)
    # [4, 1]

Without introducing a separate copy of the original array, and using inject:

    [4, 3, 3, 1, 6, 6].inject({}) { |s, v|
      s[v] ? s.merge({v => s[v] + 1}) : s.merge({v => 1})
    }.select { |k, v| k if v == 1 }.keys
    # => [4, 1]

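I needed something like this, so I tested a few different approaches. Each of these returns an array of the items that are duplicated in the original array: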

    module Enumerable
      def dups
        inject({}) { |h, v| h[v] = h[v].to_i + 1; h }.reject { |k, v| v == 1 }.keys
      end

      def only_duplicates
        duplicates = []
        self.each { |each| duplicates << each if self.count(each) > 1 }
        duplicates.uniq
      end

      def dups_ej
        inject(Hash.new(0)) { |h, v| h[v] += 1; h }.reject { |k, v| v == 1 }.keys
      end

      def dedup
        duplicates = self.dup
        self.uniq.each { |v| duplicates[self.index(v)] = nil }
        duplicates.compact.uniq
      end
    end

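Benchmark results for 100,000 iterations, first with an array of integers, then with an array of strings. Performance will vary with the number of duplicates found, but these tests use a fixed number of duplicates (roughly half of the array entries are duplicates):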

    test_benchmark_integer
                                     user     system      total        real
    Enumerable.dups              2.560000   0.040000   2.600000 (  2.596083)
    Enumerable.only_duplicates   6.840000   0.020000   6.860000 (  6.879830)
    Enumerable.dups_ej           2.300000   0.030000   2.330000 (  2.329113)
    Enumerable.dedup             1.700000   0.020000   1.720000 (  1.724220)

    test_benchmark_strings
                                     user     system      total        real
    Enumerable.dups              4.650000   0.030000   4.680000 (  4.722301)
    Enumerable.only_duplicates  47.060000   0.150000  47.210000 ( 47.478509)
    Enumerable.dups_ej           4.060000   0.030000   4.090000 (  4.123402)
    Enumerable.dedup             3.290000   0.040000   3.330000 (  3.334401)
    ..
    Finished in 73.190988 seconds.
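For anyone who wants to rerun a comparison like this, here is a rough sketch of a harness built on the standard library's `Benchmark.bm`. It is not the harness that produced the numbers above: the data set, the iteration count, and the two candidate names (`count_hash`, `count_scan`) are all assumptions chosen so it finishes quickly, and it uses lambdas instead of patching `Enumerable`.

```ruby
require 'benchmark'

# Hypothetical test data: 200 integers, 100 of which belong to a repeated pair.
data = ((1..150).to_a + (1..50).to_a).shuffle(random: Random.new(42))
iterations = 100  # assumed value; far fewer than the 100,000 quoted above

# Two illustrative ways to find the duplicated values (names are made up).
candidates = {
  # one pass to count occurrences in a hash, then keep keys seen more than once
  'count_hash' => ->(a) { a.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }.reject { |_, c| c == 1 }.keys },
  # rescans the whole array for every element, so quadratic in the array size
  'count_scan' => ->(a) { a.select { |v| a.count(v) > 1 }.uniq }
}

Benchmark.bm(12) do |bm|
  candidates.each do |name, fn|
    bm.report(name) { iterations.times { fn.call(data) } }
  end
end
```

Timings depend on the machine and Ruby version, so treat any output as relative rather than comparable with the table above.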

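Of these approaches, the Enumerable.dedup algorithm appears to be the best:

dup the original array, so the original is left untouched
get the uniq elements of the array
for each unique element: nil out its first occurrence in the dup array
compact the result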
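The steps above can be traced on the array from the question. This is just a walk-through of the same logic outside the module, with the intermediate values shown; the variable names `copy` and `firsts` are mine.

```ruby
arr = [4, 3, 3, 1, 6, 6]

copy = arr.dup                                # 1. work on a copy; arr is never modified
firsts = arr.uniq                             # 2. one representative per value: [4, 3, 1, 6]
firsts.each { |v| copy[arr.index(v)] = nil }  # 3. blank out the first occurrence of each value
p copy                                        # => [nil, nil, 3, nil, nil, 6]
dups = copy.compact.uniq                      # 4. drop the nils; what remains are the repeats
p dups                                        # => [3, 6]
```

One caveat: because nil is used as the marker, an input that itself contains nil will never report nil as a duplicate.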

If only (array - array.uniq) worked! (It doesn't: it removes everything.)
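A quick illustration of why the subtraction comes up empty, plus one way it can still be put to use. `Array#-` removes every occurrence of each element found on the right-hand side, not just one. The second half is my own sketch: it subtracts only the duplicated values, which yields what the question asks for.

```ruby
arr = [4, 3, 3, 1, 6, 6]

# Every value in arr also appears in arr.uniq, so nothing survives.
p arr - arr.uniq  # => []

# Subtracting only the duplicated values leaves the elements that occur once.
dups = arr.select { |v| arr.count(v) > 1 }.uniq  # [3, 6]
p arr - dups      # => [4, 1]
```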

This is my adaptation of the solution Perl programmers use, where a hash accumulates a count for each element of the array:

    ary = [4, 3, 3, 1, 6, 6]

    ary.inject({}) { |h,a|
      h[a] ||= 0
      h[a] += 1
      h
    }.select{ |k,v| v == 1 }.keys # => [4, 1]

It can go on one line, if that matters at all, by judicious use of semicolons between the statements in the block.

A slightly different way is:

    ary.inject({}) { |h,a| h[a] ||= 0; h[a] += 1; h }.map{ |k,v| k if (v==1) }.compact # => [4, 1]

It replaces select{...}.keys with map{...}.compact, so it is not really an improvement, and to me it is a bit harder to understand.
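On newer Rubies the counting step can be handed to the standard library. A minimal sketch, assuming Ruby 2.7 or later where `Enumerable#tally` is available; the rest is the same select-then-keys idea as above.

```ruby
# Assumes Ruby 2.7+ (Enumerable#tally).
ary = [4, 3, 3, 1, 6, 6]

counts = ary.tally                              # value => number of occurrences
singles = counts.select { |_, n| n == 1 }.keys  # keep values seen exactly once
p singles  # => [4, 1]
```

Hashes keep insertion order, so the result follows the order of first appearance in the original array.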
