Performance of Arrays and Hashes in Ruby Performance of Arrays and Hashes in Ruby ruby ruby

Performance of Arrays and Hashes in Ruby


Hashes are much faster for lookups:

require 'benchmark'Document = Struct.new(:id,:a,:b,:c)documents_a = []documents_h = {}1.upto(10_000) do |n|  d = Document.new(n)  documents_a << d  documents_h[d.id] = dendsearchlist = Array.new(1000){ rand(10_000)+1 }Benchmark.bm(10) do |x|  x.report('array'){searchlist.each{|el| documents_a.any?{|d| d.id == el}} }  x.report('hash'){searchlist.each{|el| documents_h.has_key?(el)} }end#                user     system      total        real#array       2.240000   0.020000   2.260000 (  2.370452)#hash        0.000000   0.000000   0.000000 (  0.000695)


When using unique values, you can use the Ruby Set which has been previously mentioned. Here are benchmark results. It's slightly slower than the hash though.

                 user     system      total        realarray        0.460000   0.000000   0.460000 (  0.460666)hash         0.000000   0.000000   0.000000 (  0.000219)set          0.000000   0.000000   0.000000 (  0.000273)

I simply added to @steenslag's code which can be found here https://gist.github.com/rsiddle/a87df54191b6b9dfe7c9.

I used ruby 2.1.1p76 for this test.


Ruby has a set class in its standard library, have you considering keeping an (additional) set of IDs only?

http://stdlib.rubyonrails.org/libdoc/set/rdoc/index.html

To quote the docs: "This is a hybrid of Array’s intuitive inter-operation facilities and Hash’s fast lookup".