ConsistentHash
A simple Ruby library that gives String
the method #consistent_hash
, which
will return consistent values, even between different instances of MRI.
This entire gem is basically just a dozen lines of C. The code is written as a C
extension so that it can perform as well as String#hash
. (See the section on
performance.)
String#consistent_hash
and Java's String#hashCode
methods do exactly the
same thing, and return the same results. This library is not re-inventing the
string hash function, it's just implementing an alternative to Ruby's built-in
#hash
.
Installation
Add this line to your application's Gemfile:
gem 'consistent_hash'
And then execute:
$ bundle
Or install it yourself as:
$ gem install consistent_hash
Motivation
Every time you restart Ruby, the values returned from String#hash
are changed:
$ irb
2.0.0p353 :001 > "hello".hash
=> 482951767139383391
2.0.0p353 :002 > exit
$ irb
2.0.0p353 :001 > "hello".hash
=> 3216751850140847920
2.0.0p353 :002 > exit
This means you cannot rely on the value of String#hash
to precalculate some
data about a string and store the value of #hash
in a database, because if you
restart your Ruby program, the values of String#hash
will be completely
changed.
But with #consistent_hash
, this is no longer the case:
$ irb
2.0.0-p353 :001 > require 'consistent_hash'
=> true
2.0.0-p353 :002 > "hello".consistent_hash
=> 99162322
2.0.0-p353 :003 > exit
$ irb
2.0.0-p353 :001 > require 'consistent_hash'
=> true
2.0.0-p353 :002 > "hello".consistent_hash
=> 99162322
2.0.0-p353 :003 > exit
Performance
If you're concerned about how fast your hash function is, #consistent_hash
will work for you. Because this library is written in C, it's almost as fast as
the built-in hash function, and it's a few orders of magnitude faster than any
Ruby code:
user system total real
naive: 35.620000 0.220000 35.840000 ( 35.841524) # this is written in Ruby
ruby: 30.460000 0.180000 30.640000 ( 30.644228) # this is written in Ruby
c: 0.010000 0.000000 0.010000 ( 0.005697) # this is #consistent_hash
std hash: 0.000000 0.000000 0.000000 ( 0.003163) # this is String#hash
(See benchmark.rb)
Contributing
- Fork it
- Create your feature branch (
git checkout -b my-new-feature
) - Commit your changes (
git commit -am 'Add some feature'
) - Push to the branch (
git push origin my-new-feature
) - Create new Pull Request