Leveldb
LevelDB is a database library (C++, 350 kB) written at Google. It is an embedded database. LevelDB is a persistent ordered map.
LevelDB stores keys and values in arbitrary byte arrays, and data is sorted by key. It supports batching writes, forward and backward iteration, and compression of the data via Google's Snappy compression library. Still, LevelDB is not a SQL database. (Wikipedia)
Features
- Keys and values are arbitrary byte arrays.
- Data is stored sorted by key.
- Callers can provide a custom comparison function to override the sort order (coming soon to this gem, see Todo).
- The basic operations are Put(key,value), Get(key), Delete(key).
- Multiple changes can be made in one atomic batch.
- Users can create a transient snapshot to get a consistent view of data.
- Forward and backward iteration is supported over the data.
- Data is automatically compressed using the Snappy compression library.
- External activity (file system operations etc.) is relayed through a virtual interface so users can customize the operating system interactions.
- Detailed documentation about how to use the library is included with the source code.
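The sorted-by-key behaviour matters when choosing a key format. LevelDB's default comparator orders keys bytewise, and for ASCII keys Ruby's own String comparison gives the same order, so a plain Ruby sketch is enough to show the effect. No database is involved here, and the `user:` key scheme is made up for illustration:

```ruby
# Plain Ruby, no LevelDB involved: sorting ASCII strings compares them byte
# by byte, which is how a bytewise key comparator orders them.
keys = %w[user:9 user:10 user:2]
naive_order = keys.sort

# Zero-padding the numeric part (a hypothetical key scheme) makes the
# bytewise order match the numeric order.
padded = [9, 10, 2].map { |id| format('user:%05d', id) }
padded_order = padded.sort

puts naive_order.inspect
puts padded_order.inspect
```

With unpadded keys, `user:10` sorts before `user:2`, which is rarely what a range scan over numeric ids expects.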
Installation
Development
$ brew install snappy
$ git clone git://github.com/DAddYE/leveldb.git
$ cd leveldb
$ bundle install
$ bundle exec rake compile
$ bundle exec rake console
Standard
$ brew install snappy
$ gem install leveldb
$ irb -r leveldb
Usage
Here is a basic usage example:
db = LevelDB::DB.new '/tmp/foo'
# Writing
db.put('hello', 'world')
db['hello'] = 'world'
# Reading
db.get('hello') # => world
db['hello'] # => world
db.exists?('hello') # => true
# Reading/Writing
db.fetch('hello', 'hello world') # => will write 'hello world' if there is no key 'hello'
db.fetch('hello'){ |key| 'hello world' } # => same as above
# Deleting
db.delete('hello')
# Iterating
db.each { |key, val| puts "Key: #{key}, Val: #{val}" }
db.reverse_each { |key, val| puts "Key: #{key}, Val: #{val}" }
db.keys
db.values
db.map { |k,v| do_some_with(k, v) }
db.reduce([]) { |memo, (k, v)| memo << k + v; memo }
db.each # => enumerator
db.reverse_each # => enumerator
# Ranges
db.range('c', 'd') { |k,v| do_some_with_only_keys_in_range }
db.reverse_range('c', 'd') # => same as above but results are in reverse order
db.range(...) # => enumerable
# Batches
db.batch do |b|
  b.put 'a', 1
  b.put 'b', 2
  b.delete 'c'
end
b = db.batch
b.put 'a', 1
b.put 'b', 2
b.delete 'c'
b.write!
# Snapshots
db.put 'a', 1
db.put 'b', 2
db.put 'c', 3
snap = db.snapshot
db.delete 'a'
db.get 'a' # => nil
snap.set!
db.get('a') # => 1
snap.reset!
db.get('a') # => nil
snap.set!
db.get('a') # => 1
# Properties
db.read_property('leveldb.stats')
# Level Files Size(MB) Time(sec) Read(MB) Write(MB)
# --------------------------------------------------
# 0 1 0 0 0 0
# 1 1 0 0 0 0
# same as:
db.stats
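The range and batch calls above can be combined, for example to delete every key in a range in one batch. The following is only a sketch: `delete_range` is a hypothetical helper, not part of the gem, and `FakeDB` is a made-up in-memory stand-in so the code runs without the gem installed. Whether the real `db.range` includes its upper bound is not stated here, so the stand-in simply assumes inclusive bounds.

```ruby
# Hypothetical in-memory stand-in so the sketch runs without the gem.
# It mimics only the calls used below: put, keys, range and batch.
class FakeDB
  # Collects operations; nothing is applied until the block has finished.
  class Batch
    attr_reader :ops

    def initialize
      @ops = []
    end

    def put(key, value)
      @ops << [:put, key.to_s, value.to_s]
    end

    def delete(key)
      @ops << [:delete, key.to_s]
    end
  end

  def initialize
    @data = {}
  end

  def put(key, value)
    @data[key.to_s] = value.to_s
  end

  def keys
    @data.keys.sort
  end

  # Assumption: both bounds are inclusive in this stand-in.
  def range(from, to)
    @data.sort.each { |key, value| yield key, value if key >= from && key <= to }
  end

  def batch
    batch = Batch.new
    yield batch
    batch.ops.each do |op, key, value|
      if op == :put
        @data[key] = value
      else
        @data.delete(key)
      end
    end
  end
end

# Deletes every key that db.range(from, to) yields, using a single batch.
# Works with any object that responds to range and batch as used above.
def delete_range(db, from, to)
  doomed = []
  db.range(from, to) { |key, _value| doomed << key }
  db.batch { |b| doomed.each { |key| b.delete(key) } }
  doomed.size
end

db = FakeDB.new
%w[a b c d e].each { |key| db.put(key, key.upcase) }
removed = delete_range(db, 'b', 'd')
puts "removed #{removed}, left #{db.keys.inspect}"
```

Collecting the keys first and deleting them afterwards keeps the iteration and the writes separate, so the helper never modifies the data it is still scanning.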
Benchmarks
Preface: these numbers are only a rough indication. I know that zedshaw will kill me for this, but ... on my Mac:
Model Identifier: MacBookPro10,1
Processor Name: Intel Core i7
Processor Speed: 2.3 GHz
Number of Processors: 1
Total Number of Cores: 4
L2 Cache (per Core): 256 KB
L3 Cache: 6 MB
Memory: 8 GB
The benchmark code is in benchmark/leveldb.rb
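For readers without the repository at hand, a benchmark of this shape could look roughly like the sketch below. It is not the content of benchmark/leveldb.rb. It assumes a much smaller data set so it finishes quickly, and it uses a plain Hash as a stand-in for the database, relying only on the `[]` and `[]=` accessors shown in Usage.

```ruby
require 'benchmark'
require 'securerandom'

# Hypothetical sizes, scaled down from the 100 MB run so the sketch is quick.
record_size = 10 * 1024 # 10 KB per value
record_count = 100      # about 1 MB in total

# A Hash stands in for the database here. With the gem installed one could
# swap in LevelDB::DB.new with a scratch path and keep the rest unchanged.
store = {}

# Random bytes, generated up front so that only put/get are timed.
records = Array.new(record_count) do |i|
  [format('key%06d', i), SecureRandom.random_bytes(record_size)]
end

Benchmark.bm(4) do |x|
  x.report('put') { records.each { |key, value| store[key] = value } }
  x.report('get') { records.each { |key, _value| store[key] } }
end
```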
Writing/Reading 100 MB of very random data, 10 KB each:
Without compression:
user system total real
put 0.530000 0.310000 0.840000 ( 1.420387)
get 0.800000 0.460000 1.260000 ( 2.626631)
Level Files Size(MB) Time(sec) Read(MB) Write(MB)
--------------------------------------------------
0 1 0 0 0 0
2 50 98 0 0 0
3 1 2 0 0 0
With compression:
user system total real
put 0.850000 0.320000 1.170000 ( 1.721609)
get 1.160000 0.480000 1.640000 ( 2.703543)
Level Files Size(MB) Time(sec) Read(MB) Write(MB)
--------------------------------------------------
0 1 0 0 0 0
1 5 10 0 0 0
2 45 90 0 0 0
NOTE: as you can see, snappy can't compress this kind of very random data. I was not interested in benchmarking snappy as a compressor, only in seeing how much slower the database gets when using it. As you can see, it is only slightly slower, and on normal data the database size will be much smaller!
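The point about random data can be checked without LevelDB at all. The sketch below swaps in Zlib from Ruby's standard library for Snappy, purely so that it runs without extra dependencies. The absolute sizes will differ from Snappy's, but the contrast between random and repetitive input is the same kind of effect.

```ruby
require 'zlib'
require 'securerandom'

# Zlib stands in for Snappy here (an assumption made so the sketch needs only
# the standard library); Snappy's actual ratios will differ.
size = 10 * 1024
random_data = SecureRandom.random_bytes(size)
repetitive_data = ('hello world ' * size)[0, size]

random_packed = Zlib::Deflate.deflate(random_data).bytesize
repetitive_packed = Zlib::Deflate.deflate(repetitive_data).bytesize

puts "random:     #{size} -> #{random_packed} bytes"
puts "repetitive: #{size} -> #{repetitive_packed} bytes"
```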
With batch:
user system total real
put 0.260000 0.170000 0.430000 ( 0.433407)
Level Files Size(MB) Time(sec) Read(MB) Write(MB)
--------------------------------------------------
0 1 100 1 0 100
Difference between a C++ and a pure Ruby implementation?
This, again, is only a rough indication, but I want to compare the C++ implementation of leveldb-ruby with this one, which uses FFI.
I'm aware that that lib is 1 year older, but for those who care, here is the basic bench:
user system total real
put 0.440000 0.300000 0.740000 ( 1.363188)
get 0.440000 0.440000 1.460000 ( 2.407274)
Todo
- Add pluggable serializers
- Custom comparators
Contributing
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request