Elasticsearch::Utils
Adds utility methods to Elasticsearch::Client instances.
Installation
Add this line to your application's Gemfile:
gem 'elasticsearch-utils'
And then execute:
$ bundle
Or install it yourself as:
$ gem install elasticsearch-utils
Usage
Streaming
Use streaming when you want to iterate over every result of a search (which may be very large, perhaps in a background job) without worrying about paging. This method leverages the scroll feature of Elasticsearch to keep the server-side cost low.
In this example, we search for all Bobs in the index and output their last names. Bobland has a great many Bobs, and deep paging through them would normally tax the server, so we stream instead.
client = Elasticsearch::Client.new my_elasticsearch_config
search_body = {
  query: {
    match: {
      name_first: 'bob'
    }
  }
}
# A hash literal on the right-hand side of an assignment needs braces
search_params = { index: :bobland, type: :person, body: search_body }
client.stream search_params do |doc|
  puts doc['_source']['name_last']
end
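For readers curious how a scroll-backed stream might work, here is a minimal sketch. It is not this gem's implementation: `stream_with_scroll` and `FakeScrollClient` are hypothetical names, and the fake client only imitates the response shape of `Elasticsearch::Client#search` and `#scroll` so that the example runs without a server.

```ruby
# Hypothetical stand-in for Elasticsearch::Client: serves canned documents in
# pages, imitating the shape of scroll responses. Illustration only.
class FakeScrollClient
  def initialize(docs, page_size)
    @pages = docs.each_slice(page_size).to_a
  end

  # First page; a real client would open a scroll context on the server here.
  def search(_params)
    @cursor = 0
    page
  end

  # Subsequent pages, normally addressed by the scroll id.
  def scroll(_params)
    @cursor += 1
    page
  end

  private

  def page
    { '_scroll_id' => "scroll-#{@cursor}", 'hits' => { 'hits' => @pages[@cursor] || [] } }
  end
end

# Assumed helper (not part of this gem): yields every hit, fetching the next
# page through the scroll API until an empty page comes back.
def stream_with_scroll(client, params, keep_alive = '1m')
  response = client.search(params.merge(scroll: keep_alive))
  until response['hits']['hits'].empty?
    response['hits']['hits'].each { |doc| yield doc }
    response = client.scroll(scroll_id: response['_scroll_id'], scroll: keep_alive)
  end
end

docs = %w[Smith Jones Lee].map { |name| { '_source' => { 'name_last' => name } } }
client = FakeScrollClient.new(docs, 2)
params = { index: :bobland }

last_names = []
stream_with_scroll(client, params) { |doc| last_names << doc['_source']['name_last'] }
puts last_names.inspect
```

The caller sees one continuous sequence of documents, while the server only ever holds one page of results per request.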
You can pass a memo variable to the block to track state across results. The return value of your block becomes the memo and is passed to the next iteration; the final memo is returned by stream.
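# SortedSet needs require 'set' (on Ruby 3.0+ it lives in the sorted_set gem)
bob_families = SortedSet.new
bob_families = client.stream search_params do |doc, bob_families|
  # Set#<< returns the set itself, so it becomes the next memo
  bob_families << doc['_source']['name_last']
end
puts "There are #{bob_families.count} families of bobs!"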
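The memo behaves much like the accumulator in a fold. The sketch below shows that pattern on a plain array; `each_with_memo` is a hypothetical helper, and seeding the memo through an argument is an assumption made for illustration, not a statement about how this gem initializes it.

```ruby
require 'set'

# Hypothetical helper (not this gem's API): calls the block with each document
# and the current memo, then uses the block's return value as the next memo.
def each_with_memo(docs, initial_memo)
  memo = initial_memo
  docs.each { |doc| memo = yield(doc, memo) }
  memo
end

docs = %w[Smith Jones Smith].map { |name| { '_source' => { 'name_last' => name } } }

families = each_with_memo(docs, Set.new) do |doc, seen|
  # Set#<< returns the set, so the same set is handed to the next iteration
  seen << doc['_source']['name_last']
end

puts "There are #{families.size} families of bobs!"
```

Because the block's last expression is what gets carried forward, a block that ends in something other than the memo (for example a `puts`) would silently replace it.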
To stop streaming, throw :stop_stream like so:
memo = client.stream search_params do |doc, memo|
  # If you are not using `memo`, you could also use `break`
  throw :stop_stream if memo > 10000
  # Use memo to count total results processed
  memo += 1
end
puts "Streamed #{memo} bobs!"
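Early termination with a thrown symbol is usually built on Ruby's `catch`/`throw`. The following is a minimal sketch of that pattern over a plain range; `stream_until_stopped` is a hypothetical helper and the explicit starting memo is an assumption, so treat it as an illustration rather than this gem's code.

```ruby
# Hypothetical helper (not this gem's code): wraps iteration in catch so that
# `throw :stop_stream` inside the block ends the loop early.
def stream_until_stopped(docs, initial_memo)
  memo = initial_memo
  catch(:stop_stream) do
    docs.each { |doc| memo = yield(doc, memo) }
  end
  memo # the last memo assigned before the throw
end

processed = stream_until_stopped(1..100, 0) do |_doc, count|
  throw :stop_stream if count >= 10
  count + 1
end

puts "Streamed #{processed} docs"
```

In this sketch a `throw` preserves the memo accumulated so far, whereas `break` would leave the method immediately and return the break value instead, which is why `break` only suits blocks that ignore the memo.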
If sorting is not important for your query, even greater efficiency can be achieved by setting the search_type to scan (note that the scan search type was deprecated in Elasticsearch 2.1 and removed in 5.0) like so:
search_params[:search_type] = :scan
client.stream search_params do |doc|
  # handle each bob out of order
end
Contributing
- Fork it ( https://github.com/[my-github-username]/elasticsearch-utils/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request