Repository is archived
No commit activity in last 3 years
No release in over 3 years
Simple utilities built on top of Elasticsearch

 Project Readme

Elasticsearch::Utils

Adds convenience methods to Elasticsearch::Client instances.

Installation

Add this line to your application's Gemfile:

gem 'elasticsearch-utils'

And then execute:

$ bundle

Or install it yourself as:

$ gem install elasticsearch-utils

Usage

Streaming

Use streaming when you want to iterate over every result of a search (which may be very large, for example in a background job) without managing paging yourself. The stream method uses the scroll feature of Elasticsearch to keep the work efficient on the server side.

In this example, we search the index for everyone named Bob and print each last name. Bobland contains a huge number of Bobs, and deep paging through them would normally tax the server, so we stream instead.

client = Elasticsearch::Client.new my_elasticsearch_config

search_body = {
  query: {
    match: {
      name_first: 'bob'
    }
  }
}

search_params = { index: :bobland, type: :person, body: search_body }

client.stream search_params do |doc|
  puts doc['_source']['name_last']
end
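
For context, the scroll-based paging that stream is described as using can be sketched by hand. This is not the gem's source, only a minimal illustration of the pattern: `each_scrolled_hit` and `FakeClient` are hypothetical names, and the fake client stands in for an Elasticsearch::Client so the sketch runs without a cluster. The `search` and `scroll` calls mirror the methods of the same names on the official Ruby client.

```ruby
# A hand-rolled scroll loop, roughly what a streaming helper has to do.
# `client` is any object that responds to #search and #scroll the way
# Elasticsearch::Client does. `each_scrolled_hit` is a hypothetical name.
def each_scrolled_hit(client, search_params, keep_alive = '1m')
  # The first request opens the scroll context and returns the first page.
  response = client.search(search_params.merge(scroll: keep_alive))

  until response['hits']['hits'].empty?
    response['hits']['hits'].each { |doc| yield doc }

    # Each follow-up request fetches the next page using the latest scroll id.
    response = client.scroll(scroll_id: response['_scroll_id'], scroll: keep_alive)
  end
end

# Stand-in client that serves canned pages (illustration only).
class FakeClient
  def initialize(pages)
    @pages = pages
  end

  def search(_params)
    next_page
  end

  def scroll(_params)
    next_page
  end

  private

  # Returns the next canned page, or an empty page once they run out.
  def next_page
    { '_scroll_id' => 'abc', 'hits' => { 'hits' => @pages.shift || [] } }
  end
end

fake = FakeClient.new([
  [{ '_source' => { 'name_last' => 'Smith' } }],
  [{ '_source' => { 'name_last' => 'Jones' } }]
])

last_names = []
each_scrolled_hit(fake, index: :bobland) { |doc| last_names << doc['_source']['name_last'] }
```

Compared with this loop, the stream call above hides the scroll id bookkeeping and the end-of-results check behind a single block.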

The block can take a second argument, a memo, to carry state from one result to the next. Whatever the block returns becomes the memo passed to the next iteration, and the final memo is the return value of stream.

require 'set' # provides SortedSet before Ruby 3.0; later versions need the sorted_set gem

bob_families = SortedSet.new
bob_families = client.stream search_params do |doc, bob_families|
  bob_families << doc['_source']['name_last']
end

puts "There are #{bob_families.count} families of bobs!"
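
The memo contract resembles Ruby's own Enumerable#inject, which may make it easier to reason about. The sketch below uses a plain array of hypothetical documents in place of streamed hits and makes no Elasticsearch call. Note that inject yields the memo first, whereas the stream examples here receive the document first.

```ruby
require 'set'

# Plain-Ruby illustration of the memo contract using Enumerable#inject.
# `docs` stands in for streamed hits and is made up for this example.
docs = [
  { '_source' => { 'name_last' => 'Smith' } },
  { '_source' => { 'name_last' => 'Jones' } },
  { '_source' => { 'name_last' => 'Smith' } }
]

families = docs.inject(Set.new) do |memo, doc|
  # Set#<< returns the set itself, so it becomes the memo for the next doc.
  memo << doc['_source']['name_last']
end

puts "There are #{families.count} families of bobs!"
```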

To stop streaming, throw :stop_stream like so:

memo = client.stream search_params do |doc, memo|
  # If you are not using `memo`, you could also use `break`
  throw :stop_stream if memo > 10000

  # Use memo to count total results processed
  memo += 1
end

puts "Streamed #{memo} bobs!"
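
The early exit relies on Ruby's catch/throw. As a rough sketch of how a helper could support it while still returning the last memo, consider the following. `each_with_memo` is a hypothetical helper written for this illustration and is not taken from the gem.

```ruby
# Hypothetical helper showing how catch/throw can end an iteration early
# while still returning the most recent memo.
def each_with_memo(items, memo)
  catch(:stop_stream) do
    # A throw inside the caller's block unwinds to this catch.
    items.each { |item| memo = yield(item, memo) }
  end
  memo
end

count = each_with_memo(1..1_000, 0) do |_item, memo|
  throw :stop_stream if memo >= 3

  # The block's return value becomes the next memo.
  memo + 1
end

puts "Processed #{count} items"
```

Because the throw happens before the memo is updated, the value returned is the memo from the last completed iteration.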

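If sorting is not important for your query, you can gain further efficiency by setting the search_type to scan. Note that the scan search type was deprecated in Elasticsearch 2.1 and removed in 5.0, so this only applies to older clusters: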

search_params[:search_type] = :scan

client.stream search_params do |doc|
  # handle each bob out of order
end
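
On newer Elasticsearch versions, where the scan search type is no longer available, the Elasticsearch documentation recommends sorting by `_doc` to get the same unordered, scroll-friendly behavior. The parameters below are a sketch of that request shape only. Whether this gem's stream method works against such versions is an assumption that has not been verified here, and `unsorted_params` is an illustrative name.

```ruby
search_body = { query: { match: { name_first: 'bob' } } }

# Sorting by _doc asks Elasticsearch to return hits in index order,
# which is the cheapest order for scrolling.
unsorted_params = {
  index: :bobland,
  body: search_body.merge(sort: ['_doc'])
}
```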

Contributing

  1. Fork it ( https://github.com/[my-github-username]/elasticsearch-utils/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request