Resource Index
Uses Xapian index search to speed up searching ActiveResource objects and reduce calls to backend services.
ActiveResource is good for gathering all objects or one object from a remote service API. ResourceIndex provides tools to help you pick out the one object you want from the full set efficiently, without many calls to the remote service.
Designed for ActiveResource but not dependent on it
ResourceIndex has been designed to work with ActiveResource objects, but can be used on other objects. For example, the tests run on ActiveRecord objects.
See /test/dummy/app/models/thing.rb
Installation
Add this to your Gemfile:
gem 'xapian-ruby' # or your preferred method of installing xapian with ruby bindings
gem 'resource_index'
ResourceIndex uses Xapian, and therefore this needs to be installed, along with bindings for ruby. I have found the easiest way to achieve this is to use the ‘xapian-ruby’ gem.
However, if you already have xapian installed, or wish to use a shared Xapian engine, you may wish to install Xapian and the ruby bindings manually. For installation information see: xapian.org
Also, if ‘xapian-ruby’ fails to install, it can be easier to diagnose the underlying issue by trying to install Xapian and the bindings manually. For example, you may discover that you need to install ‘uuid-dev’ first on some Ubuntu systems.
Usage
class Thing < ActiveResource::Base
  extend ResourceIndex::SearchEngine
end
Thing gets the following class methods:
- Thing.search: the main method used to find objects
- Thing.search_engine: container for the Xapian db engine being used by Thing
- Thing.populate_search_engine: populates the Thing index with the data from all Things
Search
To get all the things containing ‘Something’:
search = Thing.search('Something')
things = search.collect { |s| Thing.find(s.id) }
Search results are weighted and returned in order of relevance, most relevant first. Google ‘xapian weighting’ for more information on the algorithms used. To find the most relevant ‘something’:
search = Thing.search('Something')
thing = Thing.find(search.first.id)
The search results contain the attributes of the objects being searched, and therefore you do not have to retrieve the object to provide meaningful information:
search = Thing.search('Something')
names = search.collect { |match| match.values[:name] }
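The pattern above can be tried without a running index. The following is a minimal sketch, assuming only that each match responds to `id` and `values` as in the examples above. `FakeMatch`, `summaries_from` and `SAMPLE_MATCHES` are hypothetical names used for illustration and are not part of ResourceIndex.

```ruby
# Stand-in for a search match. Assumes matches respond to `id` and `values`,
# as in the examples above. FakeMatch is hypothetical, not part of ResourceIndex.
FakeMatch = Struct.new(:id, :values)

# Build display rows straight from the indexed attributes,
# with no call to the remote service.
def summaries_from(matches)
  matches.collect { |match| { :id => match.id, :name => match.values[:name] } }
end

SAMPLE_MATCHES = [
  FakeMatch.new(1, { :name => 'Something old' }),
  FakeMatch.new(2, { :name => 'Something new' })
]

summaries_from(SAMPLE_MATCHES).each do |row|
  puts "#{row[:id]}: #{row[:name]}"
end
```

In an application, the result of Thing.search would take the place of SAMPLE_MATCHES, and Thing.find would only be needed for the item the user finally selects.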
Search Engine
ResourceIndex uses xapian-fu’s XapianDb to provide its functionality. The current XapianDb object is exposed via the search_engine class method. See github.com/johnl/xapian-fu
By default, when ResourceIndex is installed within a Rails app, the index databases will be installed at /db/resource_index. You can specify a different location:
ResourceIndex::Search.root = 'path/to/dir'
If you specify the location, the path to that location should exist or you will get errors.
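One way to avoid those errors is to create the directory before assigning the root. This is a sketch rather than anything ResourceIndex does for you: `ensure_index_root` and `INDEX_ROOT` are hypothetical names, and a temporary directory is used only so the example runs anywhere.

```ruby
require 'fileutils'
require 'tmpdir'

# Create the index root (and any missing parent directories) and return its path.
# ensure_index_root is a hypothetical helper, not part of ResourceIndex.
def ensure_index_root(path)
  FileUtils.mkdir_p(path)
  path
end

# A temporary directory stands in for the location you have chosen.
INDEX_ROOT = ensure_index_root(File.join(Dir.mktmpdir, 'search', 'resource_index'))

# Only assign the root when ResourceIndex has actually been loaded.
ResourceIndex::Search.root = INDEX_ROOT if defined?(ResourceIndex::Search)
```

FileUtils.mkdir_p does nothing if the directory already exists, so this is safe to run on every boot, for example from an initializer.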
Each indexed object gets its own folder within the search root, so you do not have to specify a root for each object. For example, the index files for Thing will be created in db/resource_index/thing/
To prevent these files being version controlled, add this to your project root .gitignore file (or equivalent):
/db/resource_index/*
Populate Search Engine
The populate_search_engine method will replace the existing Xapian database with one populated from all the objects currently returned by the remote service.
Thing.populate_search_engine
This will add all the things gathered by Thing.all, and create index entries for them.
There is also a rake task that runs this method on a given resource:
rake resource_index:populate resource=Thing
To populate a production system, use:
rake resource_index:populate resource=Thing RAILS_ENV=production
Spelling correction hints
Xapian provides a spelling correction prompt if no records match the current search:
Thing.create :name => 'Moose'
results = Thing.search 'mouse'
unless results.corrected_query.empty?
  puts "Did you mean '#{results.corrected_query}'" # --> "Did you mean 'moose'"
end
Search options
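The check above can be wrapped in a small helper for use in views or controllers. This is a sketch, assuming only that the results object responds to `corrected_query` as shown. `FakeResults` and `did_you_mean` are hypothetical names and are not provided by ResourceIndex or xapian-fu.

```ruby
# Stand-in for a result set. Assumes only that results respond to
# `corrected_query`, as in the example above. FakeResults is hypothetical.
FakeResults = Struct.new(:corrected_query)

# Return a "Did you mean" prompt, or nil when there is no correction.
def did_you_mean(results)
  suggestion = results.corrected_query.to_s
  suggestion.empty? ? nil : "Did you mean '#{suggestion}'?"
end

prompt = did_you_mean(FakeResults.new('moose'))
puts prompt if prompt
```

Returning nil when there is no suggestion lets the caller decide whether to show anything at all.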
You can pass options through to the underlying xapian_fu object. See github.com/johnl/xapian-fu
For example:
Thing.search(:all, :page => 1, :per_page => 20)
Note that by default, search returns the first 10 results.
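Because only one page is returned at a time, collecting every match means requesting successive pages. The sketch below shows one way to do that, using the :page and :per_page options from the example above. `all_matches`, `FAKE_INDEX` and `FAKE_SEARCH` are hypothetical names, and the lambda is a stand-in for the real search so that the example runs without an index.

```ruby
# Collect every match by requesting successive pages until a page comes
# back short. `search` is any callable that takes (query, options).
# all_matches is a hypothetical helper, not part of ResourceIndex.
def all_matches(search, query, per_page = 20)
  collected = []
  page = 1
  loop do
    options = { :page => page, :per_page => per_page }
    matches = search.call(query, options).to_a
    collected.concat(matches)
    break if matches.size < per_page
    page += 1
  end
  collected
end

# Hypothetical stand-in that pages over a fixed array instead of a Xapian index.
FAKE_INDEX = (1..45).to_a
FAKE_SEARCH = lambda do |_query, options|
  start = (options[:page] - 1) * options[:per_page]
  FAKE_INDEX[start, options[:per_page]] || []
end

puts all_matches(FAKE_SEARCH, :all).size
```

With a real model, a lambda that forwards to Thing.search would replace FAKE_SEARCH. That assumes search accepts a query followed by an options hash, as in the example above, and has not been checked against every xapian-fu version.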