textacular
DESCRIPTION:
Textacular exposes full text search capabilities from PostgreSQL, extending ActiveRecord with scopes that make search easy and fun!
FEATURES/PROBLEMS:
- Only works with PostgreSQL
- Anything that mucks with the SELECT statement (notably pluck) is likely to cause problems.
SYNOPSIS:
Quick Start
In the project's Gemfile add
gem 'textacular', '~> 5.0'
Rails 3, Rails 4
In the project's Gemfile add
gem 'textacular', '~> 4.0'
ActiveRecord outside of Rails
require 'textacular'
ActiveRecord::Base.extend(Textacular)
Usage
Your models now have access to search methods:
The #basic_search method is what you might expect: it looks literally for what
you send to it, doing nothing fancy with the input:
Game.basic_search('Sonic') # will search through the model's :string columns
Game.basic_search(title: 'Mario', system: 'Nintendo')
The #advanced_search method lets you use Postgres's search syntax like '|',
'&' and '!' ('or', 'and', and 'not') as well as some other craziness. The ideal
use for advanced_search is to take a search DSL you make up for your users and
translate it to PG's syntax. If for some reason you want to put user input
directly into an advanced search, you should be sure to catch exceptions from
syntax errors. Check [the Postgres docs](http://www.postgresql.org/docs/9.2/static/datatype-textsearch.html) for more:
Game.advanced_search(title: 'Street|Fantasy')
Game.advanced_search(system: '!PS2')
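As a rough illustration of the "make up a DSL and translate it" idea, here is a minimal sketch in plain Ruby. The DSL itself (words are ANDed, the keyword OR separates alternatives, a leading "-" negates a word) and the helper name to_tsquery_syntax are assumptions made up for this example; they are not part of Textacular.

```ruby
# Hypothetical helper, not part of Textacular.
# Translates a made-up user DSL into Postgres tsquery operators:
#   words separated by spaces -> joined with '&' (and)
#   the keyword OR            -> '|' (or)
#   a leading '-'             -> '!' (not)
# Any other punctuation is stripped so stray operators typed by a user
# cannot end up in the generated query string.
def to_tsquery_syntax(input)
  groups = input.split(/\s+OR\s+/).map do |group|
    terms = group.split.map do |token|
      negated = token.start_with?('-')
      word = token.gsub(/[^[:alnum:]]/, '')
      next if word.empty? # token was only punctuation

      negated ? "!#{word}" : word
    end
    terms.compact.join('&')
  end
  groups.reject(&:empty?).join('|')
end

puts to_tsquery_syntax('Street OR Fantasy')
puts to_tsquery_syntax('Street Fighter OR Final Fantasy -PS2')
```

The result could then be passed along, for example as Game.advanced_search(title: to_tsquery_syntax(user_input)). Because '&' binds tighter than '|' in tsquery syntax, each OR-separated group is matched as a unit. Catching exceptions is still a sensible precaution, since this sketch does not guard against every possible input (an empty string, for instance).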
The #web_search method lets you use the websearch_to_tsquery function
(PostgreSQL 11+), which supports web-search-like syntax:
- unquoted text: text not inside quote marks will be converted to terms separated by & operators, as if processed by plainto_tsquery.
- "quoted text": text inside quote marks will be converted to terms separated by <-> operators, as if processed by phraseto_tsquery.
- OR: logical or will be converted to the | operator.
- -: the logical not operator, converted to the ! operator.
Game.web_search(title: '"Street Fantasy"')
Game.web_search(title: 'Street OR Fantasy')
Game.web_search(system: '-PS2')
Finally, the #fuzzy_search method lets you use Postgres's trigram search
functionality.
In order to use this, you'll need to make sure your database has the pg_trgm
module installed. Create and run a migration to install the module:
rake textacular:create_trigram_migration
rake db:migrate
Once that's installed, you can use it like this:
Comic.fuzzy_search(title: 'Questio') # matches Questionable Content
Note that fuzzy searches are subject to a similarity threshold imposed by the pg_trgm
module. The default is 0.3, meaning that roughly 30% of the combined trigrams of the search term and the column value must be shared for a row to match. For example:
Comic.fuzzy_search(title: 'Pearls') # matches Pearls Before Swine
Comic.fuzzy_search(title: 'Pear') # does not match Pearls Before Swine
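To see why 'Pearls' matches and 'Pear' does not, it can help to compute the similarity by hand. The following is a plain-Ruby approximation of how pg_trgm scores two strings, based on its documented behaviour (lowercase the text, treat non-alphanumeric characters as word separators, pad each word with two leading spaces and one trailing space, then divide shared trigrams by total distinct trigrams). The method names are made up for this sketch, and the real module may differ in edge cases.

```ruby
require 'set'

# Approximation of pg_trgm's trigram extraction (an assumption based on
# its documentation, not a port of its source).
def trigrams(text)
  grams = text.downcase.scan(/[[:alnum:]]+/).flat_map do |word|
    padded = "  #{word} " # two spaces before, one after
    (0..padded.length - 3).map { |i| padded[i, 3] }
  end
  grams.to_set
end

# Shared trigrams divided by all distinct trigrams of both strings.
def trigram_similarity(a, b)
  ta = trigrams(a)
  tb = trigrams(b)
  union = ta | tb
  return 0.0 if union.empty?

  (ta & tb).size.to_f / union.size
end

puts trigram_similarity('Pearls', 'Pearls Before Swine').round(2) # 0.35
puts trigram_similarity('Pear', 'Pearls Before Swine').round(2)   # 0.19
```

Under these assumptions 'Pearls' scores 7/20 = 0.35, just above the 0.3 default, while 'Pear' scores 4/21, about 0.19, which falls below it.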
PostgreSQL sets the default similarity threshold, but you can change it on a per-connection basis, for example:
ActiveRecord::Base.connection.execute("SELECT set_limit(0.9);")
For more info, view the pg_trgm documentation, specifically F.35.2. Functions and Operators.
Searches are also chainable:
Game.fuzzy_search(title: 'tree').basic_search(system: 'SNES')
If you want to search on two or more fields with the OR operator, use a hash for the conditions and pass false as the second parameter:
Game.basic_search({name: 'Mario', nickname: 'Mario'}, false)
Setting Language
To set the search dictionary, override the searchable_language class method on your model:
def self.searchable_language
'russian'
end
Your queries will then use that dictionary. Don't forget to change the language in the index migration as well, as shown below.
Setting Searchable Columns
To change the default behavior of searching all text and string columns, override the searchable_columns class method on your model:
def self.searchable_columns
[:column1, :column2]
end
Creating Indexes for Super Speed
You can have PostgreSQL use an index for the full-text search. To declare a full-text index, add code like the following to a migration:
For basic_search
add_index :email_logs, %{to_tsvector('english', subject)}, using: :gin
add_index :email_logs, %{to_tsvector('english', email_address)}, using: :gin
For fuzzy_search
add_index :email_logs, :subject, using: :gist, opclass: :gist_trgm_ops
add_index :email_logs, :email_address, using: :gist, opclass: :gist_trgm_ops
In the above example, the table email_logs has two text columns that we search against, subject and email_address. You will need to add an index for every text/string column you query against, or else PostgreSQL will fall back to a full table scan instead of using the indexes.
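If a model overrides searchable_language (see Setting Language), the basic_search index expression presumably needs to name the same dictionary, since PostgreSQL only uses an expression index when the query expression matches it. One way to keep the two in sync is a small helper that builds the expression string; tsvector_index_expression below is a hypothetical name invented for this sketch, not something Textacular provides.

```ruby
# Hypothetical helper, not part of Textacular.
# Builds the expression passed to add_index for a full-text (GIN) index.
# Both arguments are checked against a conservative identifier pattern
# because they are interpolated directly into SQL.
def tsvector_index_expression(language, column)
  [language, column].each do |name|
    unless name.to_s.match?(/\A[a-z_][a-z0-9_]*\z/i)
      raise ArgumentError, "unexpected identifier: #{name.inspect}"
    end
  end
  "to_tsvector('#{language}', #{column})"
end

puts tsvector_index_expression('russian', :subject)

# Possible use inside a migration (assumes the email_logs table above):
#   add_index :email_logs, tsvector_index_expression('russian', :subject), using: :gin
```

Passing the model's own searchable_language value as the first argument would keep the index and the queries on the same dictionary.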
REQUIREMENTS:
- ActiveRecord
- Ruby 1.9.2
INSTALL:
$ gem install textacular
Contributing
If you'd like to contribute, please see the contribution guidelines.
Releasing
Maintainers: Please make sure to follow the release steps when it's time to cut a new release.
LICENSE:
(The MIT License)
Copyright (c) 2011 Aaron Patterson
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the 'Software'), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.