Scripsi

A flexible text-searching library built on top of redis.

Sorted suffix indexing

Sorted suffix indexing allows you to search for any substring within a set of documents. First, index a collection of documents and associated ids.

require 'scripsi'
Scripsi.connect  # connect to a running redis server

ssi = Scripsi::SortedSuffixIndexer.new "myindexer"
ssi.index('1',"Epistulam ad te scripsi.")
ssi.index('2',"I've written you a letter.")
ssi.index('3',"Quisnam Tusculo epistulam me misit?")
ssi.index('4',"Who in Tusculum would've sent me a letter?")

You can then search for any substring, and the indexer will return the ids of the documents where that substring appears.

ssi = Scripsi.indexer "myindexer"
ssi.search("te")        # => ["1","2","4"]
ssi.search("Tuscul")    # => ["3","4"]
ssi.search("Tusculu")   # => ["4"]
ssi.search("you a le")  # => ["2"]

For more detail about each match, such as its position within the document, use the matches method:

match = ssi.matches("you a le").first
match.doc    # => "2"
match.start  # => 13
match.end    # => 21

ssi.documents[match.doc][match.start...match.end]  # => "you a le"
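
Because the end offset is exclusive, the start and end values can be used directly to slice the document text. The following is a minimal sketch of a helper that shows a match in context. The name snippet and its context parameter are hypothetical and not part of Scripsi; it works on a plain string, so it needs no Redis connection.

```ruby
# Hypothetical helper, not part of Scripsi: wrap the matched span in brackets
# and keep up to `context` characters of surrounding text on either side.
def snippet(text, start, stop, context: 8)
  from = [start - context, 0].max          # clamp to the beginning of the text
  to   = [stop + context, text.length].min # clamp to the end of the text
  "#{text[from...start]}[#{text[start...stop]}]#{text[stop...to]}"
end

doc = "I've written you a letter."
puts snippet(doc, 13, 21)  # prints: written [you a le]tter.
```

With a real index, the same call would presumably take ssi.documents[match.doc], match.start and match.end as its arguments.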

You can also retrieve the stored documents efficiently:

ssi.documents  # lazy list of documents
ssi.documents['3']  # document with id '3'
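
To illustrate the idea behind sorted suffix indexing, here is a small in-memory sketch. It is not Scripsi's implementation and does not use Redis; the class name TinySuffixIndex and its internals are assumptions made for illustration only. Every suffix of every document is stored with its document id and offset, the list is kept sorted, and a binary search finds the first suffix that begins with the search term.

```ruby
# Hypothetical in-memory illustration; not Scripsi's implementation or API.
class TinySuffixIndex
  Entry = Struct.new(:suffix, :doc, :start)

  def initialize
    @entries = []
  end

  # Store every suffix of the text together with its document id and offset,
  # then keep the whole list sorted by suffix.
  def index(id, text)
    text.length.times { |i| @entries << Entry.new(text[i..-1], id, i) }
    @entries.sort_by!(&:suffix)
    self
  end

  # Suffixes that begin with the term are adjacent in sorted order, so a
  # binary search finds the first one and a short scan collects the rest.
  def matches(term)
    first = @entries.bsearch_index { |e| e.suffix >= term }
    return [] if first.nil?
    @entries[first..-1].take_while { |e| e.suffix.start_with?(term) }
  end

  # Ids of the documents that contain the term, without duplicates.
  def search(term)
    matches(term).map(&:doc).uniq.sort
  end
end

idx = TinySuffixIndex.new
idx.index('1', "Epistulam ad te scripsi.")
idx.index('2', "I've written you a letter.")
idx.index('3', "Quisnam Tusculo epistulam me misit?")
idx.index('4', "Who in Tusculum would've sent me a letter?")

p idx.search("Tuscul")                 # => ["3", "4"]
p idx.matches("you a le").first.start  # => 13
```

Storing full suffixes takes memory quadratic in document length, which is acceptable for a demonstration but not for large texts. Matching is case-sensitive because suffixes are compared as raw strings.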