PristineText
This gem lowercases text with unicode_utils, removes non-letter characters, strips and squeezes whitespace, and then optionally stems every word with stemwords (from libstemmer-tools).
Installation
Add this line to your application's Gemfile:
gem 'pristine_text'
And then execute:
$ bundle
Or install it yourself as:
$ gem install pristine_text
Usage
require "pristine_text"
puts PristineText.clean("haberler geliyorlar gidiyorlar", :tr)
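To make the cleaning steps described at the top more concrete, here is a minimal sketch that approximates them with core Ruby only (2.4 or later). It is not the gem's implementation: `sketch_clean` is a hypothetical name, `String#downcase` stands in for unicode_utils, replacing non-letters with a space (rather than deleting them outright) is an assumption, and the optional stemwords step is left out.

```ruby
# Hypothetical helper, not part of pristine_text. It approximates the
# non-stemming steps using core Ruby instead of unicode_utils.
def sketch_clean(text, lang = nil)
  # Turkish and Azerbaijani need the :turkic mapping so that "I" becomes "ı"
  # and "İ" becomes "i"; other languages use the default Unicode mapping.
  lowered = %i[tr az].include?(lang) ? text.downcase(:turkic) : text.downcase

  lowered
    .gsub(/[^\p{L}\s]/, " ") # assumption: non-letters become a space
    .gsub(/\s+/, " ")        # squeeze runs of whitespace into one space
    .strip                   # drop leading and trailing whitespace
end

puts sketch_clean("  Haberler GELİYOR, 3 kez!  ", :tr)
# prints "haberler geliyor kez"
```

Replacing non-letters with a space keeps words such as "a,b" from being glued together; the gem itself may handle that case differently.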
Contributing
- Fork it ( https://github.com/nurettin/pristine_text/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request