JiebaRb

Ruby extension for CppJieba, a C++ library for Chinese word segmentation.

Installation

Add this line to your application's Gemfile:

gem 'jieba_rb'

And then execute:

$ bundle

Or install it yourself as:

$ gem install jieba_rb

Word Segmentation Usage

Mix segment mode (HMM combined with max probability; this is the default):

require 'jieba_rb'
seg = JiebaRb::Segment.new  # equivalent to "JiebaRb::Segment.new mode: :mix"
words = seg.cut "令狐冲是云计算行业的专家"
# 令狐冲 是 云 计算 行业 的 专家

Mix Segment mode with user-defined dictionary:

seg = JiebaRb::Segment.new mode: :mix, user_dict: "ext/cppjieba/dict/user.dict.utf8"
words = seg.cut "令狐冲是云计算行业的专家"
# 令狐冲 是 云计算 行业 的 专家
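
The two results above differ only in how "云计算" is handled. A minimal sketch of checking that difference programmatically follows. It assumes seg.cut returns an Array of Strings, uses the README's sample outputs as literal data so it runs without the gem, and segmentation_diff is a hypothetical helper, not part of jieba_rb.

```ruby
# Sample outputs copied from the README comments above. In real use these
# would come from seg.cut (assumed here to return an Array of Strings).
default_words   = %w[令狐冲 是 云 计算 行业 的 专家]
user_dict_words = %w[令狐冲 是 云计算 行业 的 专家]

# Hypothetical helper: report tokens that appear in only one segmentation.
def segmentation_diff(before, after)
  { removed: before - after, added: after - before }
end

diff = segmentation_diff(default_words, user_dict_words)
puts "only without user dict: #{diff[:removed].join(' ')}"
puts "only with user dict:    #{diff[:added].join(' ')}"
```

A set-style difference is enough here because the sentences are short; for longer texts with repeated words, a position-aware diff would be more accurate.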

HMM or Max probability (mp) Segment mode:

seg = JiebaRb::Segment.new mode: :hmm # or mode: :mp
words = seg.cut "令狐冲是云计算行业的专家"
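
When the mode comes from configuration or user input, it may help to validate it before constructing the segmenter. The sketch below is one way to do that. It assumes :mix, :hmm and :mp (the modes named in this README) are the only valid values, and segment_options is a hypothetical helper, not part of jieba_rb.

```ruby
# Hypothetical helper (not part of jieba_rb): validate the mode and build
# the keyword arguments for JiebaRb::Segment.new.
def segment_options(mode: :mix, user_dict: nil)
  # Modes named in this README; assumed here to be the complete list.
  documented_modes = %i[mix hmm mp]
  unless documented_modes.include?(mode)
    raise ArgumentError,
          "unknown mode #{mode.inspect}; expected one of #{documented_modes.inspect}"
  end

  options = { mode: mode }
  options[:user_dict] = user_dict if user_dict
  options
end

opts = segment_options(mode: :hmm)
puts opts.inspect
# With the gem installed, the options could be passed on as:
#   seg = JiebaRb::Segment.new(**opts)
```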

Word Tagging Usage

Default tagging:

require 'jieba_rb'
tagging = JiebaRb::Tagging.new
pairs = tagging.tag "我是蓝翔技工拖拉机学院手扶拖拉机专业的。"
# [{"我"=>"r"}, {"是"=>"v"}, {"蓝翔"=>"x"}, {"技工"=>"n"}, {"拖拉机"=>"n"}, {"学院"=>"n"}, {"手扶拖拉机"=>"n"}, {"专业"=>"n"}, {"的"=>"uj"}, {"。"=>"x"}]

Tagging with user-defined dictionary:

require 'jieba_rb'
tagging = JiebaRb::Tagging.new user_dict: :default
pairs = tagging.tag "我是蓝翔技工拖拉机学院手扶拖拉机专业的。"
# [{"我"=>"r"}, {"是"=>"v"}, {"蓝翔"=>"nz"}, {"技工"=>"n"}, {"拖拉机"=>"n"}, {"学院"=>"n"}, {"手扶拖拉机"=>"n"}, {"专业"=>"n"}, {"的"=>"uj"}, {"。"=>"x"}]
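
The tagging result is an array of single-entry hashes, which is awkward to filter directly. The sketch below flattens it into [word, tag] pairs and picks out noun-like words. It uses the sample output above as literal data so it runs without the gem, and treating every tag that starts with "n" as noun-like is an assumption about the tag set.

```ruby
# Sample output copied from the README comment above. In real use this
# would come from tagging.tag.
pairs = [
  { "我" => "r" }, { "是" => "v" }, { "蓝翔" => "nz" }, { "技工" => "n" },
  { "拖拉机" => "n" }, { "学院" => "n" }, { "手扶拖拉机" => "n" },
  { "专业" => "n" }, { "的" => "uj" }, { "。" => "x" }
]

# Flatten each single-entry hash into a [word, tag] array.
word_tags = pairs.flat_map(&:to_a)

# Keep words whose tag starts with "n" (assumed to mark noun-like tags).
nouns = word_tags.select { |_word, tag| tag.start_with?("n") }.map(&:first)

# Group words by tag for a quick overview.
by_tag = word_tags.group_by(&:last).transform_values { |wt| wt.map(&:first) }

puts nouns.join(" ")
puts by_tag.inspect
```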

Keyword Extractor Usage

Only TF-IDF is supported currently:

require 'jieba_rb'
keyword = JiebaRb::Keyword.new
keywords_weights = keyword.extract "我是拖拉机学院手扶拖拉机专业的。不用多久,我就会升职加薪,当上CEO,走上人生巅峰。", 5
# [["CEO", 11.739204307083542], ["升职", 10.8561552143], ["加薪", 10.642581114], ["手扶拖拉机", 10.0088573539], ["巅峰", 9.49395840471]]
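
The extractor returns [keyword, weight] pairs, so ordinary Array and Hash methods are enough for post-processing. The sketch below turns the pairs into a lookup table and computes each keyword's share of the total weight. It uses the sample output above as literal data so it runs without the gem, and the variable names are illustrative only.

```ruby
# Sample output copied from the README above. In real use this would come
# from keyword.extract.
keywords_weights = [
  ["CEO", 11.739204307083542],
  ["升职", 10.8561552143],
  ["加薪", 10.642581114],
  ["手扶拖拉机", 10.0088573539],
  ["巅峰", 9.49395840471]
]

# Build a keyword => weight lookup table.
weights = keywords_weights.to_h

# Express each weight as a fraction of the total, for relative comparison.
total  = weights.values.sum
shares = weights.transform_values { |w| w / total }

# Pick the highest-weighted keyword without relying on the input order.
top_keyword = keywords_weights.max_by { |_word, weight| weight }.first

puts "top keyword: #{top_keyword}"
shares.each { |word, share| puts format("%s %.3f", word, share) }
```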

Contributing

  1. Fork it ( http://github.com//jieba_rb/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request