Project

kusari

No commit activity in last 3 years
No release in over 3 years
Japanese random sentence generator based on Markov chain.
 Dependencies

Development

~> 1.10
~> 10.0
>= 0

Runtime

~> 0.1.5
 Project Readme

Kusari

Japanese random sentence generator based on Markov chain.

Installation

$ gem install kusari

Usage

First, load the gem and create a generator instance:

require 'kusari'
generator = Kusari::Generator.new
# by default, the above statement is the same as:
#   generator = Kusari::Generator.new(3, "./ipadic")

The first argument, 3, is the N of the N-gram model used to build the tokenized word table; you can pass a different number. The second argument, ./ipadic, is the path to the IPA dictionary, which the generator uses to parse Japanese strings.
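
To make the role of N more concrete, here is a minimal sketch in plain Ruby. It is not Kusari's implementation: ngrams is a hypothetical helper, and the tokens are split by hand for illustration rather than produced by the IPA dictionary.

```ruby
# Hypothetical helper (not part of Kusari): list every run of n consecutive tokens.
def ngrams(tokens, n)
  tokens.each_cons(n).to_a
end

# Hand-split tokens, assumed for illustration; a real tokenizer may split differently.
tokens = %w[ネロ は 、 少ćčŽ でした 。]

# With n = 3, each window holds three neighbouring tokens.
ngrams(tokens, 3).each { |gram| puts gram.join(" ") }
```

A larger N keeps more context per window, so generated text tends to stay closer to the reference sentences.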

Next, add strings (the reference sentences for the Markov chain) with add_string:

generator.add_string("ネロずパトラッシュは、この侖でäșŒäșșきりでした。")
generator.add_string("ćœŒă‚‰ăŻă€ćźŸăźć…„ćŒŸă‚ˆă‚Šă‚‚ä»Čăźă‚ˆă„ć€§ăźèŠȘ揋でした。")
generator.add_string("ăƒăƒ­ăŻă€ă‚ąăƒ«ăƒ‡ăƒłăƒç”ŸăŸă‚Œăźć°‘ćčŽă§ă—ăŸă€‚")
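
How reference sentences could feed a Markov chain can be sketched as follows. This is an assumption about the general technique, not a description of Kusari's internals: build_table is a hypothetical helper that maps each (N-1)-token prefix to the tokens seen right after it, and the token lists are split by hand.

```ruby
# Hypothetical helper (not Kusari's API): map each (n-1)-token prefix
# to the list of tokens observed immediately after it.
def build_table(token_lists, n)
  table = Hash.new { |hash, key| hash[key] = [] }
  token_lists.each do |tokens|
    tokens.each_cons(n) do |gram|
      table[gram[0...-1]] << gram[-1]
    end
  end
  table
end

# Hand-split reference sentences, assumed for illustration.
sentences = [
  %w[ネロ は 、 少ćčŽ でした 。],
  %w[ネロ は 、 èŠȘ揋 でした 。]
]

table = build_table(sentences, 3)
p table[%w[は 、]] # the tokens that followed "は 、" in the references
```

A prefix shared by several sentences collects followers from each of them, which is what lets a generated sentence switch from one reference sentence to another.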

The tokenized word table can also be saved to a local file:

generator.save("tokenized_table.markov")

It can be loaded back with:

generator.load("tokenized_table.markov")
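
This README does not describe the file format Kusari writes. As a rough sketch of the save and load round trip, the example below persists a plain Hash with Ruby's Marshal; the choice of Marshal, the helper names save_table and load_table, and the table contents are all assumptions made for illustration.

```ruby
require "tmpdir"

# Hypothetical helpers (not Kusari's API): persist a word table with Marshal.
def save_table(table, path)
  File.binwrite(path, Marshal.dump(table))
end

def load_table(path)
  # Marshal.load should only be used on files you trust.
  Marshal.load(File.binread(path))
end

# A plain Hash is used here because a Hash with a default block cannot be dumped.
table = { %w[ネロ は] => ["、"], %w[は 、] => %w[少ćčŽ èŠȘ揋] }

Dir.mktmpdir do |dir|
  path = File.join(dir, "tokenized_table.markov")
  save_table(table, path)
  puts load_table(path) == table
end
```

Saving the table means the reference sentences do not have to be tokenized again on every run.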

Finally, generate a random sentence:

generator.generate(140)
# => "ăƒăƒ­ăŻă€ă‚ąăƒ«ăƒ‡ăƒłăƒç”ŸăŸă‚Œăźć…„ćŒŸă‚ˆă‚Šă‚‚ä»Čăźă‚ˆă„ć€§ăźć°‘ćčŽă§ă—ăŸă€‚"

The argument to generate sets the maximum length of the generated sentence; for example, generator.generate(140) produces a sentence short enough to post on Twitter.

License

MIT