A random Japanese sentence generator based on Markov chains.
Installation
$ gem install kusari
Usage
First, load the gem and create a new generator instance:
require 'kusari'
generator = Kusari::Generator.new
# by default, the above statement is the same as:
# generator = Kusari::Generator.new(3, "./ipadic")
Note that the first argument, 3, is the N of the N-gram model used to build the tokenized word table; you can pass any number. The second argument, ./ipadic, tells the generator the path to the IPA dictionary, a dictionary for parsing Japanese strings.
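To make the role of N concrete, here is a minimal sketch of how an N-gram table could be built from a list of tokens. It is not Kusari's actual implementation: the build_ngram_table helper and the hand-split token list are assumptions for illustration only, and a real run would get its tokens from the dictionary-based parser.

```ruby
# Illustrative only: a hypothetical helper, not part of the kusari gem.
# Maps each (N-1)-token prefix to the list of tokens seen right after it.
def build_ngram_table(tokens, n)
  table = Hash.new { |hash, key| hash[key] = [] }
  tokens.each_cons(n) do |gram|
    prefix = gram[0...-1]       # the first N-1 tokens
    table[prefix] << gram.last  # the token that followed them
  end
  table
end

# Made-up, hand-split tokens; a real tokenizer would produce these.
tokens = %w[ネロ は 少年 。 ネロ は 親友 。]
table = build_ngram_table(tokens, 3)

followers = table[%w[ネロ は]]
puts followers.inspect
```

With N = 3, each two-token prefix remembers every token that followed it, so a prefix seen more than once offers several continuations. A larger N keeps generated text closer to the reference sentences, while a smaller N mixes them more freely.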
Next, add strings (the reference sentences for the Markov chain):
generator.add_string("ネロとパトラッシュは、この世で二人きりでした。")
generator.add_string("彼らは、実の兄弟よりも仲のよい大の親友でした。")
generator.add_string("ネロは、アルデンネ生まれの少年でした。")
In addition, the tokenized word table can be saved to a local file:
generator.save("tokenized_table.markov")
It can be loaded again with:
generator.load("tokenized_table.markov")
Finally, generate a random sentence:
generator.generate(140)
# => "ネロは、アルデンネ生まれの兄弟よりも仲のよい大の少年でした。"
Here, the argument of the generate method sets the maximum length of the generated sentence; generator.generate(140) creates a sentence that can be posted on Twitter, for example.
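The length limit can be pictured as a stopping rule on a random walk over the word table. The sketch below is an assumption about how such a walk might look, not Kusari's code: generate_sentence, the small hand-written table, and counting the limit in characters are all hypothetical.

```ruby
# Illustrative only: a hypothetical length-limited walk, not part of kusari.
# Assumes the limit is counted in characters and the start fits within it.
def generate_sentence(table, start, limit, rng: Random.new)
  sentence = start.join
  prefix = start.dup
  loop do
    candidates = table[prefix]
    break if candidates.nil? || candidates.empty?     # no known continuation
    word = candidates.sample(random: rng)             # pick a follower at random
    break if sentence.length + word.length > limit    # would exceed the limit
    sentence << word
    prefix = prefix.drop(1) + [word]                  # slide the prefix window
  end
  sentence
end

# Made-up word table for N = 3 (two-token prefixes).
table = {
  %w[ネロ は] => %w[少年 親友],
  %w[は 少年] => %w[。],
  %w[は 親友] => %w[。]
}

result = generate_sentence(table, %w[ネロ は], 140)
short = generate_sentence(table, %w[ネロ は], 4)
puts result
puts short
```

In this sketch the walk ends either when the current prefix has no recorded follower or when the next word would push the sentence past the limit, so the result never exceeds the requested length.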
License
MIT