Project

koota

0.0
No commit activity in last 3 years
No release in over 3 years
Koota generates words given a pattern.

Koota

Koota (pronounced /ˈkoː.tɑ/, means “to assemble” in Finnish) generates words based on a pattern, similar to Awkwords.

It was created as an experiment to see if we could compile patterns down to bytecode that can be executed by a word generator virtual machine. It is possible, of course!

You may ask… why? Well, I dunno. ¯\_(ツ)_/¯

Installation

Add this line to your application’s Gemfile:

gem 'koota'

And then execute:

bundle

Or install it yourself as:

gem install koota

Usage

Pattern syntax

A pattern can be anything you want, but there are some special characters.

A pair of parentheses forms an optional block. So if you have hell(o), two words can be generated: “hell” and “hello”. You can also have choices within those: hell(a/e/i/o/u) will generate “hell”, optionally followed by any one English vowel. You can always nest them: (h(e))ll(o) can generate “ll”, “llo”, “hll”, “hllo”, “hell”, and “hello”. Note that “ello”, for example, can’t be generated: (e) sits inside (h(e)), so “e” can only be picked when “h” has been picked too.
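To make the nesting rule concrete, here is a small sketch that lists every word an optional-block pattern can produce. It does not use Koota at all: the array-based pattern model and the expand method are assumptions made up for this illustration, and Koota itself picks at random rather than enumerating.

```ruby
# Hypothetical mini-model, not Koota's internals: a pattern is an Array of
# parts. A String is literal text, and [:opt, parts] is an optional block.
def expand(parts)
  parts.reduce(['']) do |prefixes, part|
    suffixes =
      if part.is_a?(String)
        [part]
      else
        _tag, inner = part
        [''] + expand(inner) # an optional block may also produce nothing
      end
    prefixes.product(suffixes).map(&:join)
  end
end

# (h(e))ll(o) written in the hypothetical model
pattern = [[:opt, ['h', [:opt, ['e']]]], 'll', [:opt, ['o']]]
words = expand(pattern)
puts words.join(' ') # ll llo hll hllo hell hello
```

“ello” never shows up because the inner block is only expanded after “h” has already been emitted.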

A pair of square brackets, like [...], works the same way as parentheses, except that the block is not optional.

Slashes define choices to be picked at random by Koota, so a/b/c/d is a choice between a, b, c, and d. Each alternative between the slashes can be of any length you want. However, you cannot put parentheses or square brackets between slashes, as in a/(b/c)/[d/e]. That’s illegal; you should use subpatterns for that.
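As a rough picture of what a flat choice means, the sketch below splits on slashes and samples one alternative. The pick method is a hypothetical helper for this illustration only; Koota compiles patterns to bytecode rather than splitting strings like this.

```ruby
# Sketch only: split a flat choice such as "a/b/c/d" on slashes and pick one
# alternative at random. Nested groups are deliberately not handled.
def pick(choice, random: Random.new)
  choice.split('/').sample(random: random)
end

puts pick('a/b/c/d')    # one of a, b, c, d
puts pick('str/spr/th') # alternatives may be longer than one character
```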

If a single character corresponds to a subpattern, then it stands for that subpattern. If you have the following .koota file:

C = p/t/k
v = a/i/u

Cv

Then the Cv there is treated as if it were [p/t/k][a/i/u]. To bypass this, use quotes: anything within quotes is taken as-is, so the only word "Cv" can generate is “Cv”.
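The substitution rule can be sketched as a plain text rewrite. The inline_subpatterns method and the Hash of definitions are assumptions for illustration; Koota resolves references while compiling, not by rewriting strings.

```ruby
# Hypothetical helper, not part of Koota: rewrite a pattern so that every
# character naming a subpattern becomes a required [...] group, while text
# inside double quotes is copied through unchanged.
def inline_subpatterns(pattern, definitions)
  pattern.gsub(/"[^"]*"|./m) do |token|
    if token.length > 1 || !definitions.key?(token)
      token # quoted text, or a character with no subpattern
    else
      "[#{definitions[token]}]"
    end
  end
end

definitions = { 'C' => 'p/t/k', 'v' => 'a/i/u' }
puts inline_subpatterns('Cv', definitions)   # [p/t/k][a/i/u]
puts inline_subpatterns('"Cv"', definitions) # "Cv"
```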

My recommendation is that you reserve uppercase characters for subpatterns, and only use lowercase characters for raw characters.

Formal syntax

The syntax in ANTLR format:

grammar Koota;

pattern : group+ ;

group : '(' pattern ')'
      | '[' pattern ']'
      | choice
      ;

choice : atom ('/' atom)* ;

atom : ~[()[\]/"]* // i.e. anything except groups, choices, and quotes
     | STRING
     ;

STRING : '"' .*? '"' ;

Command-line usage

Koota ships with an executable to run your generationings handily. It takes a file as an input:

koota my-patterns.koota

The pattern file is simple:

# Comments with ye olde hash
# Each line is a 'name = pattern' association.
N = m/n
C = p/t/k/b/d/g/s/N # you can refer to patterns
V = a/e/i/o/u

# If a pattern doesn't have a name, it's the root pattern generated by Koota.
# All .koota files need to have this!
(C)V(N)

# Make sure the file is UTF-8 encoded, with no BOM, for the best results.

After that, you’ll get 100 fresh words right out of the generation oven, like this:

$ koota my-patterns.koota
pa ti ken son na hu ...
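For a feel of how little there is to the pattern file format, here is a sketch of a reader for it. The read_koota method is hypothetical, and it assumes single-character names and that a hash sign always starts a comment; Koota's real parser may treat whitespace, quoting, and errors differently.

```ruby
# Sketch of a reader for the pattern file format, not Koota's own parser.
def read_koota(source)
  definitions = {}
  root = nil
  source.each_line do |line|
    line = line.sub(/#.*/, '').strip # drop comments and surrounding spaces
    next if line.empty?

    if (match = line.match(/\A(\S)\s*=\s*(.+)\z/))
      definitions[match[1]] = match[2]
    else
      root = line # a line without "name =" is the root pattern
    end
  end
  raise ArgumentError, 'no root pattern found' if root.nil?

  [definitions, root]
end

source = <<~KOOTA
  # Comments with ye olde hash
  N = m/n
  C = p/t/k/b/d/g/s/N # you can refer to patterns
  V = a/e/i/o/u

  (C)V(N)
KOOTA

definitions, root = read_koota(source)
p definitions
p root
```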

Word amount

You can change the number of words generated with --words (or -w):

$ koota --words=5 my-patterns.koota
pa ti ken son ha
# and it's over

Syllables and syllabification

With the above file, however, they’ll all be single-syllable words. You can change this with a command-line option:

koota -s 3 my-patterns.koota
# or
koota --syllables=3 my-patterns.koota

This will generate exactly 3 syllables per word. If you want to vary the number of syllables per word, use this:

koota --syllables=1,3 my-patterns.koota

This will generate 1 to 3 syllables per word, randomly.

You can also separate syllables automatically with --syllable-separator (or -r):

koota -s 1,3 -r '.' my-patterns.koota

This will generate words like ta.ka, na.po.ke, etc. The separator is empty by default, so syllable boundaries are not marked.
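The two syllable options fit together roughly as in the sketch below. Both methods are hypothetical and stand in for whatever Koota does internally; the block passed to build_word plays the role of one run of the root pattern.

```ruby
# Hypothetical helpers, not Koota's code.
# Parse "3" as exactly three syllables and "1,3" as one to three.
def parse_syllables(value)
  low, high = value.split(',').map { |n| Integer(n) }
  low..(high || low)
end

# Build one word by asking the block for each syllable in turn.
def build_word(range, separator: '', random: Random.new)
  count = random.rand(range)
  Array.new(count) { yield }.join(separator)
end

range = parse_syllables('1,3')
puts build_word(range, separator: '.') { %w[ta ka na po ke].sample }
```

With an empty separator the syllables are simply concatenated, which matches the default behaviour described above.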

Duplicate words

Duplicate words are automatically pruned, so you may get fewer than 100 words. To disable this behaviour, pass the --duplicates (or -d) command-line option.

Word separator

Each word is separated by the separator given in the --word-separator (or -p) option. The default is a space. To output each word on a new line, for example, you could pass --word-separator="\n".
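Duplicate pruning and the word separator amount to a small post-processing step. The format_words method below is a hypothetical stand-in written with plain Ruby arrays, not something Koota exposes.

```ruby
# Sketch of the post-processing, assuming the words arrive as an Array.
def format_words(words, duplicates: false, word_separator: ' ')
  words = words.uniq unless duplicates # pruning can shorten the list
  words.join(word_separator)
end

words = %w[pa ti pa ken ti]
puts format_words(words)                       # pa ti ken
puts format_words(words, duplicates: true)     # pa ti pa ken ti
puts format_words(words, word_separator: "\n") # one word per line
```

This is why a run that asks for 100 words can print fewer: pruning happens after generation.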

Getting the bytecode

You can go mad and get the bytecode with --bytecode. It’ll dump the bytecode on standard output, so best redirect it with something like > file if you don’t want your console to freak out!

After that, you can run directly from bytecode, just by passing the resulting file to koota:

koota my-patterns.koolla

Same thing happens, except you use the bytecode directly. Why would you want to do this? No idea.

Help and other options

To seek more help, use --help (or -h).

API

You can also run this as a library inside some Ruby code, of course. Use Koota::Pattern objects to compile patterns given their references:

require 'koota'

nasals     = Koota::Pattern.new('m/n')
vowels     = Koota::Pattern.new('a/e/i/o/u')
consonants = Koota::Pattern.new('p/t/k/b/d/g/s/N', N: nasals) # the pattern refers to N, so it must be passed in

Then, use a Koota::Generator, passing in the root pattern to #call:

root = Koota::Pattern.new('(C)V(N)', C: consonants, V: vowels, N: nasals)

generator = Koota::Generator.new
generator.call(root) # returns an Array<String> containing the generated words

You can pass many of the same command-line options to Koota::Generator#call:

generator.call(
  root,
  # Option: Default
  words: 100,             # Integer only
  syllables: 1,           # Integer or Range of Integer
  syllable_separator: '', # String only
  duplicates: false       # Boolean only
)

And you can get the bytecode for a generator with #bytecode:

generator.bytecode # returns an array of 8-bit integers
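Since the bytecode comes back as an array of 8-bit integers, it can be turned into a binary string with Ruby's own Array#pack and read back with String#unpack. The byte values below are placeholders, not real Koota bytecode, and the file layout the koota executable expects is not described here, so treat this only as a round-trip sketch.

```ruby
# Placeholder bytes, not real Koota bytecode.
bytecode = [0x01, 0x6B, 0x6F, 0xFF]

# 'C*' packs each Integer as one unsigned 8-bit value.
binary = bytecode.pack('C*')
puts binary.bytesize # 4

# Reading the bytes back yields the original array.
restored = binary.unpack('C*')
p restored == bytecode # true
```

A binary string like this can be written with File.binwrite and read with File.binread, which avoid any newline or encoding conversion.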

Documentation

For more info, see the documentation.

The Koota virtual machine

See VM.md for info on the virtual machine.

Development

After checking out the repo, run bundle to install dependencies. Then, run bundle exec rake to run rubocop followed by the tests, or bundle exec rake spec to run only the tests. You can also run ruby bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/unleashy/koota.

License

The gem is available as open source under the terms of the MIT License.