ID3 decision trees for machine learning in Ruby
baobab

An implementation of the ID3 (Iterative Dichotomiser 3) decision tree algorithm in Ruby.

Installation

Use RubyGems:

gem install baobab

How to run the tests

rake tests

Usage

Load baobab:

require 'baobab' # if installed as a gem
load 'baobab.rb' # if using a copy of the repo

Create your dataset from a JSON file. The file should contain a list of objects, each mapping variable names to values as key-value pairs. See the examples in the sample tests.
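
As a rough illustration of that shape, the sketch below builds a few rows as Ruby hashes and writes them out as JSON using only the standard library. The file name `my_weather.json` and the three rows are made up for this example; only the variable names are taken from the weather example in this README.

```ruby
require 'json'
require 'tmpdir'

# Illustrative rows: each object maps variable names to values.
rows = [
  { 'outlook' => 'sunny',    'humidity' => 'high',   'windy' => 'FALSE', 'play' => 'no'  },
  { 'outlook' => 'overcast', 'humidity' => 'high',   'windy' => 'FALSE', 'play' => 'yes' },
  { 'outlook' => 'rainy',    'humidity' => 'normal', 'windy' => 'TRUE',  'play' => 'no'  }
]

# Hypothetical output path; any location would do.
path = File.join(Dir.tmpdir, 'my_weather.json')
File.write(path, JSON.pretty_generate(rows))

# Reading it back gives a list of key-value objects again.
parsed = JSON.parse(File.read(path))
puts parsed.first['outlook']
```

Every object carries the same keys, including the class variable (`play` here).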

# A dataset on whether or not we go out to play depending on the outlook,
# humidity, and if it's windy
dataset = Dataset::from_json('test/weather.json')

Create a tree based on the dataset and provide the name of the class variable.

tree = DecisionTree.new dataset, 'play'

Check your decision tree:

puts tree.to_s
# ROOT (0.94) # the number in parentheses is Shannon's entropy
#   outlook => rainy (0.694)
#     windy => TRUE (0.0)
#       play => no (0.0)
#     windy => FALSE (0.0)
#       play => yes (0.0)
#   outlook => overcast (0.694)
#     play => yes (0.0)
#   outlook => sunny (0.694)
#     humidity => normal (0.0)
#       play => yes (0.0)
#     humidity => high (0.0)
#       play => no (0.0)
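
To see where figures like 0.94 and 0.694 can come from, here is a minimal sketch of the entropy arithmetic. It does not use baobab. The class counts are an assumption: they are the usual counts for the 14-row Weka weather data (9 "yes" and 5 "no" overall), which may differ from the adapted file in this repo. Under that assumption, the value shown on the outlook branches appears to match the weighted average entropy after splitting on outlook.

```ruby
# Shannon entropy (in bits) of a list of class counts.
def entropy(counts)
  total = counts.sum.to_f
  counts.reject(&:zero?).sum do |count|
    prob = count / total
    -prob * Math.log2(prob)
  end
end

# Assumed [yes, no] counts for the standard weather data.
root = [9, 5]
by_outlook = { 'sunny' => [2, 3], 'overcast' => [4, 0], 'rainy' => [3, 2] }

root_entropy = entropy(root)

# Weighted average entropy of the subsets produced by splitting on outlook.
total = root.sum.to_f
split_entropy = by_outlook.values.sum { |c| (c.sum / total) * entropy(c) }

# Information gain: how much the split reduces the entropy.
gain = root_entropy - split_entropy

puts root_entropy.round(3)  # 0.94
puts split_entropy.round(3) # 0.694
puts gain.round(3)          # 0.247
```

ID3 picks, at each node, the variable whose split yields the largest information gain.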

You can make queries to your tree:

tree.query({"windy" => "TRUE", "outlook" => "rainy", "humidity" => "high"})
# no <-- we won't go out to play :(
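
Conceptually, a query walks from the root down the branch matching each value until it reaches a class label. The sketch below shows that idea on a hand-written nested hash mirroring the tree printed above. The `TREE` structure and the `classify` helper are hypothetical names for this illustration and are not baobab's internal representation or API.

```ruby
# Hypothetical nested-hash form of the printed tree (not baobab's internals).
TREE = {
  variable: 'outlook',
  branches: {
    'rainy'    => { variable: 'windy', branches: { 'TRUE' => 'no', 'FALSE' => 'yes' } },
    'overcast' => 'yes',
    'sunny'    => { variable: 'humidity', branches: { 'normal' => 'yes', 'high' => 'no' } }
  }
}.freeze

# Follow the branch that matches the queried value at each node.
def classify(node, query)
  while node.is_a?(Hash)
    node = node[:branches][query[node[:variable]]]
  end
  node # a class label (String), or nil when no branch matched
end

answer = classify(TREE, { 'windy' => 'TRUE', 'outlook' => 'rainy', 'humidity' => 'high' })
puts answer # no
```

In this sketch, a value that does not match any branch label (for example a `windy` value other than `TRUE` or `FALSE`) yields `nil`.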

Fun!

Sources of the datasets

The weather dataset has been adapted from the weather.nominal.arff that comes shipped with Weka.

The transportation dataset was taken from the example data in https://www.youtube.com/watch?v=wL9aogTuZw8.

The breast cancer dataset is adapted from the breast-cancer.arff file that comes shipped with Weka (adapted so it doesn't have unknown values). It should be attributed to:

Matjaz Zwitter & Milan Soklic (physicians), Institute of Oncology, University Medical Center, Ljubljana, Yugoslavia. Donors: Ming Tan and Jeff Schlimmer (Jeffrey.Schlimmer@a.gp.cs.cmu.edu). Date: 11 July 1988.