☗ Mikon
Mikon is a flexible data structure for Ruby, inspired by R's data.frame and Python's pandas. Its goal is to make it easy to manipulate real data, apply statistical functions to it, and visualize the results in Ruby.
It is compatible with Nyaplot::DataFrame and Statsample::Vector, and most methods that both gems provide can be applied to Mikon's data structures.
Main Features:
Dependencies
- CRuby >= 2.0.0-p451
- NMatrix >= v0.1.0.rc5
- Formatador >= 0.2.5
Optional Dependencies
- Nyaplot: for plotting
- Statsample: for statistical functions
- IRuby: for the interactive manipulation of data
Installation
$ gem install mikon
If you fail to install nmatrix, try:
$ sudo apt-get install libatlas-base-dev
$ sudo apt-get --purge remove liblapack-dev liblapack3 liblapack3gf
$ gem install nmatrix -- --with-opt-include=/usr/include/atlas
More detailed instructions are available for Mac and Linux.
If you fail to install iruby, try this.
Examples
Notebooks created with IRuby:
Usage
Initializing DataFrame
require 'mikon'
df2 = Mikon::DataFrame.new([{a: 1, b: 2}, {a: 2, b: 3}, {a: 3, b: 4}])
Mikon::DataFrame.new({a: [1,2,3,4], b: [2,3,4,5]}, index: [:a, :b, :c, :d])
df = Mikon::DataFrame.from_csv("~/data.csv")
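The two constructor calls above take the same table in two layouts: an array of row hashes and a hash of column arrays. As a rough illustration of how the layouts relate, the plain-Ruby sketch below converts between them. The helpers `rows_to_columns` and `columns_to_rows` are hypothetical names invented for this example and are not part of Mikon's API.

```ruby
# Hypothetical helper (not part of Mikon): turn an array of row hashes
# into a hash that maps each column name to an array of values.
def rows_to_columns(rows)
  rows.each_with_object({}) do |row, columns|
    row.each { |name, value| (columns[name] ||= []) << value }
  end
end

# Hypothetical helper (not part of Mikon): the inverse conversion.
# Assumes every column has the same length.
def columns_to_rows(columns)
  names = columns.keys
  columns.values.transpose.map { |values| names.zip(values).to_h }
end

rows = [{a: 1, b: 2}, {a: 2, b: 3}, {a: 3, b: 4}]
columns = rows_to_columns(rows)

p columns[:a]                       # [1, 2, 3]
p columns_to_rows(columns) == rows  # true
```

Either layout describes the same data, so which one to pass is mostly a matter of how the data arrives.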
Basic data manipulation
df[:value]
df[10..20]
df.head(2)
df.tail(2)
Row-based data manipulation
df.select{value > 100}
df2.map{b+1}.name(:c)
foo = []
df2.each{foo.push(2*a)}
p foo #-> [2,4,6]
df.insert_column(:new_value){value * 2}
df.any?{value >= 100} #-> true
df.all?{value > 1} #-> false
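In the row-based calls above, column names such as `value` appear as bare identifiers inside the block. One common way to get this behaviour in Ruby is to evaluate the block against a per-row object with `instance_eval`. The sketch below shows that technique with plain hashes. It assumes nothing about Mikon's internals: `RowContext` and `select_rows` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for a data frame row: each column name
# can be called as a method with no arguments.
class RowContext
  def initialize(row)
    @row = row
  end

  def method_missing(name, *args)
    return @row[name] if args.empty? && @row.key?(name)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @row.key?(name) || super
  end
end

# Hypothetical helper that mimics a row-based select: the block is
# evaluated once per row, with that row's columns in scope.
def select_rows(rows, &block)
  rows.select { |row| RowContext.new(row).instance_eval(&block) }
end

rows = [{value: 50}, {value: 150}, {value: 300}]
big = select_rows(rows) { value > 100 }
p big.length  # 2
```

In this sketch a misspelled column name falls through to `super` and raises a `NameError`, so typos in a block surface immediately instead of silently returning nil.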
Column-based data manipulation
In most cases, column-based manipulation is faster than row-based manipulation.
df2[:b] - df2[:a]
df.insert_column(:new_value, df[:value]*2)
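The column-based calls above apply one operator to a whole column instead of running a block once per row. The sketch below illustrates that shape of computation with a minimal column type backed by a plain Array. `Column` is a hypothetical class written for this example; it is not Mikon::Series, and the Array stands in for whatever storage Mikon actually uses.

```ruby
# Hypothetical column type backed by a plain Array.
class Column
  attr_reader :values

  def initialize(values)
    @values = values
  end

  # Element-wise difference of two columns of equal length.
  def -(other)
    Column.new(@values.zip(other.values).map { |x, y| x - y })
  end

  # Multiply every element by a scalar.
  def *(scalar)
    Column.new(@values.map { |x| x * scalar })
  end
end

a = Column.new([1, 2, 3])
b = Column.new([2, 3, 4])

diff = b - a
doubled = a * 2
p diff.values     # [1, 1, 1]
p doubled.values  # [2, 4, 6]
```

Because each operator handles the whole column in a single call, a library is free to hand that work to an optimized numeric backend, which is one plausible reason for the speed difference mentioned above.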
Plotting
df[:value].plot
Plotting with Nyaplot
require 'nyaplot'
plot = Nyaplot::Plot.new
plot.add_with_df(df, :histogram, :value)
plot
Statistics with Statsample
Mikon::Series is compatible with Statsample::Vector, so most Statsample methods can be applied to Mikon::Series.
require 'statsample'
Statsample::Analysis.store(Statsample::Test::T) do
  t_2 = Statsample::Test.t_two_samples_independent(df[:value], df[:new_value])
  summary t_2
end
Statsample::Analysis.run_batch
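For readers unfamiliar with the test used above, the sketch below computes the textbook two-sample t statistic with pooled (equal) variance in plain Ruby. It is only an illustration of the formula, not Statsample's implementation, and it has not been checked which variants Statsample reports. The helper names `mean`, `sample_variance`, and `pooled_t_statistic` are hypothetical.

```ruby
# Arithmetic mean as a Float.
def mean(xs)
  xs.inject(0.0) { |acc, x| acc + x } / xs.size
end

# Unbiased sample variance (divides by n - 1).
def sample_variance(xs)
  m = mean(xs)
  xs.inject(0.0) { |acc, x| acc + (x - m)**2 } / (xs.size - 1)
end

# Two-sample t statistic assuming both samples share one variance.
def pooled_t_statistic(xs, ys)
  n1 = xs.size
  n2 = ys.size
  pooled_var = ((n1 - 1) * sample_variance(xs) +
                (n2 - 1) * sample_variance(ys)) / (n1 + n2 - 2)
  (mean(xs) - mean(ys)) / Math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
end

value     = [1, 2, 3, 4, 5]
new_value = value.map { |v| v * 2 }  # mirrors the value * 2 column above

t = pooled_t_statistic(value, new_value)
puts t.round(4)  # -1.8974
```

The statistic alone does not give a p-value; that requires the t distribution with n1 + n2 - 2 degrees of freedom, which a statistics library provides.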
License
MIT License
Acknowledgement
Ruby Association Grant 2014 has been earmarked for the development of Mikon.
Contributing
- Fork it ( http://github.com/domitry/mikon/fork )
- Create your feature branch ( git checkout -b my-new-feature )
- Commit your changes ( git commit -am 'Add some feature' )
- Run tests by running rspec on /path_to_gem/mikon/
- Push to the branch ( git push origin my-new-feature )
- Create new Pull Request