Benchmark Lab

Run real experiments and calculate non-parametric statistics.

Requirements

Ruby 2.1 or later is required.

Installation

Install it yourself as:

$ gem install benchmark-lab

Usage

There are two ways to use it:

  1. classic: as Benchmark.bm does
  2. iterative: collect and measure each case separately, store the results in separate JSON files, then put everything together and rank the cases
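
For the iterative approach, the file handling could be sketched as follows. This is a minimal sketch that assumes each measurement is available as a JSON string; `save_result` and `load_results` are hypothetical helper names and are not part of benchmark-lab.

```ruby
require 'json'
require 'fileutils'

# Hypothetical helpers, not part of benchmark-lab.

# Writes one JSON string to <dir>/<label>.json and returns the file path.
def save_result(dir, label, json)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, "#{label}.json")
  File.write(path, json)
  path
end

# Reads every *.json file in dir and returns the parsed contents,
# ordered by file name.
def load_results(dir)
  Dir.glob(File.join(dir, '*.json')).sort.map do |path|
    JSON.parse(File.read(path))
  end
end
```

Each string returned by Benchmark.observe_and_summarize could be passed to `save_result`, possibly in separate runs, and the array returned by `load_results` could then be handed to Benchmark.aggregate_and_rank, which takes parsed JSON in the Iterative Usage example.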

Classic Usage

require 'benchmark/lab'

n = 5_000_000
cases = {
  'for:'    => proc { for i in 1..n; a = "1"; end },
  'times:'  => proc { n.times do   ; a = "1"; end },
  'upto:'   => proc { 1.upto(n) do ; a = "1"; end }
}

# Number of times each case is run (the sample size).
# 20 is a good minimum.
nbr_of_samples = 20

Benchmark.experiment(nbr_of_samples) do |x|
  cases.each { |label, blk| x.report(label, &blk) }
end

The output looks like the following:

          user                system              total               real
for:      [0.77,0.77,0.78]    [0.00,0.00,0.00]    [0.77,0.77,0.78]    [0.77,0.77,0.78]
times:    [0.74,0.74,0.74]    [0.00,0.00,0.00]    [0.74,0.74,0.74]    [0.74,0.74,0.74]
upto:     [0.75,0.75,0.75]    [0.00,0.00,0.00]    [0.75,0.75,0.75]    [0.75,0.75,0.75]
The best "times:" is significantly (95%) better (total time).
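
The last line of the output states that the best case is significantly better at the 95% level. This README does not say which test produces that verdict. One common non-parametric choice for comparing two independent samples is the Mann-Whitney U test, and the sketch below shows it under that assumption. The names `average_ranks` and `mann_whitney_u` are hypothetical and are not part of benchmark-lab. The sketch uses the normal approximation without tie or continuity correction.

```ruby
# Hypothetical helpers, not part of benchmark-lab.

# Assigns 1-based ranks; tied values share the average of their ranks.
def average_ranks(values)
  order = values.each_index.sort_by { |i| values[i] }
  ranks = Array.new(values.size)
  pos = 0
  while pos < order.size
    stop = pos
    # Extend the run while the next value equals the current one.
    stop += 1 while stop + 1 < order.size && values[order[stop + 1]] == values[order[pos]]
    avg = (pos + stop) / 2.0 + 1
    (pos..stop).each { |k| ranks[order[k]] = avg }
    pos = stop + 1
  end
  ranks
end

# Two-sided Mann-Whitney U test using the normal approximation.
def mann_whitney_u(a, b, alpha = 0.05)
  raise ArgumentError, 'both samples must be non-empty' if a.empty? || b.empty?

  n1 = a.size
  n2 = b.size
  ranks = average_ranks(a + b)
  r1 = ranks.first(n1).reduce(:+)          # rank sum of the first sample
  u1 = r1 - n1 * (n1 + 1) / 2.0
  u2 = n1 * n2 - u1
  u = [u1, u2].min
  mean = n1 * n2 / 2.0
  sd = Math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
  z = (u - mean) / sd
  p_value = Math.erfc(z.abs / Math.sqrt(2)) # equals 2 * Phi(-|z|)
  { u: u, z: z, p_value: p_value, h0_rejected: p_value < alpha }
end
```

With two samples of 20 timings each, `mann_whitney_u(times_a, times_b)[:h0_rejected]` would indicate whether the difference is significant at the 5% level under these assumptions.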

Iterative Usage

require 'json'
require 'benchmark/lab'

n = 5_000_000

# Number of times each case is run (the sample size).
# 20 is a good minimum.
nbr_of_samples = 20

jsons = []

jsons << Benchmark.observe_and_summarize(nbr_of_samples) do |x|
  x.report('for') { for i in 1..n; a = "1"; end }
end

jsons << Benchmark.observe_and_summarize(nbr_of_samples) do |x|
  x.report('times') { n.times do   ; a = "1"; end }
end

jsons << Benchmark.observe_and_summarize(nbr_of_samples) do |x|
  x.report('upto') { 1.upto(n) do ; a = "1"; end }
end

best, is_h0_rejected = Benchmark.aggregate_and_rank(jsons.map { |json| JSON.parse(json) })

puts best
puts is_h0_rejected

The output looks like the following:

{"name"=>"total", "sample"=>[0.6899999999999977, 0.6899999999999977, 0.6899999999999977, 0.6899999999999977, 0.6900000000000013, 0.6900000000000048, 0.6900000000000048, 0.6999999999999957, 0.6999999999999957, 0.6999999999999957, 0.6999999999999957, 0.6999999999999957, 0.6999999999999993, 0.6999999999999993, 0.7000000000000028, 0.7000000000000028, 0.7000000000000028, 0.7000000000000028, 0.7000000000000028, 0.7000000000000028], "sample_size"=>20, "minimum"=>0.6899999999999977, "maximum"=>0.7000000000000028, "first_quartile"=>0.690000000000003, "third_quartile"=>0.7000000000000028, "median"=>0.6999999999999957, "interquartile_range"=>0.009999999999999787, "label"=>"upto"}
true
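
The hash above carries a non-parametric summary of the sample: minimum, maximum, quartiles, median, and interquartile range. A minimal sketch of how such a summary can be computed follows. `median_of_sorted` and `summarize` are hypothetical names, not part of benchmark-lab, and the quartile convention (the median of each half of the sorted sample) is an assumption that appears consistent with the values shown; other conventions give slightly different quartiles.

```ruby
# Hypothetical helpers, not part of benchmark-lab.

# Median of an already sorted, non-empty array.
def median_of_sorted(sorted)
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Builds a summary with the same keys as the hash shown above.
def summarize(sample)
  raise ArgumentError, 'need at least two observations' if sample.size < 2

  sorted = sample.sort
  half = sorted.size / 2                      # middle element is excluded for odd sizes
  q1 = median_of_sorted(sorted.first(half))   # median of the lower half
  q3 = median_of_sorted(sorted.last(half))    # median of the upper half
  {
    'sample_size' => sorted.size,
    'minimum' => sorted.first,
    'maximum' => sorted.last,
    'first_quartile' => q1,
    'third_quartile' => q3,
    'median' => median_of_sorted(sorted),
    'interquartile_range' => q3 - q1
  }
end
```

Medians and quartiles are less sensitive to occasional outliers (for example a run disturbed by garbage collection) than a mean and standard deviation would be.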

Ideas

  • compare two different implementations of the same function
    1. get the stats, then compare
    2. use git (commit, branch)
    3. use tests to check for performance regressions at the same time
    4. annotate the tests you want to check
  • decide the sample size automatically (based on the statistical power you want to reach)
  • explain properly why benchmarks should be run this way

Contributing

  1. Fork it ( https://github.com/toch/benchmark-lab/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request