Numo::Libsvm
Numo::Libsvm is a Ruby gem binding to the LIBSVM library. LIBSVM is a widely used implementation of Support Vector Machines that provides support vector classification, regression, and distribution estimation. Numo::Libsvm lets you call the LIBSVM functions on datasets represented as Numo::NArray.
Note: There are other useful Ruby gems binding to LIBSVM: rb-libsvm by C. Florian Ebeling, libsvm-ruby-swig by Tom Zeng, and jrb-libsvm by Andreas Eger.
Installation
Numo::Libsvm bundles LIBSVM. There is no need to install LIBSVM in advance.
Add this line to your application's Gemfile:
gem 'numo-libsvm'
And then execute:
$ bundle
Or install it yourself as:
$ gem install numo-libsvm
Usage
Preparation
The following examples use red-datasets to download the datasets.
$ gem install red-datasets-numo-narray
Example 1. Cross-validation
We run cross-validation of a support vector classifier on the Iris dataset.
require 'numo/narray'
require 'numo/libsvm'
require 'datasets-numo-narray'
# Download Iris dataset.
puts 'Download dataset.'
iris = Datasets::LIBSVM.new('iris').to_narray
x = iris[true, 1..-1]
y = iris[true, 0]
# Define parameters of C-SVC with RBF Kernel.
param = {
  svm_type: Numo::Libsvm::SvmType::C_SVC,
  kernel_type: Numo::Libsvm::KernelType::RBF,
  gamma: 1.0,
  C: 1
}
# Perform 5-fold cross-validation.
puts 'Perform cross validation.'
n_folds = 5
predicted = Numo::Libsvm.cv(x, y, param, n_folds)
# Print mean accuracy.
mean_accuracy = y.eq(predicted).count.fdiv(y.size)
puts "Accuracy: %.1f %%" % (100 * mean_accuracy)
Running the script produces the following output:
Download dataset.
Perform cross validation.
Accuracy: 96.0 %
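To make the bookkeeping behind this result concrete, here is a minimal plain-Ruby sketch of how samples can be divided into folds and how the accuracy figure is computed from predicted labels. The helpers `kfold_indices` and `accuracy` are hypothetical names introduced for illustration and are not part of Numo::Libsvm. The fold assignment is a simple shuffled round-robin, which is an assumption and not necessarily how LIBSVM assigns folds internally.

```ruby
# Plain-Ruby sketch of k-fold bookkeeping; no gems required.
# kfold_indices and accuracy are hypothetical helpers, not Numo::Libsvm API.

# Split sample indices 0...n_samples into n_folds test folds of nearly equal size.
def kfold_indices(n_samples, n_folds, seed: 1)
  shuffled = (0...n_samples).to_a.shuffle(random: Random.new(seed))
  folds = Array.new(n_folds) { [] }
  shuffled.each_with_index { |idx, i| folds[i % n_folds] << idx }
  folds
end

# Fraction of predictions that equal the true labels.
def accuracy(y_true, y_pred)
  hits = y_true.zip(y_pred).count { |t, p| t == p }
  hits.fdiv(y_true.size)
end

# 150 samples (the size of the Iris dataset) and 5 folds.
folds = kfold_indices(150, 5)
folds.each_with_index do |test_idx, k|
  train_idx = (0...150).to_a - test_idx
  puts "fold #{k}: #{train_idx.size} training / #{test_idx.size} test samples"
end

# Toy labels: 5 of the 6 predictions are correct.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 2, 3, 3, 3]
puts format('Accuracy: %.1f %%', 100 * accuracy(y_true, y_pred))
```

Each sample lands in exactly one test fold, so every label in the dataset receives one prediction from a model that did not see it during training. The accuracy above is the plain-Array counterpart of `y.eq(predicted).count.fdiv(y.size)`.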
Example 2. Pendigits dataset classification
We first train a support vector classifier with an RBF kernel on the pendigits training dataset.
require 'numo/narray'
require 'numo/libsvm'
require 'datasets-numo-narray'
# Download pendigits training dataset.
puts 'Download dataset.'
pendigits = Datasets::LIBSVM.new('pendigits').to_narray
x = pendigits[true, 1..-1]
y = pendigits[true, 0]
# Define parameters of C-SVC with RBF Kernel.
param = {
  svm_type: Numo::Libsvm::SvmType::C_SVC,
  kernel_type: Numo::Libsvm::KernelType::RBF,
  gamma: 0.0001,
  C: 10,
  shrinking: true
}
# Perform training procedure.
puts 'Train support vector machine.'
model = Numo::Libsvm.train(x, y, param)
# Save parameters and trained model.
puts 'Save parameters and model with Marshal.'
File.open('pendigits.dat', 'wb') { |f| f.write(Marshal.dump([param, model])) }
$ ruby train.rb
Download dataset.
Train support vector machine.
Save parameters and model with Marshal.
We then predict the labels of the testing dataset and evaluate the classifier.
require 'numo/narray'
require 'numo/libsvm'
require 'datasets-numo-narray'
# Download pendigits testing dataset.
puts 'Download dataset.'
pendigits_test = Datasets::LIBSVM.new('pendigits', note: 'testing').to_narray
x = pendigits_test[true, 1..-1]
y = pendigits_test[true, 0]
# Load parameter and model.
puts 'Load parameter and model.'
param, model = Marshal.load(File.binread('pendigits.dat'))
# Predict labels.
puts 'Predict labels.'
predicted = Numo::Libsvm.predict(x, param, model)
# Evaluate classification results.
mean_accuracy = y.eq(predicted).count.fdiv(y.size)
puts "Accuracy: %.1f %%" % (100 * mean_accuracy)
$ ruby test.rb
Download dataset.
Load parameter and model.
Predict labels.
Accuracy: 98.3 %
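The train and test scripts hand the parameters and the model to each other through a Marshal file. The sketch below shows that round trip in isolation, using plain Ruby objects in place of a trained model. The `model` hash and its keys are hypothetical placeholders and do not describe the real layout of the value returned by `Numo::Libsvm.train`; the temporary file name is also an assumption.

```ruby
require 'tmpdir'

# Stand-in values: in the examples above, model comes from Numo::Libsvm.train.
param = { gamma: 0.0001, C: 10, shrinking: true }
model = { nr_class: 10, rho: [0.1, -0.2] } # hypothetical placeholder contents

# Write both objects to one binary file, as train.rb does.
path = File.join(Dir.tmpdir, 'marshal_sketch.dat')
File.binwrite(path, Marshal.dump([param, model]))

# Read them back, as test.rb does.
loaded_param, loaded_model = Marshal.load(File.binread(path))
puts loaded_param == param
puts loaded_model == model

# Remove the temporary file.
File.delete(path)
```

Because `Marshal.load` can instantiate arbitrary Ruby objects, it should only be used on files that come from a trusted source.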
Note
In Numo::Libsvm, the SVM hyperparameters are given as a Ruby Hash. The hash keys and their meanings match the members of the struct svm_parameter in LIBSVM, which is described in detail in the LIBSVM README.
param = {
  svm_type:                    # [Integer] Type of SVM
    Numo::Libsvm::SvmType::C_SVC,
  kernel_type:                 # [Integer] Type of kernel function
    Numo::Libsvm::KernelType::RBF,
  degree: 3,                   # [Integer] Degree in polynomial kernel function
  gamma: 0.5,                  # [Float] Gamma in poly/rbf/sigmoid kernel function
  coef0: 1.0,                  # [Float] Coefficient in poly/sigmoid kernel function
  # for training procedure
  cache_size: 100,             # [Float] Cache memory size in MB
  eps: 1e-3,                   # [Float] Tolerance of termination criterion
  C: 1.0,                      # [Float] Parameter C of C-SVC, epsilon-SVR, and nu-SVR
  nr_weight: 3,                # [Integer] Number of weights for C-SVC
  weight_label:                # [Numo::Int32] Labels to add weight in C-SVC
    Numo::Int32[0, 1, 2],
  weight:                      # [Numo::DFloat] Weight values in C-SVC
    Numo::DFloat[0.4, 0.4, 0.2],
  nu: 0.5,                     # [Float] Parameter nu of nu-SVC, one-class SVM, and nu-SVR
  p: 0.1,                      # [Float] Parameter epsilon in loss function of epsilon-SVR
  shrinking: true,             # [Boolean] Whether to use the shrinking heuristics
  probability: false,          # [Boolean] Whether to train an SVC or SVR model for probability estimates
  verbose: false,              # [Boolean] Whether to output learning process message
  random_seed: 1               # [Integer/Nil] Random seed
}
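To illustrate what degree, gamma, and coef0 control, the following plain-Ruby sketch evaluates the polynomial, RBF, and sigmoid kernels using the formulas given in the LIBSVM README. The function names are hypothetical and written only for illustration; LIBSVM computes these kernels internally in C++, and Numo::Libsvm does not expose them under these names.

```ruby
# Plain-Ruby kernel formulas as documented in the LIBSVM README.
# u and v are Arrays of Floats with the same length.
# All function names here are hypothetical helpers, not Numo::Libsvm API.

# Inner product u'v.
def dot(u, v)
  u.zip(v).sum { |a, b| a * b }
end

# RBF kernel: exp(-gamma * |u - v|^2)
def rbf_kernel(u, v, gamma:)
  sq_dist = u.zip(v).sum { |a, b| (a - b)**2 }
  Math.exp(-gamma * sq_dist)
end

# Polynomial kernel: (gamma * u'v + coef0)^degree
def poly_kernel(u, v, gamma:, coef0:, degree:)
  (gamma * dot(u, v) + coef0)**degree
end

# Sigmoid kernel: tanh(gamma * u'v + coef0)
def sigmoid_kernel(u, v, gamma:, coef0:)
  Math.tanh(gamma * dot(u, v) + coef0)
end

u = [1.0, 2.0]
v = [2.0, 0.0]

# |u - v|^2 = 5 and u'v = 2 for these vectors.
puts rbf_kernel(u, v, gamma: 0.5)                         # exp(-2.5)
puts poly_kernel(u, v, gamma: 0.5, coef0: 1.0, degree: 3) # (0.5 * 2 + 1)^3
puts sigmoid_kernel(u, v, gamma: 0.5, coef0: 1.0)         # tanh(2.0)
```

A larger gamma makes the RBF kernel fall off faster with distance, so each training sample influences a smaller neighborhood. This is why gamma is usually tuned together with C.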
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/yoshoku/numo-libsvm. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
License
The gem is available as open source under the terms of the BSD-3-Clause License.