Hintable Levenshtein¶ ↑
Levenshtein distances but with extra hints. Perhaps adding or deleting a space is not as big as a change as other things, or substituting a ‘c’ for a ‘k’ is again a cheaper operation than just any arbitrary change.
Just an example
english_rules = [ HintableLevenshtein::RuleSet.new(0.3, HintableLevenshtein::Rule.insert(/[\.,!]/)), HintableLevenshtein::RuleSet.new(0.3, HintableLevenshtein::Rule.delete(/[\.,!]/)), HintableLevenshtein::RuleSet.new(0.4, HintableLevenshtein::Rule.substitute('!' => '.')), HintableLevenshtein::RuleSet.new(0.4, HintableLevenshtein::Rule.substitute('!' => ',')), HintableLevenshtein::RuleSet.new(0.75, HintableLevenshtein::Rule.insert(' '), HintableLevenshtein::Rule.insert(' ')), HintableLevenshtein::RuleSet.new(0.5, HintableLevenshtein::Rule.insert(' ')), HintableLevenshtein::RuleSet.new(0.5, HintableLevenshtein::Rule.delete(' ')), HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('z' => 's')), HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('k' => 'c')), HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('u' => 'o')), HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('e' => 'a')), HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('i' => 'y')), HintableLevenshtein::RuleSet.new(1, HintableLevenshtein::Rule.delete), HintableLevenshtein::RuleSet.new(1, HintableLevenshtein::Rule.insert), HintableLevenshtein::RuleSet.new(1, HintableLevenshtein::Rule.substitute) ] a = "hello kitten pizza!!" b = "hello cittin pissssa.." puts "normal levenshtein: #{HintableLevenshtein.new.distance(a, b)}" puts "hinted levenshtein: #{HintableLevenshtein.new(english_rules).distance(a, b)}"
Would output:
normal levenshtein: 11.0 hinted levenshtein: 7.15