Classy Hash
Keep Your Hashes Classy (a lightweight Hash validation gem)
Classy Hash is a lightweight RubyGem for validating Ruby hashes against a simple schema Hash that indicates what data types are expected. Classy Hash will make sure your data matches, providing helpful error messages if it doesn't.
Classy Hash is fantastic for helping developers become familiar with an API, by letting them know exactly what they did wrong. It also guards against mistakes by verifying that incoming data meets expectations, and can serve as a convenient data structure documentation format.
Why Classy Hash?
Classy Hash was created as a lightweight alternative to the other good validation gems available. By taking advantage of built-in Ruby language features, Classy Hash can validate common Hashes much faster than some of the other gems we tested, with a dead simple schema syntax.
Classy Hash doesn't modify your Hashes or patch any core classes, so it's safe to use just about anywhere.
Classy Hash is thoroughly tested (see the Testing section below).
Finally, Classy Hash is fast:
Valid hashes:
Serializer | Validator | Ops | Ops/sec | Alloc/op | Ops/GC
-----------------+--------------------------+----------+------------+------------+-----------
msgpack | no_op | 200000 | 109200.1 | 28.0 | 1886.8
msgpack | classy_hash_no_raise | 200000 | 45786.7 | 32.0 | 1666.7
msgpack | classy_hash_full | 200000 | 45744.4 | 32.0 | 1666.7
msgpack | classy_hash_errors_array | 200000 | 45577.6 | 31.0 | 1724.1
msgpack | classy_hash | 200000 | 45421.5 | 31.0 | 1724.1
msgpack | classy_hash_strict | 200000 | 37348.8 | 43.0 | 1257.9
msgpack | classy_hash_full_strict | 200000 | 37024.5 | 44.0 | 1234.6
msgpack | hash_validator | 100000 | 22241.5 | 73.0 | 757.6
msgpack | schema_hash | 50000 | 18942.2 | 118.0 | 463.0
msgpack | json_schema | 8000 | 1207.1 | 1000.1 | 56.3
msgpack | json_schema_strict | 8000 | 1195.6 | 1009.0 | 55.9
msgpack | json_schema_full | 8000 | 1189.5 | 1013.0 | 55.6
Invalid hashes:
Serializer | Validator | Ops | Ops/sec | Alloc/op | Ops/GC
-----------------+--------------------------+----------+------------+------------+-----------
msgpack | classy_hash | 500000 | 55524.3 | 28.2 | 1865.7
msgpack | classy_hash_no_raise | 500000 | 54028.7 | 30.6 | 1730.1
msgpack | classy_hash_strict | 500000 | 45824.1 | 33.8 | 1577.3
msgpack | classy_hash_errors_array | 500000 | 40601.2 | 34.6 | 1533.7
msgpack | classy_hash_full | 500000 | 39895.5 | 35.4 | 1506.0
msgpack | classy_hash_full_strict | 500000 | 35024.4 | 41.0 | 1312.3
msgpack | hash_validator | 250000 | 20873.9 | 69.0 | 793.7
msgpack | json_schema_strict | 20000 | 1399.6 | 887.4 | 63.3
msgpack | json_schema | 20000 | 1398.0 | 891.6 | 63.1
msgpack | json_schema_full | 20000 | 1281.2 | 956.4 | 58.8
Examples
A Classy Hash schema can be as simple or as complex as you like. At the most basic level, you list each key your Hash is required to contain, with the expected Ruby class of the value.
For more examples, see benchmark.rb
and lib/spec/classy_hash_spec.rb
. For
complete documentation of all parameters, see lib/classy_hash.rb
.
Simple example
Let's look at a simple schema for a Hash with three members:
schema = {
key1: String,
key2: Integer,
key3: TrueClass
}
This specifies a Hash with a String
, an Integer
, and a boolean value (both
TrueClass
and FalseClass
will accept true
and false
). Here's how we
validate a Hash against our schema:
hash = {
key1: 'A Hash with class',
key2: 0,
key3: false
}
ClassyHash.validate(hash, schema) # Returns true
Here's what happens if we try to validate an invalid Hash:
hash = {
key1: 'A less classy Hash',
key2: 1.25,
key3: 'Also wrong, but not checked'
}
ClassyHash.validate(hash, schema) # Raises ":key2 is not a/an Integer"
The validate
method will raise an exception if validation fails (this can be
changed by passing raise_errors: false
; see below). Validation proceeds
until the first invalid value is found, then ClassyHash::SchemaViolationError
is thrown for that value. Later values are not checked unless you run a full
validation with full: true
.
Controlling validation
The ClassyHash.validate
method accepts several named parameters for
controlling validation. For complete details, see lib/classy_hash.rb
.
Strict validation
You can pass strict: true
as a keyword argument to validate
to raise an
error if the input hash contains any members not specified in the schema.
Passing verbose: true
will include the names of the unexpected hash keys in
the generated error message (a potential security risk in some settings). See
the inline documentation in the source code for more details. As of version
0.2.0, all nested schemas will also be checked for unexpected members.
Example:
# Raises "Top level is not valid: contains members not specified in schema"
ClassyHash.validate({a: 1, b: 2}, {c: Integer}, strict: true)
# Raises "Top level is not valid: contains members :a, :b not specified in schema"
ClassyHash.validate({a: 1, b: 2}, {c: Integer}, strict: true, verbose: true)
# Raises ":a is not valid: contains members :b, :c not specified in schema"
ClassyHash.validate({a: {b: 1, c: 2}}, {a: {a: Integer}}, strict: true, verbose: true)
Full validation
If you'd like to capture all errors, you can pass full: true
. If you don't
also pass raise_errors: false
, full validation will simply raise an error
that includes all the violations in the message:
schema = {
key1: String,
key2: Integer,
key3: TrueClass
}
hash = {
key1: 'A less classy Hash',
key2: 1.25,
key3: 'Also wrong'
}
begin
ClassyHash.validate(hash, schema, full: true) # Raises ":key2 is not a/an Integer, :key3 is not true or false"
rescue => e
puts e.message
# Individual errors are in the .entries array from the exception, just like
# the :errors option described below.
puts e.entries.inspect
end
Errors array, exceptionless validation
If you pass an empty array into :errors
, your application code can handle the
validation errors directly. If you pass raise_errors: false
, .validate
will return false
for failed validations. Only the first error will be added
to the :errors
array, unless you pass full: true
. These options can be
used independently.
# Using schema and hash from the previous example
errors = []
ClassyHash.validate(hash, schema, errors: errors, raise_errors: false, full: true) # Returns false
# Now, errors is [{full_path: ":key2", message: "a/an Integer"}, {full_path: ":key3", message: "true or false"}]
Whether you use exceptions or a false
return (with or without an :errors
array) is up to your preferences. Note that if you use raise_errors: false
,
there is no way to obtain error messages without passing an :errors
array.
Multiple choice constraints
It's possible to specify more than one option for a key, allowing multiple
types and/or nil
to be used as the value for a key:
schema = {
key1: [ NilClass, String, FalseClass ]
}
ClassyHash.validate({ key1: nil }, schema) # Returns true
ClassyHash.validate({ key1: 'Hi' }, schema) # Returns true
ClassyHash.validate({ key1: true }, schema) # Returns true
ClassyHash.validate({ key1: 1337 }, schema) # Raises ":key1 is not one of a/an NilClass, a/an String, true or false"
Optional keys
Classy Hash will raise an error if a key from the schema is missing from the hash being validated:
schema = {
key1: TrueClass
}
ClassyHash.validate({}, schema) # Raises ":key1 is not present"
If we want to allow a key to be omitted, we can mark it as optional by adding
the :optional
symbol as the first element of a multiple choice array:
schema = {
key1: [:optional, TrueClass]
}
ClassyHash.validate({}, schema) # Returns true
Regular expressions
Regexp
constraints, added in 0.1.4, will require values to be String
s that
match a regular expression:
schema = {
key1: /Re.*quired/i
}
ClassyHash.validate({ key1: /required/ }, schema) # Raises ":key1 is not a String matching /Re.*quired/i"
ClassyHash.validate({ key1: 'invalid' }, schema) # Raises ":key1 is not a String matching /Re.*quired/i"
ClassyHash.validate({ key1: 'The regional manager inquired about ClassyHash' }, schema) # Returns true
As with Ruby's =~
operator, Regexp
s can match anywhere in the String
. To
require the entire String
to match, use the standard \A
and \z
anchors:
schema = {
key1: /\AStart.*end\z/
}
ClassyHash.validate({ key1: 'One must Start to end' }, schema) # Raises ":key1 is not a String matching /\\AStart.*end\\z/"
ClassyHash.validate({ key1: 'Start now, continue to the end' }, schema) # Returns true
Ranges
If you want to check more than just the type of a value, you can specify a
Range
as a constraint. If your Range
endpoints are Integer
s, Numeric
s,
or String
s, then Classy Hash will also restrict the type of the value to
Integer
, Numeric
, or String
.
# An Integer range
schema = {
key1: 1..10
}
ClassyHash.validate({ key1: 5 }, schema) # Returns true
ClassyHash.validate({ key1: -5 }, schema) # Raises ":key1 is not an Integer in range 1..10"
ClassyHash.validate({ key1: 2.5 }, schema) # Raises ":key1 is not an Integer in range 1..10"
# A more interesting range -- this use is not recommended :-)
schema = {
key1: [1]..[5]
}
ClassyHash.validate({ key1: [2, 3, 4] }, schema) # Returns true
ClassyHash.validate({ key1: [5, 0] }, schema) # Raises ":key1 is not in range [1]..[5]"
Procs
If nothing else will do, you can pass a Proc
/lambda
. The Proc
should
have no side effects, as it may be called more than once per value. When using
a Proc
, you should accept exactly one parameter and return true
if
validation succeeds. Any other value will be treated as a validation failure.
If the Proc
returns a String
, that string will be used in the error
message.
# A lambda without an error message
schema = {
key1: ->(v){ v.is_a?(Integer) && v.odd? }
}
ClassyHash.validate({ key1: 1 }, schema) # Returns true
ClassyHash.validate({ key1: 2 }, schema) # Raises ":key1 is not accepted by Proc"
# A lambda with an error message
schema = {
key1: ->(v){ (v.is_a?(Integer) && v.odd?) || 'an odd integer' }
}
ClassyHash.validate({ key1: 1 }, schema) # Returns true
ClassyHash.validate({ key1: 2 }, schema) # Raises ":key1 is not an odd integer"
Sets
Added in version 0.2.0, Set
s constrain a value to one of a list of values.
The Set
constraint replaces the enum
generator. Note that a Set
requires
an exact value match, unlike the Multiple Choice constraint or Composite
generator.
schema = {
key1: Set.new([1, 2, 3, 'see?'])
}
ClassyHash.validate({ key1: 1 }, schema) # Returns true
ClassyHash.validate({ key1: 'see?' }, schema) # Returns true
ClassyHash.validate({ key1: 4 }, schema) # Raises ':key1 is not an element of [1, 2, 3, "see?"]'
Nested schemas
Classy Hash accepts nested schemas. You can also use a schema as one of the options in a multiple choice key.
schema = {
key1: {
msg: String
},
key2: {
n1: [Integer, { y: Numeric }]
}
}
hash1 = {
key1: { msg: 'Valid' },
key2: { n1: { y: 1.0 } }
}
hash2 = {
key1: { msg: 'Also valid' },
key2: { n1: -1 }
}
hash3 = {
key1: { msg: false },
key2: { n1: 1 }
}
hash4 = {
key1: { msg: 'Not valid' },
key2: { n1: { y: false } }
}
ClassyHash.validate(hash1, schema) # Returns true
ClassyHash.validate(hash2, schema) # Returns true
ClassyHash.validate(hash3, schema) # Raises ":key1[:msg] is not a/an String"
ClassyHash.validate(hash4, schema) # Raises ":key2[:n1][:y] is not a/an Numeric, :key2[:n1] is not one of a/an Integer, a Hash matching {schema with keys [:y]}"
ClassyHash.validate({ key1: false }, schema) # Raises ":key1 is not a Hash matching {schema with keys [:msg]}"
Complex nested multiple choice constraints can lead to confusing error messages, slower performance, and increased memory consumption. For best results, try to push multiple choice options as deep into the schema as possible, or use your own code to decide which schema to pass to ClassyHash. For example:
hash = { data: { key: 1.0 }}
# Not recommended (confusing errors, slower validation)
bad_schema = {
data: [ { key: String }, { key: Integer } ]
}
# Raises :data[:key] is not a/an String, :data is not one of a Hash matching {schema with keys [:key]}, a Hash matching {schema with keys [:key]}
CH.validate(hash, bad_schema)
# Recommended
good_schema = {
data: { key: [ String, Integer ] }
}
# Raises :data[:key] is not a/an String, :data[:key] is not one of a/an String, a/an Integer
CH.validate(hash, good_schema)
# Alternative
schema1 = {
data: { key: String }
}
schema2 = {
data: { key: Integer }
}
# Raises ":data[:key] is not a/an Integer" or ":data[:key] is not a/an String"
CH.validate(hash, api_v2 ? schema2 : schema1)
Arrays
You can use Classy Hash to validate the members of an array. Array constraints are specified by double-array-wrapping a multiple choice list. Array constraints can also themselves be part of a multiple choice list or array constraint. Empty arrays are always accepted by array constraints.
If the error messages are too verbose, you can pass in an :errors
array or
retrieve the first entry from the exception's .entries
. Typically the first
entry will be the most useful, but this is not guaranteed.
# Simple array of integers
schema = {
key1: [[Integer]]
}
ClassyHash.validate({ key1: [] }, schema) # Returns true
ClassyHash.validate({ key1: [1, 2, 3, 4, 5] }, schema) # Returns true
ClassyHash.validate({ key1: [1, 2, 3, 0.5] }, schema) # Raises ":key1[3] is not a/an Integer, :key1[3] is not one of a/an Integer"
ClassyHash.validate({ key1: false }, schema) # Raises ":key1 is not an Array of one of a/an Integer"
# An integer, or an array of arrays of strings
schema = {
key1: [Integer, [[ [[ String ]] ]]]
}
ClassyHash.validate({ key1: 1 }, schema) # Returns true
ClassyHash.validate({ key1: [ [], ['a'], ['b', 'c'] ] }, schema) # Returns true
# Raises :key1[0] is not an Array of one of a/an String, :key1[0] is not an
# Array of an Array of one of a/an String, :key1 is not one of a/an Integer, an
# Array of an Array of an Array of one of a/an String
ClassyHash.validate({ key1: ['bad'] }, schema)
# Raises :key1[2][1] is not a/an String, :key1[2][1] is not one of a/an String,
# :key1[2] is not an Array of an Array of one of a/an String, :key1 is not one
# of a/an Integer, an Array of an Array of an Array of one of a/an String
ClassyHash.validate({ key1: [ [], ['a'], ['b', false] ] }, schema)
If you want to check the length of an array, you can use a Proc
(also see
CH::G.array_length
in the Generators section below):
# An array of two integers
schema = {
key1: ->(v){
if v.is_a?(Array) && v.length == 2
begin
ClassyHash.validate({k: v}, {k: [[Integer]]})
true
rescue => e
"valid: #{e}"
end
else
"an array of length 2"
end
}
}
ClassyHash.validate({ key1: [1, 2] }, schema) # Returns true
ClassyHash.validate({ key1: [1, false] }, schema) # Raises ":key1 is not valid: :k[1] is not one of a/an Integer"
ClassyHash.validate({ key1: [1] }, schema) # Raises ":key1 is not an array of length 2"
Generators
Version 0.1.1 of Classy Hash introduces some helper methods in
ClassyHash::Generate
(or the CH::G
alias introduced in 0.1.2) that will
generate a constraint for common tasks that are difficult to represent in the
base Classy Hash syntax.
Composite and negated constraints
You can combine multiple constraints in an AND or NAND fashion using the
Composite generators, .all
and .not
. Because composite constraints
can be complex and confusing, they should be used only when other
approaches would be more complex and confusing. Composite constraints
were added in version 0.2.0.
The .all
generator requires all constraints to pass.
schema = {
key1: CH::G.all(Integer, 1.0..100.0)
}
ClassyHash.validate({ key1: 5 }, schema) # Returns true
ClassyHash.validate({ key1: BigDecimal(5) }, schema) # Raises ":key1 is not all of [one of a/an Integer, a Numeric in range 1.0..100.0]"
The .not
generator requires all constraints to fail.
schema = {
key1: CH::G.not(Rational, BigDecimal, 'a'..'c', 10..15)
}
ClassyHash.validate({ key1: 5 }, schema) # Returns true
ClassyHash.validate({ key1: 10 }, schema) # Raises ':key1 is not none of [one of a/an Rational, a/an BigDecimal, a String in range "a".."c", an Integer in range 10..15]'
ClassyHash.validate({ key1: Rational(3, 5) }, schema) # Raises ':key1 is not none of [one of a/an Rational, a/an BigDecimal, a String in range "a".."c", an Integer in range 10..15]'
ClassyHash.validate({ key1: 'Good' }, schema) # Returns true
ClassyHash.validate({ key1: 'broken' }, schema) # Raises ':key1 is not none of [one of a/an Rational, a/an BigDecimal, a String in range "a".."c", an Integer in range 10..15]'
The .all
and .not
generators become more useful when combined:
schema = {
# Note: this case could also be represented as key1: [1..9, 21..100]
# Also note that Float ranges are used because ClassyHash only accepts
# Integer values for Integer ranges; this is important for .not().
key1: CH::G.all(Integer, 1.0..100.0, CH::G.not(10.0..20.0))
}
ClassyHash.validate({ key1: 9 }, schema) # Returns true
ClassyHash.validate({ key1: 10 }, schema) # Raises :key1 is not all of [one of a/an Integer, a Numeric in range 1.0..100.0, none of [one of a Numeric in range 10.0..20.0]]
ClassyHash.validate({ key1: 25.0 }, schema) # Raises :key1 is not all of [one of a/an Integer, a Numeric in range 1.0..100.0, none of [one of a Numeric in range 10.0..20.0]]
Note that .not
may accept a value for reasons you don't expect, since
its parameters are treated as ordinary ClassyHash constraints, and only
requires that its constraints raise some kind of error. For example,
CH::G.not(5..10)
will allow 6.0
but not 6
.
Enumeration
As of version 0.2.0, the enum
generator is a deprecated compatibility method
that generates a Set
. See the above documentation for Set
constraints.
# Enumerator -- value must be one of the elements provided
schema = {
key1: CH::G.enum(1, 2, 3, 4)
}
ClassyHash.validate({ key1: 1 }, schema) # Returns true
ClassyHash.validate({ key1: -1 }, schema) # Raises ":key1 is not an element of [1, 2, 3, 4]"
Arbitrary length
The arbitrary length generator checks the length of any type that responds to
:length
.
# Simple length generator -- length of value must be equal to a value, or
# within a range
schema = {
key1: CH::G.length(5..6)
}
ClassyHash.validate({ key1: '123456' }, schema) # Returns true
ClassyHash.validate({ key1: {a: 1, b: 2, c: 3, d: 4, e: 5} }, schema) # Returns true
ClassyHash.validate({ key1: [1, 2] }, schema) # Raises ":key1 is not of length 5..6"
ClassyHash.validate({ key1: 5 }, schema) # Raises ":key1 is not a type that responds to :length"
String or Array length
Since checking the length of a String
or an Array
is very common, there are
generators that will verify a value is the correct type and the correct
length.
# String length generator
schema = {
key1: CH::G.string_length(0..15)
}
ClassyHash.validate({ key1: 'x' * 15 }, schema) # Returns true
ClassyHash.validate({ key1: 'x' * 16 }, schema) # Raises ":key1 is not a String of length 0..15"
ClassyHash.validate({ key1: false }, schema) # Raises ":key1 is not a String of length 0..15"
The Array
length constraint generator also checks the values of the array.
# Array length generator
schema = {
key1: CH::G.array_length(4, Integer, String)
}
ClassyHash.validate({ key1: [1, 'two', 3, 4] }, schema) # Returns true
ClassyHash.validate({ key1: [1, 2, false, 4] }, schema) # Raises ":key1 is not valid: :array[2] is not a/an Integer, :array[2] is not one of a/an Integer, a/an String"
ClassyHash.validate({ key1: false }, schema) # Raises ":key1 is not an Array of length 4"
A practical example (user and address)
Here's a more practical application of Classy Hash. Suppose you have an API
that accepts POSTs containing JSON user data, that you convert to a Ruby
Hash with something like JSON.parse(data, symbolize_names: true)
. The user
should have some basic fields, an array of addresses, and no extra fields.
Here's how you might use Classy Hash to validate your user objects and generate helpful error messages:
# Note: this is not guaranteed to be a useful address checking schema.
address_schema = {
street1: /[0-9]+/,
street2: [NilClass, String],
city: String,
state: [NilClass, String],
country: String,
postcode: [NilClass, Integer, String]
}
user_schema = {
id: Integer,
name: String,
email: ->(v){ (v.is_a?(String) && v.include?('@')) || 'an e-mail address' },
addresses: [[ address_schema ]]
}
data = <<JSON
{
"id": 1,
"name": "",
"email": "@",
"addresses": [
{
"street1": "123 Fake Street",
"street2": null,
"city": "",
"state": "",
"country": "",
"postcode": ""
},
{
"street1": "Building 53",
"street2": "",
"city": 5
}
]
}
JSON
# Raises "Top level is not valid: contains members not specified in schema"
ClassyHash.validate({ :extra_key => 0 }, user_schema, strict: true)
# Raises :addresses[1][:city] is not a/an String, :addresses[1] is not one of a
# Hash matching {schema with keys [:street1, :street2, :city, :state, :country,
# :postcode]}
ClassyHash.validate(JSON.parse(data, symbolize_names: true), user_schema, strict: true)
Testing
Classy Hash includes extremely thorough RSpec tests:
# Execute within a clone of the classy_hash Git repository:
bundle install
rspec
Who wrote it?
Classy Hash was written by Mike Bourgeous for API validation and documentation in internal DeseretBook.com systems, and subsequently enhanced by inside and outside contributors. See the Git history for details.
Alternatives
If you decide Classy Hash isn't for you, here are some of the other options we considered before deciding to roll our own:
License
Classy Hash is released under the MIT license (see the LICENSE
file for the
license text and copyright notice, and the git history for more contributors).