Candy
"Mongo like candy!" -- Blazing Saddles
Candy's goal is to provide the simplest possible object persistence for the MongoDB database. By "simple" we mean "nearly invisible." Candy doesn't try to mirror ActiveRecord or DataMapper. Instead, we play to MongoDB's unusual strengths -- extremely fast writes and a set of field-specific update operators -- and do away with the cumbersome, unnecessary methods of last-generation workflows.
Methods like find.
Or save.
Overview
When you mix the Candy::Piece module into a class, the class gains a Mongo collection as an alter ego. Objects are saved to Mongo the first time you set a property. Any property you set thereafter is sent to Mongo immediately and atomically. You don't need to declare the properties; we use method_missing to drive the getting and setting of any field you want in any record. Or you can use the hashlike [] and []= operators if that's more in your comfort zone.
class Person
include Candy::Piece
end
me = Person.new
me.last_name = 'Eley' # New record created and saved to Mongo
me.id # => ObjectID(4bb606f9609c8417cf00004b) or thereabouts
me[:height] = 67 # Or me.height = 67 -- either way, updates with a Mongo $set
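If you're curious how "save on first set, $set thereafter" can hang off method_missing, here's a minimal sketch. It is not Candy's actual code: FakeCollection and SketchPiece are made-up names, and an in-memory hash stands in for the Mongo collection so the example runs without a database.

```ruby
# Hypothetical stand-in for a Mongo collection: documents in a hash keyed by id.
class FakeCollection
  attr_reader :docs, :log

  def initialize
    @docs = {}
    @log = []      # every write, in order, so we can see what was "sent"
    @next_id = 0
  end

  def insert(doc)
    id = (@next_id += 1)
    @docs[id] = doc.merge(_id: id)
    @log << [:insert, id]
    id
  end

  # Mimics the shape of a field-level update: {'$set' => {field => value}}
  def update(id, operation)
    operation.fetch('$set').each { |field, value| @docs[id][field] = value }
    @log << [:update, id, operation]
  end
end

# Hypothetical mixin (an assumption, not Candy::Piece itself).
module SketchPiece
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def collection
      @collection ||= FakeCollection.new
    end
  end

  def id
    @id
  end

  def [](field)
    return nil unless @id
    self.class.collection.docs[@id][field.to_sym]
  end

  # The first assignment inserts the document; later ones send a one-field $set.
  def []=(field, value)
    if @id
      self.class.collection.update(@id, { '$set' => { field.to_sym => value } })
    else
      @id = self.class.collection.insert({ field.to_sym => value })
    end
    value
  end

  # Undeclared setters and getters are routed to []= and [].
  def method_missing(name, *args)
    if name.to_s.end_with?('=')
      self[name.to_s.chomp('=')] = args.first
    elsif args.empty?
      self[name]
    else
      super
    end
  end
end

class Person
  include SketchPiece
end

me = Person.new
me.last_name = 'Eley'   # inserts the document
me[:height] = 67        # single-field $set update
```

A real implementation would also want respond_to_missing? and some care around names that Ruby itself probes for; the sketch leaves those out to keep the core idea visible.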
Embedded Documents
We got 'em. Candy pieces can contain each other recursively, to any arbitrary depth. There's no need for complex has_and_belongs_to_many :through {:your => 'mother'} type declarations. Just assign an object or a bunch of objects to a field. Hashes and arrays become Candy-aware analogues of themselves (CandyHash and CandyArray) with live updating and the same recursive embedding. Non-Candy objects are serialized into a flat hash structure that retains their class and instance variables, so they can be rehydrated later.
me.favorites = { composer: 'Yoko Kanno',
seafood: 'Maryland blue crabs',
scotch: ['Glenmorangie Port Wood Finish',
'Balvenie Single Barrel']}
me.spouse = Person.piece(first_name: 'Anna', eyes: :blue)
me.spouse.eyes # => :blue
me.favorites.scotch[1] # => 'Balvenie Single Barrel'
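The "flat hash that retains class and instance variables" idea can be sketched in a few lines of plain Ruby. The helper names (dehydrate, rehydrate), the 'class' key, and the Coordinates class are all assumptions made for illustration; Candy's real storage layout may differ.

```ruby
# Hypothetical helper: turn any object into a flat hash of its class name
# plus its instance variables.
def dehydrate(object)
  hash = { 'class' => object.class.name }
  object.instance_variables.each do |ivar|
    hash[ivar.to_s.delete_prefix('@')] = object.instance_variable_get(ivar)
  end
  hash
end

# Hypothetical helper: rebuild the object from that hash.
def rehydrate(hash)
  klass = Object.const_get(hash.fetch('class'))
  object = klass.allocate          # skip initialize; restore state directly
  hash.each do |key, value|
    next if key == 'class'
    object.instance_variable_set("@#{key}", value)
  end
  object
end

# An ordinary class that knows nothing about persistence.
class Coordinates
  attr_reader :lat, :lng

  def initialize(lat, lng)
    @lat = lat
    @lng = lng
  end
end

stored = dehydrate(Coordinates.new(39.29, -76.61))
copy = rehydrate(stored)
```

Note the obvious weakness of this particular sketch: an instance variable literally named @class would collide with the class-name key.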
Retrieval
Again, transparency is the key. The same method_missing tactic applies to class methods to retrieve individual records:
Person.last_name('Smith') # Returns the first Smith
Person.age(21) # Returns the first legal drinker (in the U.S.)
Person(12345) # Returns the person with an _id of 12345
Take note of that last example. It's moderately deep magic, and we take care not to stomp on any class-like methods you've already defined. But it's the simplest possible way to retrieve a record by ID. Person.first('_id' => 12345) works too, of course.
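For the curious, both tricks (field finders on the class, and a global method that shares the class's name) are plain Ruby. The sketch below is an assumption about how such magic could be built, not a copy of Candy's internals: SketchFinder and define_id_shortcut are invented names, and records live in an array rather than in Mongo.

```ruby
# Hypothetical module, extended onto a class to give it finder magic.
module SketchFinder
  def records
    @records ||= []
  end

  # Person.last_name('Smith') returns the first record whose :last_name is 'Smith'.
  def method_missing(field, *args)
    return super unless args.length == 1
    records.find { |record| record[field] == args.first }
  end

  # Defines a global method named after the class, so Person(2) works,
  # but backs off if a method by that name already exists.
  def define_id_shortcut
    shortcut = name.to_sym
    return if Object.method_defined?(shortcut) || Object.private_method_defined?(shortcut)
    finder = self
    Object.send(:define_method, shortcut) do |id|
      finder.records.find { |record| record[:_id] == id }
    end
  end
end

class Person
  extend SketchFinder
  define_id_shortcut
end

Person.records << { _id: 1, last_name: 'Smith', age: 21 }
Person.records << { _id: 2, last_name: 'Smith', age: 34 }

first_smith = Person.last_name('Smith')   # the record with _id 1
by_id = Person(2)                         # the record with _id 2
```

Ruby parses a capitalized name followed by parentheses as a method call (the same way Integer('1') works), which is what makes the Person(2) form possible.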
Collections
Some applications don't need to iterate through all records of a query; you might just need the first record from a queue or something. When you do need them, the anonymous "sort of like an array, except when it isn't" encapsulation of collections in other ORMs is clunky and confusing. So enumerable cursors live in their own Candy::Collection module, which you explicitly mix into a class and then link back to the Candy::Piece class:
class People
include Candy::Collection
collects :person # Declares the Mongo collection is 'Person'
end # (and so is the Candy::Piece class)
People.last_name('Smith') # Returns an enumeration of all Smiths
People.age(19).sort(:birthdate, :down).limit(10) # We can chain options
People(limit: 47, occupation: :ronin) # Or People.all(params) or People.new(params)
People.each {|p| p.shout = 'Norm!'} # Where everybody knows your name...
You can also, of course, just do People.new() with a bunch of query conditions. You don't need two separate hashes for your fields and your Mongo options; Candy knows which keys are MongoDB query options and will automatically separate them for you. The collection module is really just a thin wrapper around a Mongo::Cursor and passes most of its behavior to the cursor -- so you can do each, next, et cetera.
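Separating options from conditions is simpler than it sounds. Here's a rough sketch, assuming a fixed list of option keys; the list below is illustrative only, and which keys Candy actually recognizes isn't specified here.

```ruby
# Illustrative option keys (an assumption, not Candy's real list).
QUERY_OPTIONS = %i[limit skip sort fields].freeze

# Hypothetical helper: returns [conditions, options] from one mixed hash.
def split_query(params)
  options, conditions = params.partition { |key, _value| QUERY_OPTIONS.include?(key) }
  [conditions.to_h, options.to_h]
end

conditions, options = split_query({ limit: 47, occupation: :ronin })
# conditions holds the field match; options holds the cursor settings
```

The obvious trade-off: any document field that happens to share a name with an option key can't be queried this way without an escape hatch.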
Q: Why can't I just have Person automatically link to People? I want my Raaaaails!
A: Because including ActiveSupport as a dependency would be nuts, whereas pasting in my own table of plural inflections would merely double the code base. I'm not against magic, obviously, but that's expensive magic for little benefit. You'll just have to type those three lines of code yourself.
Prerequisites
- Ruby 1.9.x: The code uses the new hash syntax, UTF-8 encoding, and 1.9ish enumerable methods. No whining. If you're starting a new project in mid-2010 or later and you're still using 1.8, you're hurting us all. And kittens. You don't want to hurt kittens, do you?
- MongoDB 1.4+: You could probably get away with 1.2 for some functionality, but the new array operators and findAndModify were too useful to pass up. It's a safe and easy upgrade, so if you're not on the latest Mongo yet... Well, you're not hurting kittens, but you're hurting yourself.
- mongo gem 0.19+: The Ruby gem seems to lag behind actual Mongo development by quite a bit sometimes. 0.19.1 is the latest at the time of this writing, and some commands (e.g. findAndModify) have been implemented in Candy because the gem doesn't have methods for them yet. We'll continue to streamline our code as the driver allows.
Installation
Come on, you've done this before:
$ sudo gem install candy
(Or leave off the sudo if you're smart enough to be using RVM.)
Configuration
The simplest possible thing that works:
class Zagnut
include Candy::Piece
end
That's it. Honest. Some Mongo plumbing is hooked in and instantiated the first time the .collection attribute is accessed:
Zagnut.connection # => Defaults to localhost port 27017
Zagnut.db # => Defaults to your username, or 'candy' if unknown
Zagnut.collection # => Defaults to the class name ('Zagnut')
You can override the DB or collection by providing name strings or Mongo::DB and Mongo::Collection objects. Or you can set certain module-level properties to make it easier for multiple Candy classes in an application to use the same database:
- Candy.host
- Candy.port
- Candy.connection
- Candy.connection_options (A hash of options to the Connection object)
- Candy.db (Can provide a string or a database object)
All of the above is pretty general-purpose. If you want to use this class-based Mongo functionality in your own projects, simply include Candy::Crunch in your own classes.
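The "module-level default, class-level override" pattern described above looks roughly like this in plain Ruby. SketchSettings is a hypothetical name, the values are just strings and numbers rather than real Mongo::DB or Mongo::Collection objects, and the username lookup through ENV['USER'] is an assumption.

```ruby
# Hypothetical settings module; not Candy's or Candy::Crunch's actual code.
module SketchSettings
  class << self
    attr_writer :host, :port, :db

    def host
      @host ||= 'localhost'
    end

    def port
      @port ||= 27017
    end

    # Falls back to the current username, or 'candy' if it is unknown.
    def db
      @db ||= ENV['USER'] || 'candy'
    end
  end

  # Extended onto each class: use the class's own value if one was set,
  # otherwise fall back to the shared module-level setting.
  module ClassMethods
    attr_writer :db, :collection

    def db
      @db || SketchSettings.db
    end

    def collection
      @collection ||= name   # defaults to the class name
    end
  end
end

class Zagnut
  extend SketchSettings::ClassMethods
end

class Wrapper
  extend SketchSettings::ClassMethods
end

SketchSettings.db = 'candy_store'   # shared by every class...
Wrapper.db = 'recycling'            # ...unless a class overrides it
```

Nothing is computed until something asks for it, which mirrors the lazy "hooked in the first time it's accessed" behavior described earlier.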
Using It
The trick here is to think of Candy objects like OpenStructs. Or if that's too technical, imagine the objects as thin candy shells around a chewy method_missing center:
class Zagnut
include Candy::Piece
end
zag = Zagnut.new # A blank document enters the Zagnut collection
zag.taste = "Chewy!" # Properties are created and saved as they're used
zag.calories = 600
nut = Zagnut.taste("Chewy!") # Or Zagnut(taste: 'Chewy!')
nut.calories # => 600
kingsize = Zagnut.new
kingsize.calories = 900 # Or kingsize[:calories] = 900
kingsize.ingredients = ['cocoa', 'peanut butter']
kingsize.ingredients << ['corn syrup']
kingsize.nutrition = { sodium: '115mg', protein: '3g' }
kingsize.nutrition.fat = {saturated: '4g', total: '9g'}
kingsize[:nutrition][:fat][:saturated] # => '4g'
class Zagnuts
include Candy::Collection
collects Zagnut
end
bars = Zagnuts # Or Zagnuts.all or Zagnuts.new
bars.count # => 2
sum = Zagnuts.inject(0) {|total, bar| total + bar.calories} # => 1500
Note that writes are always live, but reads hold onto the retrieved document and cache its values to avoid query delays. You can force a requery at any time with the refresh method. (An expiration feature wherein documents are requeried after a set time has elapsed is being considered for the future.)
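To make the live-writes, cached-reads distinction concrete, here's a small sketch. CachedDocument is an invented class and the "database" is a plain hash, so treat it as an illustration of the caching behavior rather than Candy's implementation.

```ruby
# Hypothetical document wrapper: writes go straight to the store,
# reads come from a local snapshot until refresh is called.
class CachedDocument
  def initialize(store, id)
    @store = store   # stand-in for the collection: a hash of id => document
    @id = id
    refresh
  end

  # Reads never touch the store.
  def [](field)
    @cache[field]
  end

  # Writes hit the store immediately and update the snapshot too.
  def []=(field, value)
    @store[@id][field] = value
    @cache[field] = value
  end

  # Throw away the snapshot and re-read the document.
  def refresh
    @cache = @store.fetch(@id).dup
    self
  end
end

store = { 1 => { calories: 600 } }
zag = CachedDocument.new(store, 1)

store[1][:calories] = 650        # somebody else changes the document
stale = zag[:calories]           # still the cached value
fresh = zag.refresh[:calories]   # requeried from the store
```

The practical upshot is that two objects pointing at the same document can disagree until one of them refreshes.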
Advanced Classes
Candy properties are fundamentally just entries in a hash, with some hooks to the MongoDB $set updater when something changes. The primary reason we've implemented Candy as modules is so that you keep control of your own classes' behavior and inheritance. To have properties that don't store to the Mongo collection, all you have to do is define them explicitly:
class Weight
include Candy::Piece
attr_accessor :gravity # This won't be stored in MongoDB
def kilograms
pounds / 2.2046 # 'pounds' is undeclared, so Candy retrieves it
end
def kilograms=(val)
self.pounds = val * 2.2046 # 'pounds=' is undeclared; Candy stores it
end
end
Embedded hashes are of type CandyHash unless you explicitly assign an object that includes Candy::Piece. (CandyHash itself is really just a Candy piece that doesn't store its classname.) If you want truly quick-and-dirty persistence, you can even just use a CandyHash as a standalone object and skip creating your own classes:
hash = CandyHash.new(foo: 'bar')
hash[:yoo] = :yar # Persists to the 'candy' collection by default
hash.too = [:tar, :car, :far]
hash2 = CandyHash(hash.id)
hash2.foo # => 'bar'
hash2.yoo # => :yar
hash2[:too][1] # => :car
Embedded arrays are of type CandyArray. Unlike CandyHashes, CandyArrays do not include Candy::Piece and cannot operate as standalone objects. They only make sense when embedded in a Candy piece. That's just the way Mongo works.
Good Practices
(I'm not going to call them "Best" practices because you might think of better ones than me.)
Validations
The long-term plan includes support for ActiveModel features as an optional extension. In the meantime, one simple trick is to decorate your properties:
class Person
include Candy::Piece
def email=(val)
raise "Invalid email address!" if val !~ /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}$/
super
end
end
Exceptions for validation failures? Why yes. One consequence of a doctrine of instant persistence is that not persisting connotes something's wrong. I personally prefer rescues to deeply nested 'else' clauses to handle failures anyway. If you disagree, you could implement the above in a kinder, gentler way:
def email=(val)
if val =~ /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}$/
super
else
(@errors ||= []) << "Invalid email address!"
false # (Or nil, but that gets confusing if you can actually assign nil.)
end
end
If that seems like a lot of work compared to validates_format_of -- well, you're right. As I said, it's coming. In the meantime, just don't validate frivolously. Leave minor tips and cleanups to the interface layer, and only refuse to save in cases where allowing the value would break something.
Named Scopes
Dirt simple. Use a method in the collection class.
class People
include Candy::Collection
collects :person
def voters
age('$gte' => 18).citizen(true)
end
end
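A scope like this only works if each condition call returns something you can keep chaining on. Here's a sketch of that chaining in isolation, using invented names (SketchCriteria, SketchPeople) and no database at all; it just accumulates the query conditions so you can see what the scope would ask for.

```ruby
# Hypothetical stand-in for a collection class. It only collects conditions.
class SketchCriteria
  attr_reader :conditions

  def initialize(conditions = {})
    @conditions = conditions
  end

  # age('$gte' => 18) returns a NEW criteria object with :age added,
  # so calls chain without mutating the receiver.
  def method_missing(field, *args)
    return super unless args.length == 1
    self.class.new(conditions.merge(field => args.first))
  end
end

class SketchPeople < SketchCriteria
  # A "named scope" is just a method that chains conditions.
  def voters
    age({ '$gte' => 18 }).citizen(true)   # 18 and over
  end
end

everyone = SketchPeople.new
scope = everyone.voters
```

Because every step returns a fresh object of the same class, scopes can call other scopes, and the starting collection is left untouched.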
Required Fields
Instant document creation makes it problematic to wait to see if a user's going to fill in a field. One tactic is to require those fields in the constructor:
class Person
include Candy::Piece
def initialize(options={})
raise "Last name is required!" unless options[:last_name]
raise "Email address is required!" unless options[:email]
super
end
end
p = Person.new(last_name: 'Eley', email: 'sfeley@gmail.com') # This is valid
Or you can use the above scoping trick to make sure your application's standard collections operate only on records that are "complete." You could even make it intrinsic to the class itself:
class ValidPeople < People
def initialize(options={})
options.merge!(last_name: {'$exists' => true}, email: {'$exists' => true})
super
end
end
Philosophy
Even relative to other ORMs, Candy's pretty opinionated. Here are some of the opinions behind the design.
- Applications should be beautiful.
- In a beautiful application, most of the code clearly and obviously furthers primary activities. (Business needs, use cases, user stories, the critical path, call it whatever you want.)
- Data storage is not a primary activity. It's a supporting activity. It's something you have to do so that the primary activities you do today remain done tomorrow.
- An application structure which reflects the constraints of supporting activities more than the achievement of primary activities is not beautiful.
- Current Ruby ORMs go a long way towards eliminating the boilerplate cruft found in other languages' frameworks. But they don't go far enough.
- Save sucks. The 'write-now-commit-later' pattern of most ORMs creates brittleness and uncertainty. If I change the state of something, I want it changed. My framework shouldn't hold its breath waiting for me to say "Simon Says."
- Frameworks should be beautiful.
- A beautiful framework is one whose fundamentals can be read and understood by a journeyman developer with appropriate background knowledge in one sitting.
- Frameworks that are too large or complex to be beautiful can sometimes be broken down into smaller frameworks that are beautiful.
- Beautiful frameworks should be transparent and non-constraining. A truly beautiful framework is one you have to squint to see.
- Magic (defined as "behavior whose workings are not immediately apparent") is fine in a framework. In a sense it's what frameworks are. But magical behavior should be restrained, consistent, clearly documented, and must not violate the Principle of Least Surprise.
- Thomas Jefferson wrote that software frameworks should be subject to revolution every couple of years. ("The tree of agility must be refreshed from time to time with the blood of senior architects and project managers. It is its natural manure.") I will be disappointed if Candy is not roundly decried for being too complex, bloated, and intrusive by 2013 at the latest.
- There is a finite amount of seriousness allowed in any open source project. A project that takes itself too seriously is using up its reserves, and is less likely to be taken seriously by its user base.
- A project that appeals to playfulness is more likely to be explored with vigor, and if it withstands the exploration, creates passion.
- A README with this many bullet points in a single list should probably stop.
Caveats, Limitations, and To-Dos
This is very, very alpha software. I'm using it in some non-trivial projects right now, but it's far from bulletproof, and a lot of things aren't implemented yet. In particular:
- The API is still in flux and subject to overhauling, undermining, or carpetbombing at any time. Candy v0.2, for instance, has barely a wisp of resemblance to Candy v0.1. (My apologies to the 155 of you who downloaded 0.1.)
- CandyHashes and CandyArrays don't yet implement the full set of methods you'd expect from hashes and arrays. I mean to flesh them out to make them more compatible. (You can help by creating issues to tell me what methods you need most.)
- Collections are not terribly robust nor well-tested yet. They 'work' in the sense that they pass a bunch of things to Mongo::Cursor, but I personally consider the cursor functionality to be a bit wonky. I'd like to make enumerations more repeatable and have the cursors more certain to be released after garbage collection.
- Currently every property assignment is a separate write to the database (mostly using $set.) This is fine, but for cases where a lot of properties are set at once we will eventually have transaction-like behavior using blocks.
- Many Mongo update operators, such as $pushAll and $pop and $addToSet, are not implemented yet or are not fully leveraged. (Saving a full document isn't implemented either, but that's a deliberate feature.)
- For high-concurrency use cases or for huge documents, more control of the document caching is called for. I'd like to have an option to declare only certain fields to be retrieved by default, and have the internal cache expire after a set time or clear itself on every read.
- 'Safe mode' is never used. Making it an option for classes or specific updates would be...well...safer.
- There's no support yet for deleting records, apart from driver calls on the class's collection. Somebody might want to someday.
- Index creation is currently left as an exercise to be performed out-of-band. I do believe a proper persistence framework, even a transparent one, should have some facility for it.
- Likewise, there's no way yet to set interesting collection options (capped collections, etc.) except to make the Mongo::Collection object separately and hand it to the class.
- For that matter, capped collections haven't been tested at all and might operate weirdly if properties are continually being set on them.
- I have only begun optimizing for code beauty, and have not optimized at all yet for performance. Mongo is fast. I make no guarantees that my code is fast at this time.
- I haven't tested it at all in Windows. Witness my regret. (Wait, there isn't any.)
- This library isn't thread-safe yet. (Which is to say: I haven't tried to confirm one way or the other, but I'd be shocked if it was.)
- There's no support yet for ActiveModel or similar validations, et cetera. It's on my list to create an extension system, with Rails 3 and ActiveModel support being the first use case. Right now this is more of a Sinatra sort of data thingy than a Rails data thingy.
Resources
We have the usual array of stuff for your learning pleasure...
- Home page: http://github.com/SFEley/candy
- Documentation: http://rdoc.info/projects/SFEley/candy
- Report issues: http://github.com/SFEley/candy/issues
- Discussion list: http://groups.google.com/group/candy-users
Contributing
At this early stage, one of the best things you could do is just to tell me that you have an interest in using this thing. Join the discussion list and let us know what you think.
Beyond that, report issues, please. If you want to fork it and add features, fabulous. Send me a pull request.
Oh, and if you like science fiction stories, check out my podcast Escape Pod. End of plug.
License
This project is licensed under the Don't Be a Dick License, version 0.1, and is copyright 2010 by Stephen Eley. See the LICENSE.markdown file for elaboration on not being a dick. (But you probably already know.)