ActiveLoaders
- Automatically preload associations for your serializers
- Specify custom SQL snippets for virtual attributes (Query attributes)
- Write custom preloading logic in a reusable way
Note: the API of this gem is still unstable and may change between versions. This project uses semantic versioning, however until version 1.0.0, minor version (MAJOR.MINOR.PATCH) changes may include API changes, but patch version will not)
A 30-min talk about Datasource
Install
Ruby version requirement:
- MRI 2.0 or higher
- JRuby 9000
Supported ORM:
- ActiveRecord
- Sequel
Add to Gemfile (recommended to use github version until API is stable)
gem 'active_loaders', github: 'kundi/active_loaders'
bundle install
rails g datasource:install
Upgrade
rails g datasource:install
Introduction
The most important role of ActiveLoaders is to help prevent and fix the N+1 queries problem when using Active Model Serializers.
This gem depends on the datasource gem that handles actual data loading. What this gem adds on top of it is integration with Active Model Serializers. It will automatically read your serializers to make datasource preload the necessary associations. Additionally it provides a simple DSL to configure additional dependencies and test helpers to ensure your queries are optimized.
ActiveLoaders will automatically recognize associations in your serializer when you use
the has_many
or belongs_to
keywords:
class PostSerializer < ActiveModel::Serializer
belongs_to :blog
has_many :comments
end
In this case, it will then look in your BlogSerializer and CommentSerializer to properly load them as well (so it is recursive).
When you are using loaded values (explained below), ActiveLoaders will automatically
use them if you specify the name in attributes
. For example if you have a
loaded :comment_count
it will automatically be used if you have
attributes :comment_count
in your serializer.
In case ActiveLoaders doesn't automatically detect something, you can always manually specify it in your serializer using a simple DSL.
A test helper is also provided which you can ensure that your serializers don't produce N+1 queries.
Associations
The most noticable magic effect of using ActiveLoaders is that associations will automatically be preloaded using a single query.
class PostSerializer < ActiveModel::Serializer
attributes :id, :title
end
class UserSerializer < ActiveModel::Serializer
attributes :id
has_many :posts
end
SELECT users.* FROM users
SELECT posts.* FROM posts WHERE id IN (?)
This means you do not need to call includes
yourself. It will be done
automatically.
Manually include
In case you are not using has_many
or belongs_to
in your serializer but you are
still using the association (usually when you do not embed the association), then you
need to manually specify this in your serializer. There are two options depending on
what data you need.
includes: use this when you just need a simple includes
, which behaves the same
as in ActiveRecord.
class UserSerializer < ActiveModel::Serializer
attributes :id, :post_titles
loaders do
includes :posts
# includes posts: { :comments }
end
def post_titles
object.posts.map(&:title)
end
end
select: use this to use the serializer loading logic - the same recursive logic that
happens when you use has_many
or belongs_to
. This will also load associations and
loaded values (unless otherwise specified).
class UserSerializer < ActiveModel::Serializer
attributes :id, :comment_loaded_values
loaders do
select :posts
# select posts: [:id, comments: [:id, :some_loaded_value]]
end
def comment_loaded_values
object.posts.flat_map(&:comments).map(&:some_loaded_value)
end
end
class PostSerializer < ActiveModel::Serializer
attributes :id
has_many :comments
end
class CommentSerializer < ActiveModel::Serializer
attributes :id, :some_loaded_value
end
Query attribute
You can specify a SQL fragment for SELECT
and use that as an attribute on your
model. This is done through the datasource gem DSL. As a simple example you can
concatenate 2 strings together in SQL:
class User < ActiveRecord::Base
datasource_module do
query :full_name do
"users.first_name || ' ' || users.last_name"
end
end
end
class UserSerializer < ActiveModel::Serializer
attributes :id, :full_name
end
SELECT users.*, (users.first_name || ' ' || users.last_name) AS full_name FROM users
Note: If you need data from another table, use a loaded value.
Refactor with standalone Datasource class
If you are going to have more complex preloading logic (like using Loaded below),
then it might be better to put Datasource code into its own class. This is pretty
easy, just create a directory app/datasources
(or whatever you like), and create
a file depending on your model name, for example for a Post
model, create
post_datasource.rb
. The name is important for auto-magic reasons. Example file:
class PostDatasource < Datasource::From(Post)
query(:full_name) { "users.first_name || ' ' || users.last_name" }
end
This is completely equivalent to using datasource_module
in your model:
class Post < ActiveRecord::Base
datasource_module do
query(:full_name) { "users.first_name || ' ' || users.last_name" }
end
end
Loaded
You might want to have some more complex preloading logic. In that case you can use a method to load values for all the records at once (e.g. with a custom query or even from a cache). The loading methods are only executed if you use the values, otherwise they will be skipped.
First just declare that you want to have a loaded attribute (the parameters will be explained shortly):
class UserDatasource < Datasource::From(User)
loaded :post_count, from: :array, default: 0
end
By default, datasource will look for a method named load_<name>
for loading
the values, in this case load_newest_comment
. It needs to be defined in the
collection block, which has methods to access information about the collection (posts)
that are being loaded. These methods are scope
, models
, model_ids
,
datasource
, datasource_class
and params
.
class UserDatasource < Datasource::From(User)
loaded :post_count, from: :array, default: 0
collection do
def load_post_count
Post.where(user_id: model_ids)
.group(:user_id)
.pluck("user_id, COUNT(id)")
end
end
end
In this case load_post_count
returns an array of pairs.
For example: [[1, 10], [2, 5]]
. Datasource can understand this because of
from: :array
. This would result in the following:
post_id_1.post_count # => 10
post_id_2.post_count # => 5
# other posts will have the default value or nil if no default value was given
other_post.post_count # => 0
Besides default
and from: :array
, you can also specify group_by
, one
and source
. Source is just the name of the load method.
The other two are explained in the following example.
class PostDatasource < Datasource::From(Post)
loaded :newest_comment, group_by: :post_id, one: true, source: :load_newest_comment
collection do
def load_newest_comment
Comment.for_serializer.where(post_id: model_ids)
.group("post_id")
.having("id = MAX(id)")
end
end
end
In this case the load method returns an ActiveRecord relation, which for our purposes
acts the same as an Array (so we could also return an Array if we wanted).
Using group_by: :post_id
in the loaded
call tells datasource to group the
results in this array by that attribute (or key if it's an array of hashes instead
of model objects). one: true
means that we only want a single value instead of
an array of values (we might want multiple, e.g. newest_10_comments
).
So in this case, if we had a Post with id 1, post.newest_comment
would be a
Comment from the array that has post_id
equal to 1.
In this case, in the load method, we also used for_serializer
, which will load
the Comment
s according to the CommentSerializer
.
Note that it's perfectly fine (even recommended) to already have a method with the same name in your model. If you use that method outside of serializers/datasource, it will work just as it should. But when using datasource, it will be overwritten by the datasource version. Counts is a good example:
class User < ActiveRecord::Base
has_many :posts
def post_count
posts.count
end
end
class UserDatasource < Datasource::From(User)
loaded :post_count, from: :array, default: 0
collection do
def load_post_count
Post.where(user_id: model_ids)
.group(:user_id)
.pluck("user_id, COUNT(id)")
end
end
end
class UserSerializer < ActiveModel::Serializer
attributes :id, :post_count # <- post_count will be read from load_post_count
end
User.first.post_count # <- your model method will be called
Params
You can also specify params that can be read from collection methods. The params
can be specified when you call render
:
# controller
render json: posts,
loader_params: { include_newest_comments: true }
# datasource
loaded :newest_comments, default: []
collection do
def load_newest_comments
if params[:include_newest_comments]
# ...
end
end
end
Debugging and logging
Datasource outputs some useful logs that you can use debugging. By default the log level is
set to warnings only, but you can change it. You can add the following line at the end of your
config/initializers/datasource.rb
:
Datasource.logger.level = Logger::INFO unless Rails.env.production?
You can also set it to DEBUG
for more output. The logger outputs to stdout
by default. It
is not recommended to have this enabled in production (simply for performance reasons).
Using manually
When using a serializer, ActiveLoaders should work automatically. If for some reason
you want to manually trigger loaders on a scope, you can call for_serializer
.
Post.for_serializer.find(params[:id])
Post.for_serializer(PostSerializer).find(params[:id])
Post.for_serializer.where("created_at > ?", 1.day.ago).to_a
You can also use it on an existing record, but you must use the returned value (the record may be reloaded e.g. if you are using query attributes).
user = current_user.for_serializer
For even more advanced usage, see Datasource gem documentation.
Testing your serializer queries
ActiveLoaders provides test helpers to make sure your queries stay optimized. By default it expects there to be no N+1 queries, so after the initial loading of the records and associations, there should be no queries from code in the serializers. The helpers raise and error otherwise, so you can use them with any testing framework (rspec, minitest). You need to put some records into the database before calling the helper, since it is required to be able to test the serializer.
test_serializer_queries(serializer_class, model_class, options = {})
Here is a simple example in rspec with factory_girl:
require 'spec_helper'
require 'active_loaders/test'
context "serializer queries" do
include ActiveLoaders::Test
let(:blog) { create :blog }
before do
2.times {
create :post, blog_id: blog.id
}
end
it "should not contain N+1 queries" do
expect { test_serializer_queries(BlogSerializer, Blog) }.to_not raise_error
end
# example if you have N+1 queries and you can't avoid them
it "should contain exactly two N+1 queries (two queries for every Blog)" do
expect { test_serializer_queries(BlogSerializer, Blog, allow_queries_per_record: 2) }.to_not raise_error
end
end
Columns check
Recently (not yet released as of Rails 4.2), an accessed_fields
instance method
was added to ActiveRecord models. ActiveLoaders can use this information in your
tests to determine which attributes you are not using in your serializer. This check
is skipped if your Rails version doesn't support accessed_fields
.
Let's say your are not using User#payment_data in your serializer. You have this test:
it "should not contain N+1 queries" do
expect { test_serializer_queries(UserSerializer, User) }.to_not raise_error
end
Then this test will fail with instructions on how to fix it:
ActiveLoaders::Test::Error:
unnecessary select for User columns: payment_data
Add to UserSerializer loaders block:
skip_select :payment_data
Or ignore this error with:
test_serializer_queries(UserSerializer, User, ignore_columns: [:payment_data])
Or skip this columns check entirely:
test_serializer_queries(UserSerializer, User, skip_columns_check: true)
The instructions should be self-explanatory. Choosing the first option:
class UserSerializer < ActiveModel::Serializer
attributes :id, :title
loaders do
skip_select :payment_data
end
end
Would then produce an optimized query:
SELECT users.id, users.title FROM users
Getting Help
If you find a bug, please report an Issue.
If you have a question, you can also open an Issue.
Contributing
- Fork it ( https://github.com/kundi/active_loaders/fork )
- Create your feature branch (
git checkout -b my-new-feature
) - Commit your changes (
git commit -am 'Add some feature'
) - Push to the branch (
git push origin my-new-feature
) - Create a new Pull Request