The schema is ...
the_schema_is is a model schema annotation DSL for ActiveRecord models, enforced by Rubocop.
Why annotate?
An important part of a class's public interface is what attributes its objects have. In ActiveRecord, attributes are inferred from DB columns and can only be seen in db/schema.rb, which is unfortunate.
We believe this should be immediately available as part of the class definition. "It is drawn automatically from the DB" is kinda clever, but it does not help when reading the code. "Auto-deduction from the DB" could be used to compare the actual table contents to the definition in Ruby, but not to skip the definition.
Fun fact: most ORMs in other languages have chosen the "explicitly list attributes in the model" approach, for some reason! For example, Python's Django, Elixir's Ecto, Go's Beego and Gorm, Rust's Diesel, most of the popular NodeJS options, and PHP's Symfony (but, to be honest, not Laravel).
Well then, why not annotate gem?
The annotate gem provides a very powerful and configurable CLI/rake task that adds to your model (and factory/route/spec) files a comment looking like...
# == Schema Information
#
# Table name: users
#
# id :integer not null, primary key
# email :string default(""), not null
# encrypted_password :string default(""), not null
# last_sign_in_at :datetime
# last_sign_in_ip :inet
# created_at :datetime not null
# updated_at :datetime not null
# ....
It kinda achieves the goal, but in our experience, it also brings some problems:
- annotation regeneration is disruptive: it just replaces the whole block with a new one, which produces a lot of "false changes" (e.g. one field with a slightly longer name was added → the spacing of all fields changed);
- if the column order or defaults in the dev DB differ between developers' machines, annotate also decides to rewrite all the annotations, sometimes adding dozens of "changed" files to a PR;
- regeneration makes it hard to use schema annotations for commenting on or explaining some fields: regeneration will lose those comments, and comments-between-comments are hard to distinguish anyway;
- the syntax of annotations is kinda ad hoc, which makes it harder to add them by hand, so regeneration becomes the only way to add them.
So, how is your approach different?..
the_schema_is allows you to do this:
class User < ApplicationRecord
  the_schema_is "users" do |t|
    t.string "email", default: "", null: false
    t.string "encrypted_password", null: false
    t.datetime "last_sign_in_at"
    t.inet "last_sign_in_ip"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    # ...
  end
end
The idea is that it is exactly the same DSL that db/schema.rb uses, so:
- it can be just copied from there (or written by hand in the usual migration syntax);
- it is code, which can be supplemented with comments explaining what some column does, or why the defaults are this way; it can also be structured by reordering columns and adding extra blank lines.
So, in reality, your annotation may look like this:
class User < ApplicationRecord
  the_schema_is "users" do |t|
    t.string "email", default: "", null: false
    # We use RSA encryption currently.
    t.string "encrypted_password", null: false
    t.inet "last_sign_in_ip" # FIXME: Legacy, we don't use it anymore because GDPR
    t.datetime "last_sign_in_at"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    # ...
  end
end
Now, the the_schema_is gem consists of this DSL and custom Rubocop cops which check that the DSL in model classes corresponds to your db/schema.rb (and can automatically fix any discrepancies found).
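As a rough illustration of what "checking correspondence" starts from, the sketch below pulls column names and types for one table out of schema.rb-style text. It is a toy under stated assumptions: it uses regular expressions purely for brevity, the helper name columns_for is hypothetical, and the real cops work on Rubocop's parsed source rather than on raw lines.

```ruby
# Toy illustration only: not the gem's implementation.
SCHEMA = <<~RUBY
  ActiveRecord::Schema.define(version: 2020_01_01_000000) do
    create_table "users", force: :cascade do |t|
      t.string "email", default: "", null: false
      t.datetime "created_at", null: false
    end

    create_table "posts", force: :cascade do |t|
      t.string "title"
    end
  end
RUBY

# Hypothetical helper: returns { column_name => type } for one table.
def columns_for(schema_source, table)
  inside = false
  schema_source.each_line.with_object({}) do |line, columns|
    if line =~ /create_table "#{Regexp.escape(table)}"/
      inside = true                      # entered the requested table
    elsif inside && line =~ /^\s*end\b/
      break columns                      # the table's block is closed
    elsif inside && (match = line.match(/t\.(\w+)\s+"(\w+)"/))
      columns[match[2]] = match[1]       # e.g. "email" => "string"
    end
  end
end

p columns_for(SCHEMA, "users")
```

The same kind of name-to-definition map, built once from db/schema.rb and once from the model's block, is all a comparison needs.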
Using Rubocop's existing infrastructure brings several great benefits:
- you can easily include an "are all annotations up to date" check in your CI/pre-commit hooks;
- you can preview the problems found, and then fix them automatically (with rubocop -a) or manually, however you see fit;
- the changes made by auto-correct are very local (just adding/removing/changing the line for the relevant column), so your custom structuring, like separating groups of related columns with empty lines and comments, will be preserved;
- rubocop is easy to run on a sub-folder, a single file, or files matching some pattern; it is equally easy to permanently exclude a file or folder.
But what does the block itself do?
Nothing.
Ugh... What?
That's just how it is (at least for now) ¯\_(ツ)_/¯
The block isn't even evaluated at all (so it can potentially contain any code, and only the Rubocop cop will complain). In the future, it may do some useful things (like, when the app runs in the development environment, comparing the schema of the real DB with the declarations in the class), but for now, it is just a no-op declarative schema copy-paste.
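To make "nothing" concrete, here is a minimal sketch of a class-level method that accepts a table name and a block but never calls the block. The module name SchemaNoop is an assumption made for illustration; this is not the gem's actual source.

```ruby
# Hypothetical stand-in for the DSL entry point (not the gem's real code).
module SchemaNoop
  # Accepts the same call shape as the examples above,
  # but deliberately ignores both the table name and the block.
  def the_schema_is(_table_name = nil, &_block)
    nil
  end
end

class User
  extend SchemaNoop

  the_schema_is "users" do |t|
    t.string "email", default: "", null: false
    # This line would raise if the block were ever evaluated.
    raise "the block was evaluated"
  end
end

puts "User loaded, block was never run"
```

Because the block is only captured and never called, nothing inside it can fail at load time; only a static tool such as a Rubocop cop ever looks at its contents.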
Usage
- Add gem 'the_schema_is' to your Gemfile and run bundle install.
- Add this to your .rubocop.yml:
  require:
    - the_schema_is/cops
- Run rubocop and see what it now says about your models.
- Now you can add schema definitions manually, or allow rubocop --auto-correct (or -a) to do its job! NB: you can always use rubocop --auto-correct --only TheSchemaIs to auto-correct ONLY this schema thing.
To make reporting cleaner, all cops are split into:
Presence
WrongTableName
MissingColumn
UnknownColumn
WrongColumnDefinition
It is not advisable to selectively turn them off, but you may know better (for example, some may experiment with leaving just t.<type> '<name>' in models, without details about defaults and limits, and therefore turn off WrongColumnDefinition). All of it is pretty experimental!
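The three column-level cops can be pictured as a comparison between the columns found in the schema file and the columns declared in the model. The sketch below is an assumption about the general logic, not the gem's implementation; the classify helper and its hash-based inputs are hypothetical.

```ruby
# Hypothetical illustration of the column-level checks.
# Inputs are { column_name => definition_string } hashes, not real AST nodes.
def classify(schema_columns, model_columns)
  offenses = []
  # In the schema, but not declared in the model.
  (schema_columns.keys - model_columns.keys).each do |name|
    offenses << [:MissingColumn, name]
  end
  # Declared in the model, but absent from the schema.
  (model_columns.keys - schema_columns.keys).each do |name|
    offenses << [:UnknownColumn, name]
  end
  # Present in both, but with differing definitions.
  (schema_columns.keys & model_columns.keys).each do |name|
    next if schema_columns[name] == model_columns[name]
    offenses << [:WrongColumnDefinition, name]
  end
  offenses
end

schema = {
  "email"      => 't.string "email", default: "", null: false',
  "created_at" => 't.datetime "created_at", null: false'
}
model = {
  "email"    => 't.string "email"',
  "nickname" => 't.string "nickname"'
}

classify(schema, model).each { |cop, column| puts "#{cop}: #{column}" }
```

In this picture, turning off WrongColumnDefinition simply means skipping the third loop, which is why bare t.<type> '<name>' lines would then pass.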
Setting
the_schema_is cops support some configuration, which should be done on the namespace level in your .rubocop.yml, for example:
TheSchemaIs:
  Schema: db/other-schema-file.rb
Currently available settings are:
- TablePrefix to help the_schema_is deduce the table name from the class name;
- Schema to set the path to the schema (by default db/schema.rb);
- BaseClass to help the_schema_is guess what is a model class (by default ApplicationRecord and ActiveRecord::Base);
- RemoveDefinitions: list of definition keys to remove (for example, [index, foreign_key, limit]) when copying definitions into models; this might be desirable for leaner the_schema_is statements, displaying only field types/names.
So, if you have your custom-named base class, you should do:
TheSchemaIs:
  BaseClass: OurOwnBase
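The effect of the RemoveDefinitions setting listed above can be pictured as filtering a column's options before the line is written into the model. This is a sketch under assumptions: slim_definition and the REMOVE_DEFINITIONS constant are hypothetical names, and the gem itself works on Rubocop's parsed source rather than on plain hashes.

```ruby
# Hypothetical sketch of what a RemoveDefinitions-style filter does.
REMOVE_DEFINITIONS = %i[index foreign_key limit].freeze

def slim_definition(type, name, options, remove = REMOVE_DEFINITIONS)
  # Drop every option whose key is listed for removal.
  kept = options.reject { |key, _value| remove.include?(key) }
  # Rebuild the line in schema.rb style: quoted name, then remaining options.
  args = [name.inspect, *kept.map { |key, value| "#{key}: #{value.inspect}" }]
  "t.#{type} #{args.join(', ')}"
end

puts slim_definition(:string, "email", { limit: 255, null: false, index: true })
```

With the list above, limit and index are dropped and only the type, the name, and null: false remain.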
Note that Rubocop allows per-folder settings out of the box, which allows TheSchemaIs to support complicated configurations with multiple databases and engines.
For example, suppose your models are split into app/models/users/ and app/models/products/, which are stored in different databases; then you probably have different schemas and base classes for them. So, to configure it properly, you might want to put this in app/models/users/.rubocop.yml:
# Don't forget this for all other cop settings to not be ignored
inherit_from: ../../../.rubocop.yml
TheSchemaIs:
  BaseClass: Users::BaseRecord
  Schema: db/users_schema.rb
Some Q&A
- Q: It doesn't check the actual DB?
  - A: No, it does not! At the moment, our belief is that in a healthy Rails codebase schema.rb always corresponds to the DB state, so checking against it is enough. This approach makes the tooling much simpler (thanks to Rubocop's existing ecosystem of parsers/offenses/configurations).
- Q: What if I don't use Rubocop?
  - A: You may want to try, at least? Do you know that you can disable or configure most of its checks to your liking? And auto-correct any code to your preferences?.. Or automatically create a "TODO" config file (which disables all the cops currently raising offenses, and lets you review them and set them up later one by one)?.. It is much more than a "linter that complains when your code deviates from some rigid style guide".
- Q: Cool, but I still don't want to.
  - A: ...OK, then you can disable all cops except for the TheSchemaIs namespace :)
- Q: How do I annotate my factories, model specs, routes, controllers, ... (which annotate allows)?
  - A: You don't. The same way you don't copy-paste the whole definition of a class into the spec file that tests it: the definition is in one place, and tests and other code using this definition are in another. DRY!
- Q: Rubocop is unhappy with the code TheSchemaIs generated.
  - A: There are two known things in generated the_schema_is blocks that Rubocop may complain about:
    - Usage of double quotes for strings, if your config insists on single quotes: that's because we just copy code objects from schema.rb. Rubocop's auto-correct will fix it :) (Even in one run: "fixing TheSchemaIs, then fixing quotes");
    - Too long blocks (if you have tables with dozens of columns, God forbid... as we do). It can be fixed by adding this to .rubocop.yml:
      Metrics/BlockLength:
        ExcludedMethods:
          - the_schema_is