Active Record Doctor
Active Record Doctor helps keep the database in good shape. Currently, it can detect:
- extraneous indexes - active_record_doctor:extraneous_indexes
- unindexed deleted_at columns - active_record_doctor:unindexed_deleted_at
- missing foreign key constraints - active_record_doctor:missing_foreign_keys
- models referencing undefined tables - active_record_doctor:undefined_table_references
- uniqueness validations not backed by a unique index - active_record_doctor:missing_unique_indexes
- missing non-NULL constraints - active_record_doctor:missing_non_null_constraint
- missing presence validations - active_record_doctor:missing_presence_validation
- incorrect presence validations on boolean columns - active_record_doctor:incorrect_boolean_presence_validation
- mismatches between model length validations and database validation constraints - active_record_doctor:incorrect_length_validation
- incorrect values of dependent on associations - active_record_doctor:incorrect_dependent_option
- primary keys having short integer types - active_record_doctor:short_primary_key_type
- mismatched foreign key types - active_record_doctor:mismatched_foreign_key_type
- tables without primary keys - active_record_doctor:table_without_primary_key
- tables without timestamps - active_record_doctor:table_without_timestamps
It can also:
- index unindexed foreign keys - active_record_doctor:unindexed_foreign_keys
Installation
In order to use the latest production release, please add the following to your Gemfile:
gem 'active_record_doctor', group: [:development, :test]
and run bundle install. If you'd like to use the most recent development version then use this instead:
gem 'active_record_doctor', github: 'gregnavis/active_record_doctor', group: [:development, :test]
That's it when it comes to Rails projects. If your project doesn't use Rails then you can use active_record_doctor via Rakefile.
Additional Installation Steps for non-Rails Projects
If your project uses Rake then you can add the following to Rakefile in order to use active_record_doctor:
require "active_record_doctor"
ActiveRecordDoctor::Rake::Task.new do |task|
  # Add project-specific Rake dependencies that should be run before running
  # active_record_doctor.
  task.deps = []
  # A path to your active_record_doctor configuration file.
  task.config_path = ::Rails.root.join(".active_record_doctor.rb")
  # A Proc called right before running detectors that should ensure your Active
  # Record models are preloaded and a database connection is ready.
  task.setup = -> { ::Rails.application.eager_load! }
end
IMPORTANT. active_record_doctor expects that after running deps and calling setup your Active Record models are loaded and a database connection is established.
Usage
active_record_doctor can be used via rake or rails.
You can run all available detectors via:
bundle exec rake active_record_doctor
You can run a specific detector via:
bundle exec rake active_record_doctor:extraneous_indexes
Continuous Integration
If you want to use active_record_doctor in a Continuous Integration setting then ensure the configuration file is committed and run the tool as one of your build steps. It returns a non-zero exit status if any errors were reported.
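As a rough illustration of how a build script could react to that exit status, here is a minimal Ruby sketch. The build_step_result helper and its result symbols are hypothetical names made up for this example; only the rake command itself comes from this README.

```ruby
# Hypothetical helper: runs a command and maps its exit status to a result.
# Kernel#system returns true for exit status 0, false for a non-zero status,
# and nil when the command could not be started at all.
def build_step_result(*command)
  case system(*command)
  when true then :passed
  when false then :failed
  else :not_run
  end
end

# In a real build step the call would look like this:
#   result = build_step_result("bundle", "exec", "rake", "active_record_doctor")
#   exit(1) unless result == :passed
```

Most CI systems already fail a step on a non-zero exit status, so a wrapper like this is only needed when the result has to be handled in Ruby.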
Obtaining Help
If you'd like to obtain help on a specific detector then use the help sub-task:
bundle exec rake active_record_doctor:extraneous_indexes:help
This will show the detector help text in the terminal, along with supported configuration options, their meaning, and whether they're global or local.
Debug Logging
It may be that active_record_doctor fails with an exception and it is hard to tell what went wrong. For easier debugging, set the ACTIVE_RECORD_DOCTOR_DEBUG environment variable.
If active_record_doctor fails for some reason for your application, feel free to open an issue or a PR with the fix.
ACTIVE_RECORD_DOCTOR_DEBUG=1 bundle exec rake active_record_doctor
Configuration
active_record_doctor can be configured to better suit your project's needs. For example, if it complains about a model that you want ignored then you can add that model to the configuration file.
If you want to use the default configuration then you don't have to do anything. Just run active_record_doctor in your project directory.
If you want to customize the tool you should create a file named .active_record_doctor.rb in your project root directory with content like:
ActiveRecordDoctor.configure do
  # Global settings affect all detectors.
  global :ignore_tables, [
    # Ignore internal Rails-related tables.
    "ar_internal_metadata",
    "schema_migrations",
    "active_storage_blobs",
    "active_storage_attachments",
    "action_text_rich_texts",
    # Add project-specific tables here.
    "legacy_users"
  ]
  # Detector-specific settings affect only one specific detector.
  detector :extraneous_indexes,
    ignore_tables: ["users"],
    ignore_indexes: ["accounts_on_email_organization_id"]
end
The configuration file above will make active_record_doctor ignore internal Rails tables (which are ignored by default) and also the legacy_users table. It'll also make the extraneous_indexes detector skip the users table entirely and will not report the index named accounts_on_email_organization_id as extraneous.
Configuration options for each detector are listed below. They can also be obtained via the help mechanism described in Obtaining Help.
Regexp-Based Ignores
Settings like ignore_tables, ignore_indexes, and so on accept a list of identifiers to ignore. These can be either:
- Strings - in which case an exact match is needed.
- Regexps - which are matched against object names, and matching ones are excluded from output.
For example, to ignore all tables starting with legacy_ you can write:
ActiveRecordDoctor.configure do
  global :ignore_tables, [
    # Ignore internal Rails-related tables.
    "ar_internal_metadata",
    "schema_migrations",
    "active_storage_blobs",
    "active_storage_attachments",
    "action_text_rich_texts",
    # Ignore all legacy tables.
    /^legacy_/
  ]
end
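The difference between string and regexp entries can be sketched in plain Ruby. This is only an illustration of the matching rules described above, not the gem's actual implementation; the ignored? helper is a hypothetical name.

```ruby
# Hypothetical helper illustrating the matching rules: String#=== compares for
# equality (an exact match), while Regexp#=== tests whether the pattern
# matches the name.
def ignored?(name, patterns)
  patterns.any? { |pattern| pattern === name }
end

patterns = ["schema_migrations", /^legacy_/]

ignored?("schema_migrations", patterns)     # exact string match
ignored?("schema_migrations_old", patterns) # no match: strings must match exactly
ignored?("legacy_users", patterns)          # matched by the regexp
```

Because strings must match exactly, a prefix such as legacy_ has to be written as a regexp to cover a whole family of tables.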
Indexing Unindexed Foreign Keys
Foreign keys should be indexed unless it's proven ineffective. However, Rails makes it easy to create an unindexed foreign key. Active Record Doctor can automatically generate database migrations that add the missing indexes. It's a four-step process:
- Generate a list of unindexed foreign keys by running
bundle exec rake active_record_doctor:unindexed_foreign_keys > unindexed_foreign_keys.txt
- Remove columns that should not be indexed from unindexed_foreign_keys.txt as a column can look like a foreign key (i.e. ending with _id) without being one.
- Generate the migrations
rails generate active_record_doctor:add_indexes unindexed_foreign_keys.txt
- Run the migrations
bundle exec rake db:migrate
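To see why the manual review step matters, here is a simplified sketch of a name-based heuristic like the one mentioned above (columns ending with _id). The foreign_key_candidates helper and the schema hash are hypothetical and only illustrate the idea; the gem's real detection logic may differ.

```ruby
# Hypothetical, simplified schema description: table name => column names.
# Returns "table.column" for every column that ends with _id and is not
# listed among the already indexed columns.
def foreign_key_candidates(columns_by_table, indexed_columns)
  columns_by_table.flat_map do |table, columns|
    columns
      .select { |column| column.end_with?("_id") }
      .map { |column| "#{table}.#{column}" }
      .reject { |qualified| indexed_columns.include?(qualified) }
  end
end

schema = {
  "users" => ["id", "profile_id", "external_id"],
  "comments" => ["id", "post_id"]
}

candidates = foreign_key_candidates(schema, ["comments.post_id"])
# candidates includes users.external_id, which looks like a foreign key by
# name only; such entries are what the review step is meant to remove.
```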
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_tables - tables whose foreign keys should not be checked
- ignore_columns - columns, written as table.column, that should not be checked.
Removing Extraneous Indexes
Let me illustrate with an example. Consider a users table with columns first_name and last_name. If there are two indexes:
- A two-column index on last_name, first_name.
- A single-column index on last_name.
Then the latter index can be dropped as the former can play its role. In general, a multi-column index on column_1, column_2, ..., column_n can replace indexes on:
- column_1
- column_1, column_2
- ...
- column_1, column_2, ..., column_(n - 1)
To discover such indexes automatically just follow these steps:
- List extraneous indexes by running:
bundle exec rake active_record_doctor:extraneous_indexes
- Confirm that each of the indexes can indeed be dropped.
- Create a migration to drop the indexes.
The indexes aren't dropped automatically because there are usually just a few of them and it's a good idea to double-check that you won't drop something necessary.
Also, extra indexes on primary keys are considered extraneous too and will be reported.
Note that a unique index can never be replaced by a non-unique one. For example, if there's a unique index on users.login and a non-unique index on users.login, users.domain then the tool will not suggest dropping users.login as it could violate the uniqueness assumption. However, a unique index on users.login, users.domain might be replaceable with a unique index on users.login as the uniqueness of the latter implies the uniqueness of the former (if a given login can appear only once then it can be present in only one login, domain pair).
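The prefix rule and the uniqueness caveat can be sketched in a few lines of Ruby. This is a deliberately conservative simplification, not the detector's real code: IndexInfo, strict_prefix? and extraneous_candidates are hypothetical names, and unique indexes are never proposed for removal here.

```ruby
# Hypothetical description of an index: its name, ordered columns, uniqueness.
IndexInfo = Struct.new(:name, :columns, :unique)

# True when `columns` is a strict leading prefix of `other_columns`.
def strict_prefix?(columns, other_columns)
  columns.length < other_columns.length &&
    other_columns.first(columns.length) == columns
end

# Returns the indexes that a longer index on the same table can stand in for.
def extraneous_candidates(indexes)
  indexes.select do |index|
    # Conservative simplification: never propose dropping a unique index.
    next false if index.unique

    indexes.any? { |other| strict_prefix?(index.columns, other.columns) }
  end
end

indexes = [
  IndexInfo.new("index_users_on_last_name_and_first_name", %w[last_name first_name], false),
  IndexInfo.new("index_users_on_last_name", %w[last_name], false),
  IndexInfo.new("index_users_on_login", %w[login], true),
  IndexInfo.new("index_users_on_login_and_domain", %w[login domain], false)
]

extraneous_candidates(indexes).map(&:name)
```

In this sketch only index_users_on_last_name is reported. The unique index on login is kept even though a longer non-unique index starts with the same column.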
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_tables - tables whose indexes should never be reported as extraneous.
- ignore_indexes - indexes that should never be reported as extraneous.
Detecting Unindexed deleted_at Columns
If you soft-delete some models (e.g. with paranoia) then you need to modify your indexes to include only non-deleted rows. Otherwise they will include logically non-existent rows. This will make them larger and slower to use. Most of the time they should only cover rows satisfying deleted_at IS NULL (to cover existing records) or deleted_at IS NOT NULL (to cover deleted records).
active_record_doctor can automatically detect indexes on tables with a deleted_at column. Just run:
bundle exec rake active_record_doctor:unindexed_deleted_at
This will print a list of indexes that don't have the deleted_at IS NULL clause. Currently, active_record_doctor cannot automatically generate appropriate migrations. You need to do that manually.
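The check described above can be approximated as follows. This is an illustrative sketch only: PartialIndex and missing_deleted_at_filter are hypothetical names, and the where attribute stands for the index's partial-index condition as a string (nil for an ordinary index).

```ruby
# Hypothetical description of an index and its partial-index condition.
PartialIndex = Struct.new(:name, :where)

# Returns the indexes whose condition mentions neither
# "<column> IS NULL" nor "<column> IS NOT NULL".
def missing_deleted_at_filter(indexes, column: "deleted_at")
  pattern = /\b#{Regexp.escape(column)}\s+IS\s+(NOT\s+)?NULL\b/i
  indexes.reject { |index| index.where.to_s.match?(pattern) }
end

indexes = [
  PartialIndex.new("index_users_on_email", nil),
  PartialIndex.new("index_users_on_login", "deleted_at IS NULL"),
  PartialIndex.new("index_users_on_deleted_login", "deleted_at IS NOT NULL")
]

missing_deleted_at_filter(indexes).map(&:name)

# When writing the migration by hand, add_index accepts a where: option on
# databases that support partial indexes, for example:
#   add_index :users, :email, where: "deleted_at IS NULL"
```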
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_tables - tables whose indexes should not be checked.
- ignore_columns - specific columns, written as table.column, that should not be reported as unindexed.
- ignore_indexes - specific indexes that should not be reported as excluding a timestamp column.
- column_names - deletion timestamp column names.
Detecting Missing Foreign Key Constraints
If users.profile_id references a row in profiles then this can be expressed at the database level with a foreign key constraint. It forces users.profile_id to point to an existing row in profiles. The problem is that in many legacy Rails apps, the constraint isn't enforced at the database level.
active_record_doctor can automatically detect foreign keys that could benefit from a foreign key constraint (a future version will generate a migration that adds the constraint; for now, it's your job). You can obtain the list of foreign keys with the following command:
bundle exec rake active_record_doctor:missing_foreign_keys
In order to add a foreign key constraint to users.profile_id use a migration like:
class AddForeignKeyConstraintToUsersProfileId < ActiveRecord::Migration
  def change
    add_foreign_key :users, :profiles
  end
end
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_tables - tables whose columns should not be checked.
- ignore_columns - columns, written as table.column, that should not be checked.
Detecting Models Referencing Undefined Tables
Active Record guesses the table name based on the class name. There are a few cases where the name can be wrong (e.g. you forgot to commit a migration or changed the table name). Active Record Doctor can help you identify these cases before they hit production.
IMPORTANT. Models backed by views are supported only in:
- Rails 5+ and any database or
- Rails 4.2 with PostgreSQL.
The only thing you need to do is run:
bundle exec rake active_record_doctor:undefined_table_references
If a model references an undefined table then you'll see a message like this:
Contract references a non-existent table or view named contract_records
On top of that rake will exit with a status code of 1. This allows you to use this check as part of your Continuous Integration pipeline.
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_models - models whose underlying tables should not be checked for existence.
Detecting Uniqueness Validations not Backed by an Index
Model-level uniqueness validations, has_one and has_and_belongs_to_many associations should be backed by a unique database index in order to be robust. Otherwise you risk inserting duplicate values under heavy load.
In order to detect such validations run:
bundle exec rake active_record_doctor:missing_unique_indexes
If any unique indexes are missing then the command will print:
add a unique index on users(email) - validating uniqueness in the model without an index can lead to duplicates
This means that you should create a unique index on users.email.
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_models - models whose uniqueness validators should not be checked.
- ignore_columns - specific validators, written as Model(column1, ...), that should not be checked.
- ignore_join_tables - join tables that should not be checked for existence of unique indexes.
Detecting Missing Non-NULL Constraints
If there's an unconditional presence validation on a column then it should be marked as non-NULL-able at the database level or should have an IS NOT NULL constraint.
In order to detect columns whose presence is required but that are marked null: true in the database run the following command:
bundle exec rake active_record_doctor:missing_non_null_constraint
The output of the command is similar to:
add `NOT NULL` to users.name - models validates its presence but it's not non-NULL in the database
You can mark the columns mentioned in the output as null: false by creating a migration and calling change_column_null.
This detector skips models whose corresponding database tables don't exist.
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_tables - tables whose columns should not be checked.
- ignore_columns - columns, written as table.column, that should not be checked.
Detecting Missing Presence Validations
If a column is marked as null: false then it's likely it should have the corresponding presence validator.
In order to detect models lacking these validations run:
bundle exec rake active_record_doctor:missing_presence_validation
The output of the command looks like this:
add a `presence` validator to User.email - it's NOT NULL but lacks a validator
add a `presence` validator to User.name - it's NOT NULL but lacks a validator
This means User should have a presence validator on email and name.
This detector skips models whose corresponding database tables don't exist.
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_models - models whose underlying tables' columns should not be checked.
- ignore_attributes - specific attributes, written as Model.attribute, that should not be checked.
- ignore_columns_with_default - set to true to ignore columns with default values.
Detecting Incorrect Presence Validations on Boolean Columns
A boolean column's presence should be validated using inclusion or exclusion validators instead of the usual presence validator.
In order to detect boolean columns whose presence is validated incorrectly run:
bundle exec rake active_record_doctor:incorrect_boolean_presence_validation
The output of the command looks like this:
replace the `presence` validator on User.active with `inclusion` - `presence` can't be used on booleans
This means active is validated with presence: true instead of inclusion: { in: [true, false] } or exclusion: { in: [nil] }.
This detector skips models whose corresponding database tables don't exist.
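The reason a presence validator misbehaves on booleans is that false counts as blank. The sketch below mimics that behaviour in plain Ruby without loading Rails; blank_like?, presence_valid? and inclusion_valid? are hypothetical stand-ins, not Active Model's implementation.

```ruby
# Simplified stand-in for a blank check: nil, false, and empty or
# whitespace-only strings count as blank.
def blank_like?(value)
  return true if value.nil? || value == false
  return value.strip.empty? if value.is_a?(String)

  value.respond_to?(:empty?) ? value.empty? : false
end

# A presence-style check rejects every blank value, including false.
def presence_valid?(value)
  !blank_like?(value)
end

# An inclusion-style check accepts both booleans and rejects nil.
def inclusion_valid?(value)
  [true, false].include?(value)
end

presence_valid?(false)  # false is rejected although it is a legitimate value
inclusion_valid?(false) # false is accepted
inclusion_valid?(nil)   # nil is still rejected
```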
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_models - models whose validators should not be checked.
- ignore_columns - attributes, written as Model.attribute, whose validators should not be checked.
Detecting Incorrect Length Validations
String length can be enforced by both the database and the application. If there's a database limit then it's a good idea to add a model validation to ensure user-friendly error messages. Similarly, if there's a model validator without the corresponding database constraint then it's a good idea to add one to avoid saving invalid models.
In order to detect columns whose length isn't validated properly run:
bundle exec rake active_record_doctor:incorrect_length_validation
The output of the command looks like this:
set the maximum length in the validator of User.email (currently 32) and the database limit on users.email (currently 64) to the same value
add a length validator on User.address to enforce a maximum length of 64 defined on users.address
The first message means the validator on User.email is checking for a different maximum than the database limit on users.email. The second message means there's a database limit on users.address without the corresponding model validation.
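The cases above boil down to comparing two optional numbers. Here is a small sketch under that assumption; length_check and the symbols it returns are hypothetical names and do not correspond to the detector's internals.

```ruby
# Compares a model validator's maximum length with the database column limit.
# Either argument may be nil when the corresponding limit is not defined.
def length_check(validator_maximum, database_limit)
  if validator_maximum == database_limit
    :consistent
  elsif validator_maximum.nil?
    :missing_validator
  elsif database_limit.nil?
    :missing_database_limit
  else
    :mismatch
  end
end

length_check(32, 64)  # like User.email above: both limits exist but differ
length_check(nil, 64) # like User.address above: database limit only
length_check(64, 64)  # limits agree
```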
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_models - models whose validators should not be checked.
- ignore_attributes - attributes, written as Model.attribute, whose validators should not be checked.
Detecting Incorrect dependent Option on Associations
Cascading model deletions can be sped up with dependent: :delete_all (to delete all dependent models with one SQL query) but only if the deleted models have no callbacks as they're skipped.
This can lead to two types of errors:
- Using delete_all when dependent models define callbacks - they will NOT be invoked.
- Using destroy when dependent models define no callbacks - dependent models will be loaded one by one for no reason.
In order to detect associations affected by the two aforementioned problems run the following command:
bundle exec rake active_record_doctor:incorrect_dependent_option
The output of the command looks like this:
use `dependent: :delete_all` or similar on Company.users - associated models have no validations and can be deleted in bulk
use `dependent: :destroy` or similar on Post.comments - the associated model has callbacks that are currently skipped
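The decision behind these two messages can be summarised as a small rule. The sketch below is a simplification for illustration; suggested_dependent is a hypothetical helper, and the real detector considers more dependent values than the two shown here.

```ruby
# Returns the dependent option to switch to, or nil when the current one is
# fine. `callbacks` tells whether the associated model defines callbacks.
def suggested_dependent(current, callbacks:)
  if current == :delete_all && callbacks
    :destroy      # callbacks would be skipped by a bulk delete
  elsif current == :destroy && !callbacks
    :delete_all   # nothing to run per record, so one SQL query suffices
  end
end

suggested_dependent(:destroy, callbacks: false)   # like Company.users above
suggested_dependent(:delete_all, callbacks: true) # like Post.comments above
suggested_dependent(:destroy, callbacks: true)    # nil: nothing to change
```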
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_models - models whose associations should not be checked.
- ignore_associations - associations, written as Model.association, that should not be checked.
Detecting Primary Keys Having Short Integer Types
Active Record 5.1 changed the default primary and foreign key type from INTEGER to BIGINT. The reason is to reduce the risk of running out of IDs on inserts.
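To put the difference in perspective, here is a short calculation. It assumes signed 4-byte INTEGER and 8-byte BIGINT columns, as in PostgreSQL and MySQL; the insert rate is a made-up figure and years_until_exhausted is a hypothetical helper.

```ruby
# Largest values of signed 4-byte and 8-byte integers.
INTEGER_MAX = 2**31 - 1 # 2_147_483_647
BIGINT_MAX = 2**63 - 1  # 9_223_372_036_854_775_807

SECONDS_PER_YEAR = 60 * 60 * 24 * 365

# Years until IDs run out at a constant insert rate, ignoring gaps in the
# sequence caused by rollbacks or deletions.
def years_until_exhausted(max_id, inserts_per_second)
  max_id / (inserts_per_second * SECONDS_PER_YEAR).to_f
end

# At an assumed 100 inserts per second an INTEGER key lasts under a year,
# while a BIGINT key lasts for billions of years.
years_until_exhausted(INTEGER_MAX, 100)
years_until_exhausted(BIGINT_MAX, 100)
```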
In order to detect primary keys using shorter integer types, for example created before migrating to 5.1, you can run the following command:
bundle exec rake active_record_doctor:short_primary_key_type
The output of the command looks like this:
change the type of companies.id to bigint
The above means companies.id should be migrated to a wider integer type. An example migration to accomplish this looks like this:
class ChangeCompaniesPrimaryKeyType < ActiveRecord::Migration[5.1]
  def change
    change_column :companies, :id, :bigint
  end
end
IMPORTANT. Running the above migration on a large table can cause downtime as all rows need to be rewritten.
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_tables - tables whose primary keys should not be checked.
Detecting Mismatched Foreign Key Types
Foreign keys should be of the same type as the referenced primary key. Otherwise, there's a risk of bugs caused by IDs representable by one type but not the other.
Running the command below will list all foreign keys whose type is different from the referenced primary key:
bundle exec rake active_record_doctor:mismatched_foreign_key_type
The output of the command looks like this:
companies.user_id references a column of a different type - foreign keys should be of the same type as the referenced column
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_tables - tables whose foreign keys should not be checked.
- ignore_columns - foreign keys, written as table.column, that should not be checked.
Detecting Tables Without Primary Keys
Tables should have primary keys. Otherwise, it becomes problematic to easily find a specific record, and logical replication in PostgreSQL becomes troublesome because every row in the table then needs to be unique.
Running the command below will list all tables without primary keys:
bundle exec rake active_record_doctor:table_without_primary_key
The output of the command looks like this:
add a primary key to companies
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_tables - tables whose primary key existence should not be checked
Detecting Tables Without Timestamps
Tables should have timestamp columns (created_at/updated_at). Otherwise, it becomes problematic to easily find when a record was created or updated, or to tell whether the table is active or can be removed, and automatic Rails cache expiration after record updates is not possible.
Running the command below will list all tables without default timestamp columns:
bundle exec rake active_record_doctor:table_without_timestamps
The output of the command looks like this:
add a created_at column to companies
Supported configuration options:
- enabled - set to false to disable the detector altogether
- ignore_tables - tables whose timestamp columns existence should not be checked
Ruby and Rails Compatibility Policy
The goal of the policy is to ensure proper functioning in reasonable combinations of Ruby and Rails versions. Specifically:
- If a Rails version is officially supported by the Rails Core Team then it's supported by active_record_doctor.
- If a Ruby version is compatible with a supported Rails version then it's also supported by active_record_doctor.
- Only the most recent teeny Ruby versions and patch Rails versions are supported.
Author
This gem is developed and maintained by Greg Navis.