0.0
The project is in a healthy, maintained state
Data migrations kit for ActiveRecord and ActiveJob
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
 Dependencies
 Project Readme

data-migration.rb

Gem Version Test Status codecov

Data migrations kit for ActiveRecord and ActiveJob.

Sponsored by Kisko Labs.

Data migrations concept

  • A short-living script that is manually applied to database
  • Not reversible
  • Can be applied multiple times
  • Accompanied by ActiveJob for background and batch operations
  • Accompanied by ActiveRecord to control and audit migrations progress
  • Operator's responsibility to ensure data consistency, notifications, monitoring and quality of implementation

Data migrations process

  1. Avoid implementing and running data migrations within schema migrations
  2. Data migrations should be planned beforehand, reserve time in the calendar
  3. Data migrations should be always controlled by operator
  4. Wrapping queries to transactions might lead to large memory consumption, unexpected exceptions and database unresponsiveness
  5. Large data migrations should have batching implemented which will lower memory consumption and database load
  6. Critical data migrations should be covered with tests, by finding consensus developers decide if migration is critical
  7. Before running critical data migrations, make sure that you have fresh backup of the database and you are ready to rollback in case of failure

Installation

Using Bundler:

bundle add data-migration

Using RubyGems:

gem install data-migration

Gemfile

gem "data-migration"

Data migration tasks table

bin/rails g data_migration:install data_migration_tasks

Usage

Generate data migration job

bin/rails g data_migration create_users

Run data migrations

bin/rails db:migrate:data 20241207120000_create_users

Configuration

Set data migrations directory

Absolute path will be resolved by using Rails.root.

DataMigration.config.data_migrations_path = "db/data_migrations"

Turn off test script generation

DataMigration.config.generate_spec = false

Batch operations

Batch operations are supported by using enqueue method, it will automatically enqueue or perform next job depending on background option.

enqueue method calls are tracked within a single Thread, it should be used within a single job execution, also all enqueue calls rewrite each other and only last call will be used for enqueuing next job after the current job is completed.

def perform(index: 1, background: true)
  return if index > 2

  User.find_or_create_by(email: "test_#{index}@example.com")

  enqueue(index: index + 1, background:)
end

Specification checklist

  • User can generate data migration file under db/data_migrations directory with common format
  • User can generate data migration file with test script included
  • User can run specific data migration using Rails console
  • User can run specific data migration using shell command
  • User can run data migration in background
  • User can run data migration in foreground
  • User can specify operator for data migration
  • User can specify monitoring context for data migration
  • User can specify pause time for data migration
  • User can specify jobs limit for data migration
  • User receives an error when data migration is applied within schema migration

Limitations & explanations

  • ActiveRecord migrations generator is used to generate data migration files
  • Data migrations are not reversible, it is operator's responsibility to ensure that data migration has correct effect
  • Keep migrations logic stable and predictable, e.g. by checking uniqueness of created/updated records

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/amkisko/data-migration.rb

Contribution policy:

  • New features are not necessarily added to the gem
  • Pull request should have test coverage for affected parts
  • Pull request should have changelog entry
  • It might take up to 2 calendar weeks to review and merge critical fixes
  • It might take up to 6 calendar months to review and merge pull request
  • It might take up to 1 calendar year to review an issue

Publishing

Prefer using script usr/bin/release.sh, it will ensure that repository is synced and after publishing gem will create a tag.

GEM_VERSION=$(grep -Eo "VERSION\s*=\s*\".+\"" lib/data-migration.rb  | grep -Eo "[0-9.]{5,}")
rm data-migration-*.gem
gem build data-migration.gemspec
gem push data-migration-$GEM_VERSION.gem
git tag $GEM_VERSION && git push --tags && gh release create $GEM_VERSION --generate-notes

License

The gem is available as open source under the terms of the MIT License.