ForgetThat
ForgetThat is a tool for taking care of critical data in your database. It replaces critical pieces of data with anonymized values, according to a pre-set per-application policy.
Important notice
When misconfigured or misused, this gem can wipe important data from the database. Be responsible and test before running it on production data.
Prerequisites
- Ruby ~> 2.6.0
- ActiveRecord ~> 5
Installation
Add this line to your application's Gemfile:
gem 'forget_that'
And then execute:
$ bundle
Configuration
Before the gem can be used, a config file must be created at config/anonymization_config.yml:
config: # config part is only valid for the `call` method which will write in db
  retention_time: # defines the newest record to be anonymized when `call` is used
    value: 90
    unit: 'days'
schema:
  table1:
    name: 'Peter'
  table2:
    phone: '%{random_phone}'
Note that the default placeholders are random_date, hex_string, random_phone, fake_personal_id_number, and random_amount. You can add your own placeholders by supplying them to the initializer (see Custom placeholders).
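To make the placeholder syntax concrete, the following sketch parses a config shaped like the one above with Ruby's standard YAML library and lists which placeholders each column references. The `referenced_placeholders` helper and its regular expression are illustrative assumptions, not part of the gem's API.

```ruby
require 'yaml'

# Example config mirroring the structure described above.
CONFIG_YAML = <<~CONFIG
  config:
    retention_time:
      value: 90
      unit: 'days'
  schema:
    table1:
      name: 'Peter'
    table2:
      phone: '%{random_phone}'
CONFIG

# Hypothetical helper: returns the placeholder names used in a schema value,
# so '%{random_phone}' yields ['random_phone'] and a literal yields [].
def referenced_placeholders(value)
  value.to_s.scan(/%\{(\w+)\}/).flatten
end

schema = YAML.safe_load(CONFIG_YAML).fetch('schema')
schema.each do |table, columns|
  columns.each do |column, value|
    puts "#{table}.#{column}: #{referenced_placeholders(value).inspect}"
  end
end
```

A check like this can be useful for spotting a misspelled placeholder before the service runs against real data.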
Database migration
After you have created the config, you can generate a migration that adds anonymization metadata to the corresponding tables:
$ rails g forget_that:install
Do not forget to run the migration:
$ rails db:migrate
Usage
To run the service, create an instance and use the call method:
ForgetThat::Service.new.call
This finds records older than retention_time in your configured tables and replaces the configured fields with the configured values.
If some of the placeholders are not supplied, or some tables do not contain the anonymization flag, an error is raised.
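As an illustration of what "older than retention_time" means, the sketch below turns a retention_time setting into a cutoff timestamp. The `retention_cutoff` helper and the `SECONDS_PER_UNIT` table are assumptions made for this example; the gem may compute the cutoff differently and may support other units.

```ruby
# Seconds per unit; the set of supported units is an assumption of this sketch.
SECONDS_PER_UNIT = {
  'seconds' => 1,
  'minutes' => 60,
  'hours'   => 3_600,
  'days'    => 86_400
}.freeze

# Hypothetical helper: returns the newest timestamp a record may have and
# still be anonymized, given a hash such as { 'value' => 90, 'unit' => 'days' }.
def retention_cutoff(retention_time, now: Time.now)
  seconds = retention_time.fetch('value') * SECONDS_PER_UNIT.fetch(retention_time.fetch('unit'))
  now - seconds
end

now = Time.utc(2024, 1, 31)
puts retention_cutoff({ 'value' => 90, 'unit' => 'days' }, now: now)
```

With a value of 0 the cutoff equals the current time, so every existing record falls within scope.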
Sidekiq
The gem was originally built for data anonymization in accordance with data protection regulations in the EU. Reducing the amount of sensitive information once transactions are complete is the safest bet when it comes to data security.
This can be achieved by setting up a Sidekiq worker:
class AnonymizeCustomerData
  include Sidekiq::Worker
  sidekiq_options retry: 10

  def perform
    Rails.logger.info('[AnonymizeCustomerData.perform] start')
    ForgetThat::Service.new.call
    Rails.logger.info('[AnonymizeCustomerData.perform] done')
  end
end
You can then use a tool like sidekiq-cron to schedule it.
Rake-task for using production data locally
Another use might be when a developer dumps a production database in order to play with it locally.
To be on the safe side and avoid compromising sensitive data, the gem can be configured as follows:
config:
  retention_time:
    value: 0
    unit: 'seconds'
schema:
  # your anonymization schema
The gem can then be invoked from a rake task. It is your responsibility to ensure that it never runs on production.
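One possible shape for such a task is sketched below. The task name `db:anonymize_local` and the `ensure_not_production!` guard are hypothetical names chosen for this example, and the environment check via `RAILS_ENV` is an assumption about your setup; only `ForgetThat::Service.new.call` comes from the gem.

```ruby
require 'rake' # not needed inside a Rakefile or lib/tasks/*.rake; keeps the sketch standalone

# Hypothetical guard: aborts when the given environment name is production.
def ensure_not_production!(env)
  raise "Refusing to anonymize data in #{env}" if env.to_s == 'production'
end

namespace :db do
  desc 'Anonymize a local copy of production data'
  # :environment is the task Rails provides to load the application.
  task anonymize_local: :environment do
    ensure_not_production!(ENV.fetch('RAILS_ENV', 'development'))
    ForgetThat::Service.new.call
  end
end
```

Inside a Rails application the guard could check `Rails.env.production?` instead of reading the environment variable directly.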
Non-destructive use
The anonymizer can also be applied to a supplied ActiveRecord::Relation collection without writing to any database.
For example:
anonymizer = ForgetThat::Service.new
collection = Address.where('created_at < ?', 40.days.ago) # addresses created more than 40 days ago
anonymizer.sanitize_collection(collection)
This returns an array of hashes corresponding to the records in collection. All fields configured for anonymization are anonymized, ids are stripped, and the remaining fields are returned as is. This method ignores retention_time.
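The shape of that result can be illustrated on plain hashes. The sketch below is a stand-in for the described behaviour, not the gem's implementation: `sanitize_rows` is a hypothetical helper, and the rows and replacement lambda are made-up sample data.

```ruby
# Hypothetical stand-in for the described behaviour, operating on plain hashes
# instead of ActiveRecord records.
def sanitize_rows(rows, replacements)
  rows.map do |row|
    row.reject { |column, _| column == 'id' } # ids are stripped
       .to_h do |column, value|
         # Configured columns get a generated value; the rest pass through as is.
         [column, replacements.key?(column) ? replacements[column].call : value]
       end
  end
end

rows = [
  { 'id' => 1, 'street' => 'Main St 1', 'phone' => '555-0100', 'city' => 'Berlin' }
]
replacements = { 'phone' => -> { '000-0000' } }

p sanitize_rows(rows, replacements)
```

Because the helper builds new hashes, the input rows are left unchanged, which mirrors the non-destructive intent of this mode.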
Custom placeholders
The default placeholders are random_date, hex_string, random_phone, fake_personal_id_number, and random_amount. In some cases these might not be enough, or their behaviour might not be desirable. In that case you can supply an anonymizers hash.
ForgetThat::Service.new(
  anonymizers: {
    foobar: -> { 'Foo' + 'Bar' }
  }
).call
Each member of this hash must be a zero-arity lambda that returns a string value.
If the key in the hash matches one of the pre-defined placeholders, the pre-defined placeholder will be overridden by the new one.
Once a lambda has been supplied to the anonymizer, its key can be used as a placeholder in the config.
# ...
schema:
  users:
    name: 'Peter %{foobar}' # results in the "name" column of table "users" filled with "Peter FooBar"
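The `%{name}` syntax matches the named references understood by Ruby's `Kernel#format`, so the substitution and override rules can be sketched in a few lines. This is an assumption about how such expansion could work, not the gem's actual code: `expand_template` is a hypothetical helper and `DEFAULT_ANONYMIZERS` holds a single stand-in default.

```ruby
require 'securerandom'

# Stand-in for the pre-defined placeholders; the real defaults live in the gem.
DEFAULT_ANONYMIZERS = {
  hex_string: -> { SecureRandom.hex(8) }
}.freeze

# Hypothetical expansion step: custom lambdas override defaults with the same
# key, every zero-arity lambda is called once, and the results are substituted
# into the template with Kernel#format.
def expand_template(template, custom = {})
  values = DEFAULT_ANONYMIZERS.merge(custom).transform_values(&:call)
  format(template, values)
end

custom = { foobar: -> { 'Foo' + 'Bar' } }
puts expand_template('Peter %{foobar}', custom) # prints "Peter FooBar"
```

In this sketch a placeholder with no matching lambda makes `format` raise a `KeyError`, and a literal percent sign in a template would have to be written as `%%`.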
Development
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/vehiculum-berlin/forget_that.