SensitiveDataFilter
A Rack Middleware filter for sensitive data
Installation
Add this line to your application's Gemfile:
gem 'sensitive_data_filter'
And then execute:
$ bundle
Or install it yourself as:
$ gem install sensitive_data_filter
Usage
Enable the middleware
Insert the middleware in the stack before any parameter parsing is performed.
E.g. for Rails, add the following in application.rb
# --- Sensitive Data Filtering ---
config.middleware.insert_before 'ActionDispatch::ParamsParser', SensitiveDataFilter::Middleware::Filter
To ensure that no sensitive data is accessed at any level of the stack, insert the middleware at the top of the stack.
E.g.
# --- Sensitive Data Filtering ---
config.middleware.insert_before 0, SensitiveDataFilter::Middleware::Filter
Important note for Rails
Rails logs the URI of the request in Rails::Rack::Logger. At this point of the stack, Rails generally has not yet set the session in the env.
If you insert the sensitive data filtering middleware before this middleware you will prevent sensitive data from appearing in the logs,
but you will not have access to the session via the occurrence or the env in the occurrence handling block.
Configuration
SensitiveDataFilter.config do |config|
  config.enable_types :credit_card # Already defaults to :credit_card if not specified
  config.on_occurrence do |occurrence|
    # Report occurrence
  end
  config.whitelist pattern1, pattern2 # Allows specifying patterns to whitelist matches
  config.whitelist_key key_pattern1, key_pattern2 # Allows specifying patterns to whitelist hash values based on their keys
  config.register_parser('yaml', -> params { YAML.load params }, -> params { YAML.dump params })
end
An occurrence object has the following properties:
- origin_ip: the IP address that originated the request
- request_method: the HTTP method for the request (GET, POST, etc.)
- url: the URL of the request
- content_type: the Content-Type of the request
- original_query_params: the query parameters sent with the request
- original_body_params: the body parameters sent with the request
- filtered_query_params: the query parameters sent with the request, with sensitive data filtered
- filtered_body_params: the body parameters sent with the request, with sensitive data filtered
- session: the session properties for the request
- matches: the matched sensitive data
- matches_count: the number of matches per data type, e.g. { 'CreditCard' => 1 }
- original_env: the original unfiltered Rack env
- changeset: the modified rack env variables
It also exposes to_h and to_s methods for hash and string representation respectively.
Please note that these representations omit sensitive data, i.e. original_query_params, original_body_params and matches are not included.
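As an illustration of how the to_h representation might be used, here is a minimal sketch of an occurrence handler that logs it as JSON. OccurrenceStandIn and build_occurrence_handler are hypothetical names introduced for this example only; the real occurrence object is supplied by the gem and has more properties than the stand-in.

```ruby
require 'json'
require 'logger'

# Hypothetical stand-in for the gem's occurrence object (illustration only).
# Struct provides a to_h method, which is all the handler below relies on.
OccurrenceStandIn = Struct.new(:url, :request_method, :matches_count)

# Returns a callable that logs the sanitized representation of an occurrence.
# Only to_h is used, so the unfiltered parameters and matches are never logged.
def build_occurrence_handler(logger)
  lambda do |occurrence|
    logger.warn("Sensitive data detected: #{JSON.generate(occurrence.to_h)}")
  end
end

# Possible wiring inside the configuration block (assumes a Rails logger):
#   handler = build_occurrence_handler(Rails.logger)
#   config.on_occurrence { |occurrence| handler.call(occurrence) }
```

Keeping the handler as a separate callable makes it possible to exercise it with a plain Logger and a stand-in object, without booting the application.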
Important Notes
Body parameters will not be parsed if a parser for the request's content type is not defined.
You might want to filter sensitive parameters (e.g. passwords). In Rails you can do something like:
filters = Rails.application.config.filter_parameters
filter = ActionDispatch::Http::ParameterFilter.new filters
filtered_query_params = filter.filter @occurrence.filtered_query_params
filtered_body_params = if @occurrence.filtered_body_params.is_a? Hash
  filter.filter @occurrence.filtered_body_params
else
  @occurrence.filtered_body_params
end
Whitelisting
A list of whitelisting patterns can be passed to config.whitelist.
Any sensitive data match which also matches any of these patterns will be ignored.
A list of whitelisting patterns can be passed to config.whitelist_key.
When scanning and matching hashes, any value whose key matches any of these patterns will be ignored.
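The two whitelisting rules can be pictured with the following standalone sketch. It does not use the gem's internals: CARD_LIKE is a deliberately crude placeholder detector, the whitelist constants and both helper methods are hypothetical, and the patterns are assumed to be regular expressions. Only flat (non-nested) hashes are handled.

```ruby
# Hypothetical stand-ins for illustration; not the gem's implementation.
CARD_LIKE       = /\b\d{16}\b/                 # crude placeholder detector
VALUE_WHITELIST = [/\A4111111111111111\z/]     # matches to ignore
KEY_WHITELIST   = [/\Aorder_reference\z/]      # keys whose values are skipped

# True if the text matches any of the given patterns.
def whitelisted?(text, patterns)
  patterns.any? { |pattern| pattern.match?(text.to_s) }
end

# Collects detector matches from a flat hash, applying both whitelists.
def sensitive_matches(params)
  params.flat_map do |key, value|
    # Key whitelist: the value is not scanned at all.
    next [] if whitelisted?(key, KEY_WHITELIST)

    # Value whitelist: drop individual matches that are whitelisted.
    value.to_s.scan(CARD_LIKE).reject { |match| whitelisted?(match, VALUE_WHITELIST) }
  end
end

params = {
  'card'            => '4242424242424242',
  'test_card'       => '4111111111111111',
  'order_reference' => '1234567890123456'
}
remaining = sensitive_matches(params)
```

In this sketch only the value under 'card' is left in remaining: the 'test_card' value is dropped by the value whitelist and the 'order_reference' value is never scanned because of its key.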
Parameter Parsing
Parsers for parameters encoded for a specific content type can be defined.
The arguments for config.register_parser are:
- a pattern to match the content type
- a parser for the parameters
- an unparser to convert parameters back to the encoded format
The parser and unparser must be objects that respond to call and accept the parameters as an argument (e.g. procs or lambdas).
The parser should handle parsing exceptions gracefully by returning its argument unchanged.
This ensures that sensitive data scanning and masking are still applied to the raw parameters.
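A parser and unparser pair that follows this guidance could look like the sketch below. JSON is used only because it ships with Ruby's standard library; whether the gem already handles JSON itself is not assumed here. The constant names JSON_PARSER and JSON_UNPARSER are hypothetical.

```ruby
require 'json'

# Parser: decode the body. If decoding fails, return the raw input unchanged
# so that it can still be scanned and masked as a plain string.
JSON_PARSER = lambda do |params|
  begin
    JSON.parse(params)
  rescue JSON::ParserError
    params
  end
end

# Unparser: re-encode parsed structures. A String here means the parser fell
# back to the raw input, so it is passed through as-is.
JSON_UNPARSER = lambda do |params|
  params.is_a?(String) ? params : JSON.generate(params)
end

# Possible registration inside the configuration block, using the argument
# order described above (content type pattern, parser, unparser):
#   config.register_parser('json', JSON_PARSER, JSON_UNPARSER)
```

One limitation of this sketch: a body that is a bare top-level JSON string would be passed through by the unparser without being re-quoted, so a production version may want to track whether parsing succeeded.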
Development
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Release
To publish a new version of this gem the following steps must be taken.
- Update the version in the following files: CHANGELOG.md and lib/sensitive_data_filter/version.rb
- Create a tag using the format v0.1.0
- Follow build progress in GitHub Actions
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/sealink/sensitive_data_filter. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
License
The gem is available as open source under the terms of the MIT License.