Project

bloomed

0.0
No commit activity in last 3 years
No release in over 3 years
Check your users' passwords using a fraction of the memory of the full pwned passwords list.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
 Dependencies

Development

Runtime

~> 1.0
 Project Readme

Bloomed

Troy Hunt's brilliant haveibeenpwned.com let's you download SHA1s of 517,238,891 real world passwords previously exposed in data breaches. This list is comprehensive but huge in size: 11GB compressed. Using a bloom filter we can reduce the size down to files measured in MBs.

You can even keep a the bloom filter in memory in your web app or api. This is great if you're afraid to send the passwords that your users enter, to an external service for lookup.

This gem will let you control the trade off between memory size and precision. False positives will occur (that's the nature of bloom filters), but you control the frequency and how many of the pwned passwords you want in your filter, starting from the most pwned at the top.

Installation

Add this line to your application's Gemfile:

gem 'bloomed'

And then execute:

$ bundle

Or install it yourself as:

$ gem install bloomed

Usage

Quick start

require 'bloomed'
pw = Bloomed::PW.new
pw.pwned? "password123"
=> true

Using lower precision / lower memory consumption

There are two parameters that can be varied: top and false_positive_probability.

require 'bloomed'
pw = Bloomed::PW.new(top: 100000, false_positive_probability: 0.01) # 136 kb memory
pw.pwned? "password123"
=> true

Using higher precision / higher memory consumption

To keep the gem size small, it only ships with dumps up to 253 kb in size.

To generate all combinations of top and false_positive_probability bloom filters for pwned passwords, run:

rake bloomed:seed

This will download the source 7zip file with pwned passwords, unpack it to the current dir, write the generated bloom filters in the lib/dump dir relative to the installation path of the gem.

Note: You'll need to brew install curl p7zip on macos and apt-get install curl p7zip on linux.

Sometimes you will want to have more precise control of the placement of the cache files. To seed all variants in the current dir, run:

rake bloomed:seed\[all\]

But be aware that it will take a long time!

Once you have the massive 22GB text files available, you can generate binary cache files using the exact precision you want.

require 'bloomed'
pw = Bloomed::PW.new(top:1E9, false_positive_probability: 0.0001)

Warning! This seeds all the passwords and will take a loooong time the first time. Even with a binary cache file in place loading it will take massive time and memory.

For deployment scenarios where you don't want the server to rake seed, you can override the directory used for caching by giving a cache_dir argument to the constructor:

require 'bloomed'
pw = Bloomed::PW.new(top:1E8, false_positive_probability: 0.0001, cache_dir: '/var/lib/bloomed'
)

Size of the in memory bloom filter

The filter can vary much in size. Use Bloomed:PW#memory_size_bytes to get the exact size.

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/skovsboll/bloomed. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the Bloomed project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.