Heroku Vector
Simple, linear auto scaling for Heroku dynos.
Heroku Vector runs as a multi-threaded process that samples production metrics and linearly scales Heroku dynos up or down as they change.
Web Dynos can be scaled using the amount of traffic (RPM) from NewRelic. Sidekiq dynos can be scaled based on how many worker threads are busy.
Installation
Add to a Rails app on Heroku, running in a Heroku dyno:
Add to Gemfile:
gem "heroku-vector"
Add to Procfile:
vector: heroku_vector -c config/heroku_vector.rb
Set up your auto-scaling config in config/heroku_vector.rb
Install on a stand-alone system
Install this gem:
$ gem install heroku-vector
And then run the scaler process:
$ heroku_vector --help
Configuration
This auto scaler will sample metrics and use the metric values to scale different types of Heroku Dynos up or down.
# Example config.rb file
HerokuVector.configure do |config|
  # Scale 1 web dyno for each 300rpm of traffic
  config.add_dyno_scaler('web', {
    source: HerokuVector::Source::NewRelic,
    period: 60,
    min_dynos: 1,
    max_dynos: 10,
    min_value: 100,
    max_value: 300
  })

  # Manually specify app config in config.rb like so
  # Alternately, use environment variables
  config.newrelic_api_key = '222222222222222'
end
Application config can be wired up in the config.rb file, or passed in via the system Environment or an Environment file:
HEROKU_APP_NAME=your-app-name
HEROKU_API_KEY=1111111111111111
REDIS_URL=redis://redis.yourcompany.com/
SIDEKIQ_REDIS_NAMESPACE=sidekiq
NEWRELIC_API_KEY=222222222222222
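As a rough illustration of the KEY=VALUE format shown above, the following sketch parses environment-file text into a hash. The `parse_env` helper is hypothetical and is not part of Heroku Vector; the gem's own loader may behave differently (for example around quoting or comments).

```ruby
# Hypothetical helper (not Heroku Vector's API): parse KEY=VALUE lines
# from the contents of an environment file into a Hash.
def parse_env(text)
  text.each_line.with_object({}) do |line, env|
    line = line.strip
    next if line.empty? || line.start_with?('#') # skip blanks and comments
    key, value = line.split('=', 2)
    next if value.nil?                           # skip lines without '='
    env[key.strip] = value.strip
  end
end

settings = parse_env(<<~ENVFILE)
  HEROKU_APP_NAME=your-app-name
  NEWRELIC_API_KEY=222222222222222
  REDIS_URL=redis://redis.yourcompany.com/
ENVFILE

# Assumption: a value already set in the process environment wins over the file.
app_name = ENV.fetch('HEROKU_APP_NAME', settings['HEROKU_APP_NAME'])
puts app_name
```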
Defining Your Dyno Scalers
The core configuration within Heroku Vector is defining a set of Dyno Scalers that each monitor a type of Heroku dyno (like web or worker dynos). The DynoScaler monitors the number of dynos of this type and the amount of traffic they are getting, by sampling data from the traffic source. Heroku Vector provides a simple DSL for adding a Dyno Scaler:
HerokuVector.configure do |config|
  # App Settings go here

  # Define a dyno scaler with:
  config.add_dyno_scaler(DYNO_NAME, options={})
  #...
end
Each dyno scaler requires the name of the type of dyno (e.g. web) and some options.
Options
- min_value - REQUIRED - Minimum amount of traffic per dyno (scale down when traffic is below)
- max_value - REQUIRED - Maximum amount of traffic per dyno (scale up when traffic is above)
- source - REQUIRED - Traffic Data Source to sample, either a Ruby class or string class name (e.g. HerokuVector::Source::NewRelic)
- period - OPTIONAL - How often to sample the traffic source, in seconds, defaults to every 1 minute (60 sec)
- min_dynos - OPTIONAL - Minimum number of dynos to have running at all times, defaults to 2.
- max_dynos - OPTIONAL - Maximum number of dynos to run at any time, defaults to 10.
- scale_up_by - OPTIONAL - Number of dynos to scale up by, defaults to 1 at a time.
- scale_down_by - OPTIONAL - Number of dynos to scale down by, defaults to 1 at a time.
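To make the interaction of these options concrete, here is a minimal sketch of a linear scaling decision. It assumes the thresholds are the per-dyno values multiplied by the current dyno count, which is what the scale-down lines in the Logging and Debugging section suggest (4 worker dynos with a min_value of 0.5 give a threshold of 2.0). The scale-up rule is inferred by symmetry, and the `desired_dynos` function is hypothetical, not Heroku Vector's internal API.

```ruby
# Hypothetical sketch of a linear scaling decision; names are illustrative
# and not taken from Heroku Vector's source.
def desired_dynos(current, value, min_value:, max_value:,
                  min_dynos: 2, max_dynos: 10,
                  scale_up_by: 1, scale_down_by: 1)
  target =
    if value > max_value * current     # more traffic than the current dynos should carry
      current + scale_up_by
    elsif value < min_value * current  # less traffic than the current dynos justify
      current - scale_down_by
    else
      current                          # within range, leave the dyno count alone
    end
  target.clamp(min_dynos, max_dynos)   # never go below min_dynos or above max_dynos
end

# Web dynos, 100-300 RPM per dyno:
puts desired_dynos(2, 700, min_value: 100, max_value: 300)  # 700 > 600, scale up to 3
puts desired_dynos(3, 250, min_value: 100, max_value: 300)  # 250 < 300, scale down to 2
# Worker dynos, mirroring the first scale-down in the sample log:
puts desired_dynos(4, 0.4, min_value: 0.5, max_value: 3, min_dynos: 1)  # 0.4 < 2.0, down to 3
```

Because min_dynos and max_dynos clamp the result, a quiet period can never scale a dyno type below its floor, however low the traffic sample is.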
Application Settings
Settings can be configured via the HerokuVector.configure do |config| ... end API, as environment variables, or in an environment file for sensitive values like API keys. Here are the top-level application settings that can be changed:
- config.logger - Ruby Logger instance, defaults to a built-in logger wrapping STDOUT
- config.engine - Scaling Engine to use, defaults to Heroku API scaler
- config.min_scale_time_secs - Minimum scaling rate-limit (only scale up/down this often), defaults to 5 mins (300 sec). Rate limit is per-Dyno Scaler, so web and worker dynos have their own rate limits.
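The per-scaler rate limit can be pictured with the sketch below. The `ScaleRateLimiter` class and its methods are hypothetical names chosen for illustration; the gem's actual bookkeeping may differ.

```ruby
# Hypothetical per-scaler rate limiter; not Heroku Vector's own class.
class ScaleRateLimiter
  def initialize(min_scale_time_secs = 300)
    @min_scale_time_secs = min_scale_time_secs
    @last_scaled_at = {} # dyno type => Time of the last scaling event
  end

  # True when this dyno type has never scaled, or scaled long enough ago.
  def allowed?(dyno_type, now = Time.now)
    last = @last_scaled_at[dyno_type]
    last.nil? || (now - last) >= @min_scale_time_secs
  end

  # Remember when this dyno type was last scaled.
  def record(dyno_type, now = Time.now)
    @last_scaled_at[dyno_type] = now
  end
end

limiter = ScaleRateLimiter.new(300)
t0 = Time.at(0)
limiter.record('web', t0)
puts limiter.allowed?('web', t0 + 120)     # false: web scaled 2 minutes ago
puts limiter.allowed?('worker', t0 + 120)  # true: worker has its own limit
puts limiter.allowed?('web', t0 + 300)     # true: 5 minutes have passed
```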
Sources and Config
Heroku Vector is designed to have a modular system for pulling traffic data from a variety of sources. The current sources are listed below, with any settings they need:
NewRelic
Minimum Period: 60sec
Unit: Requests per Minute (RPM)
Default Sample Size: Last 5 Minutes (300 samples)
This data source pulls the Throughput in Requests per Minute (RPM) from the NewRelic API for the last minute of data. NewRelic metrics tend to lag 3-5 minutes behind realtime.
- config.newrelic_api_key / ENV['NEWRELIC_API_KEY'] - REQUIRED - Secret API Key for your NewRelic account.
- config.newrelic_account_id / ENV['NEWRELIC_ACCOUNT_ID'] - OPTIONAL - ID of NewRelic Account to pull from, defaults to first account.
- config.newrelic_app_id / ENV['NEWRELIC_APP_ID'] - OPTIONAL - ID of NewRelic Application to pull from, defaults to first Application in the account.
Sidekiq
Minimum Period: 1sec
Unit: # busy threads
Default Sample Size: Last 2 Minutes (120 samples)
This data source samples the realtime number of busy worker threads in a Sidekiq cluster, using the Sidekiq data API. This value can be sampled continuously.
- config.sidekiq_redis_url / ENV['REDIS_URL'] - OPTIONAL - URL for Redis backing Sidekiq cluster, defaults to localhost on 6379
- config.sidekiq_redis_namespace / ENV['SIDEKIQ_REDIS_NAMESPACE'] - OPTIONAL - Redis namespace for Sidekiq cluster, default is no namespace used.
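Each source keeps a bounded sample size (for Sidekiq, the last 120 samples). One way to picture this is a fixed-size window whose average is compared against the thresholds. Averaging is an assumption here, suggested by the fractional busy-thread values in the sample log, and the `SampleWindow` class is hypothetical.

```ruby
# Hypothetical rolling window of samples; averaging is an assumption,
# not confirmed behavior of Heroku Vector.
class SampleWindow
  def initialize(size)
    @size = size
    @samples = []
  end

  # Add a sample and drop the oldest ones beyond the window size.
  def push(value)
    @samples << value
    @samples.shift while @samples.size > @size
    self
  end

  def average
    return 0.0 if @samples.empty?
    @samples.sum.to_f / @samples.size
  end
end

window = SampleWindow.new(120)
[0, 1, 0, 1, 0].each { |busy_threads| window.push(busy_threads) }
puts window.average  # 0.4
```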
heroku_vector daemon:
[master ~/src/heroku-vector]$> ./bin/heroku_vector --help
heroku_vector: auto-scale dynos on Heroku
Usage: heroku_vector [options]
heroku_vector -s
-s, --sample Sample values and exit
-d, --daemonize Daemonize process
-e, --envfile PATH Environment file (default: .env)
-c, --config PATH Config file (default: config.rb)
-p, --pidfile PATH Daemon pid file (default: heroku_vector.pid)
-x, --loglevel LEVEL Logging level [fatal/warn/info/debug] (default is info)
-l, --logfile PATH Logfile path for daemon
-h, --help Show this message
Once you've configured your API keys and host names, try taking a sample from all your sources:
$> heroku_vector -s
heroku_vector: {:daemonize=>false, :envfile=>"/Users/wpeterson/src/heroku-vector/.env", :config=>"/Users/wpeterson/src/heroku-vector/config.rb", :sample=>true}
HerokuVector::Source::NewRelic: 8490.0 RPM
HerokuVector::Source::Sidekiq: 23 busy threads
Logging and Debugging
Heroku Vector logs either to STDOUT or a logfile with useful information about state changes and scaling events.
2014-09-12T16:05:48.473Z INFO: Loading config from '/home/ubuntu/polar-auto-scale/config.rb'
2014-09-12T16:05:48.474Z INFO: Loading Scaler: web, {:source=>HerokuVector::Source::NewRelic, :period=>60, :min_dynos=>1, :max_dynos=>4, :min_value=>1000, :max_value=>3000}
2014-09-12T16:05:48.474Z INFO: Loading Scaler: worker, {:source=>HerokuVector::Source::Sidekiq, :period=>5, :min_dynos=>1, :max_dynos=>10, :min_value=>0.5, :max_value=>3, :scale_up_by=>3, :scale_down_by=>1}
2014-09-12T16:45:01.742Z INFO: Heroku.scale_dynos(worker, 4)
2014-09-12T16:50:03.164Z INFO: worker: 4 dynos - 0.4 busy threads below 2.0 - scaling down
2014-09-12T16:50:03.247Z INFO: Heroku.scale_dynos(worker, 3)
2014-09-12T16:55:04.559Z INFO: worker: 3 dynos - 0.2 busy threads below 1.5 - scaling down
2014-09-12T16:55:04.646Z INFO: Heroku.scale_dynos(worker, 2)
2014-09-12T17:00:08.391Z INFO: worker: 2 dynos - 0.9 busy threads below 1.0 - scaling down
2014-09-12T17:00:08.492Z INFO: Heroku.scale_dynos(worker, 1)
If you're debugging a problem, you can turn on verbose logging by setting the debug log level:
$ heroku_vector -x debug
Architecture
The auto-scaler runs as a single process, either interactively or as a daemon (ProcessManager). Within that process, the Worker spawns an EventMachine event loop and runs each DynoScaler in its own thread. Periodically, each DynoScaler will sample data and evaluate the scale of your dynos. When the scale of your dynos doesn't match your traffic, the DynoScaler will use the Heroku API to scale your dynos up or down.
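The one-thread-per-scaler idea can be sketched with plain Ruby threads. This is a simplification: it uses Thread and sleep rather than an EventMachine loop, stops after a fixed number of iterations, and the `Scaler` struct and `run_scalers` function are hypothetical names. The lambdas stand in for real sources.

```ruby
# Hypothetical sketch: one thread per scaler, each sampling on its own period.
# Plain threads and sleep are used here instead of EventMachine.
Scaler = Struct.new(:name, :period, :source)

def run_scalers(scalers, iterations:)
  results = Queue.new # thread-safe collection of [name, sample] pairs
  threads = scalers.map do |scaler|
    Thread.new do
      iterations.times do
        results << [scaler.name, scaler.source.call]
        sleep scaler.period
      end
    end
  end
  threads.each(&:join)
  Array.new(results.size) { results.pop }
end

scalers = [
  Scaler.new('web', 0.02, -> { 8490.0 }), # stand-in for a NewRelic RPM sample
  Scaler.new('worker', 0.01, -> { 23 })   # stand-in for a Sidekiq busy-thread sample
]
samples = run_scalers(scalers, iterations: 3)
puts samples.count { |name, _| name == 'web' }     # 3
puts samples.count { |name, _| name == 'worker' }  # 3
```

Giving each scaler its own thread means a slow source for one dyno type does not delay sampling for another.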
Adding a new Source
Data is sampled from generic Source classes in the Ruby namespace HerokuVector::Source and wired up in your configuration. Source objects have a simple contract:
- #sample - returns a numeric value for this source at this time
- #units - string that describes the data, like 'RPM' for NewRelic
You can define your own Source classes and then reference them within your config:
# my_data_source.rb
module HerokuVector::Source
  class MyDataSource
    def sample
      # Always returns 1, simple test case
      1
    end

    def units
      'foos'
    end
  end
end

# Wire Up Your Source in config.rb
require 'path/to/my_data_source'
HerokuVector.configure do |config|
  # Scale web dynos using the custom source
  # (min_value and max_value are also required, see Options)
  config.add_dyno_scaler('web', {
    source: HerokuVector::Source::MyDataSource,
    period: 1
  })
end
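Before wiring a custom source into a scaler, it can help to check that it honors the contract described above. The sketch below does this by duck typing. The `valid_source?` helper and both example classes are hypothetical, and it assumes a source can be built with a no-argument constructor.

```ruby
# Hypothetical helper, not part of Heroku Vector: verify that a class
# satisfies the Source contract (#sample is numeric, #units is a string).
def valid_source?(source_class)
  source = source_class.new # assumes a no-argument constructor
  source.respond_to?(:sample) && source.respond_to?(:units) &&
    source.sample.is_a?(Numeric) && source.units.is_a?(String)
end

class ConstantSource
  def sample
    1
  end

  def units
    'foos'
  end
end

class BrokenSource
  # Returns a string instead of a number and defines no #units method.
  def sample
    'not a number'
  end
end

puts valid_source?(ConstantSource)  # true
puts valid_source?(BrokenSource)    # false
```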
Contributors: ❤️
- Your Name Could Go Here
Contributing
- Fork it
- Bundle Install (bundle install)
- Run the Tests (rake test)
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request