Service Downtime Simulator
🚨 If you work at Deliveroo and you're contributing to this project, please bear in mind that this repository is public.
This is a piece of Rack middleware that simulates failures you would want to tolerate in upstream services.
Installation
Rails
Add the following in application.rb:
config.middleware.use(
  ServiceDowntimeSimulator::Middleware,
  config # See below for info about how to configure this
)
Configuration
The middleware takes a config argument in the form of a hash. Said hash should have the following shape:
{
  enabled: Boolean,
  mode: Symbol,
  excluded_paths: Array<String>,
  logger: Logger?
}
Here's what you can supply for each of those options:
- enabled (Boolean)
  - true will enable simulation of failures (assuming you supply a valid mode, see below)
  - false will disable simulation and your application will function as normal
- mode (Symbol)
  - :hard_down will cause all requests to return a 500 error
  - :intermittently_down will cause 50% of requests to return a 500 error
  - :successful_but_gibberish will return a 200, but with a response body that is not machine readable
  - :timing_out will wait for 15 seconds on each request, and then return a 503
- excluded_paths (Array<String>)
  - You can supply a list of paths that you don't want to be affected by the simulation here (e.g. ['/foobar'])
  - The most common thing you're going to want to include here is your service's health check endpoint, as if it is returning a 5xx thanks to this middleware, your application will not deploy
- logger (Logger?)
  - If supplied, useful debug information will be sent here
In order for the middleware to kick in, enabled must be explicitly set to true and mode must be a valid option. Unless both are explicitly supplied, the underlying application will continue to function as normal.
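To make the rules above concrete, here is a minimal sketch of how a Rack middleware could apply them. It is not the gem's actual implementation: the class name SketchDowntimeMiddleware, the response bodies, the exact-match handling of excluded_paths, the random 50% split, and the injectable sleeper and rng arguments are all assumptions made for illustration.

```ruby
# Hypothetical sketch only; the real ServiceDowntimeSimulator::Middleware may differ.
class SketchDowntimeMiddleware
  VALID_MODES = %i[hard_down intermittently_down successful_but_gibberish timing_out].freeze

  # sleeper and rng are injectable (an assumption of this sketch) so the
  # behaviour can be exercised without waiting 15 seconds or relying on chance.
  def initialize(app, config, sleeper: ->(seconds) { sleep(seconds) }, rng: -> { rand })
    @app = app
    @config = config
    @sleeper = sleeper
    @rng = rng
  end

  def call(env)
    # Pass straight through unless simulation applies to this request.
    return @app.call(env) unless simulate?(env['PATH_INFO'])

    case @config[:mode]
    when :hard_down
      failure(500)
    when :intermittently_down
      # Assumes "50% of requests" means an independent coin flip per request.
      @rng.call < 0.5 ? failure(500) : @app.call(env)
    when :successful_but_gibberish
      [200, { 'content-type' => 'text/plain' }, ['%%% not { machine ] readable %%%']]
    when :timing_out
      @sleeper.call(15)
      failure(503)
    end
  end

  private

  # Simulation only kicks in when enabled is exactly true, the mode is one of
  # the known symbols, and the path is not excluded (exact match assumed).
  def simulate?(path)
    @config[:enabled] == true &&
      VALID_MODES.include?(@config[:mode]) &&
      !Array(@config[:excluded_paths]).include?(path)
  end

  def failure(status)
    [status, { 'content-type' => 'text/plain' }, ["Simulated failure (#{status})"]]
  end
end

# Usage with a trivial downstream app:
app = ->(_env) { [200, { 'content-type' => 'text/plain' }, ['OK']] }
stack = SketchDowntimeMiddleware.new(
  app,
  { enabled: true, mode: :hard_down, excluded_paths: ['/health'] }
)
stack.call({ 'PATH_INFO' => '/orders' }).first # => 500
stack.call({ 'PATH_INFO' => '/health' }).first # => 200
```

Checking enabled against exactly true, rather than any truthy value, mirrors the "explicitly set" wording above: a stray string such as 'false' would not switch the simulation on.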
Examples
Here are a couple of example configurations:
Hard-coded Hard Down
This example will always return a 500 for all requests.
config.middleware.use(
  ServiceDowntimeSimulator::Middleware,
  {
    enabled: true,
    mode: :hard_down,
    excluded_paths: ['/health'],
    logger: Rails.logger
  }
)
Environment-variable Controlled Simulation
This is a more practical example, allowing failure simulation to be controlled by environment variables. It requires an environment variable with a specific value to enable the failure simulation, and also requires a mode to be provided. If either is missing, the app continues as normal. You can also use this pattern for feature flagging. Probably.
config.middleware.use(
  ServiceDowntimeSimulator::Middleware,
  {
    enabled: ENV['FAILURE_SIMULATION_ENABLED'] == 'I_UNDERSTAND_THE_CONSEQUENCES_OF_THIS',
    mode: ENV.fetch('FAILURE_SIMULATION_MODE', '').to_sym,
    excluded_paths: ['/health'],
    logger: Rails.logger
  }
)
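To see why a missing variable leaves the app running as normal, it can help to evaluate the same expressions against a plain hash instead of the real ENV. The sketch below does that; the helper name failure_simulation_config is hypothetical, and the logger is left out because it is optional and Rails.logger is not available outside a Rails app.

```ruby
# Hypothetical helper that builds the config hash from any ENV-like object
# (anything responding to [] and fetch, such as ENV itself or a plain Hash).
def failure_simulation_config(env = ENV)
  {
    # Only the exact opt-in string enables simulation; 'true' or '1' do not.
    enabled: env['FAILURE_SIMULATION_ENABLED'] == 'I_UNDERSTAND_THE_CONSEQUENCES_OF_THIS',
    # A missing variable falls back to '' and becomes the empty symbol :"",
    # which is not one of the documented modes.
    mode: env.fetch('FAILURE_SIMULATION_MODE', '').to_sym,
    excluded_paths: ['/health']
  }
end

# Nothing set: disabled, and the mode is not a valid option.
failure_simulation_config({})
# => { enabled: false, mode: :"", excluded_paths: ["/health"] }

# Both set: simulation is configured for hard_down.
failure_simulation_config(
  'FAILURE_SIMULATION_ENABLED' => 'I_UNDERSTAND_THE_CONSEQUENCES_OF_THIS',
  'FAILURE_SIMULATION_MODE' => 'hard_down'
)
# => { enabled: true, mode: :hard_down, excluded_paths: ["/health"] }
```

Taking the environment as an argument is a design choice of this sketch: it lets the configuration be checked in a test without mutating the process environment.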
Development
- Clone this repository
- Ensure you have Ruby 2.5.1 installed
- make install to get the dependencies
- make test to run the tests
- make lint to lint your code
- ???
- Profit
Gem Publishing
TBC, but the current, very manual and involved flow is:
- Update version in lib/service_downtime_simulator.rb and commit
- Tag version via git tag XXX
- Push (git push origin HEAD --tags)
- Release to RubyGems (make publish)