AppStatus
AppStatus is a Rails engine which makes it easy to expose application status data in a way easily consumed by Nagios or other monitoring packages.
Why?
Defining health checks outside of your application (for example, in Nagios) causes a few problems:
- The people who maintain Nagios aren't necessarily the same people who maintain the application.
- Keeping the two systems in sync can be non-trivial with a fast-changing application.
- Failing to monitor new features, or monitoring the wrong things, leads to a false sense of security.
Instead, app_status lets you define your health checks right in the application itself and expose the results as a JSON service which is easy for Nagios to consume.
The benefits come down to one major thing: Nagios doesn't need to know anything about your application. All Nagios needs is a 'healthy/not healthy' status report.
This is good because:
- As your app's feature set changes, you can deploy updated health checks at the same time. No need for coordinated updates between the app and the monitoring system.
- Credentials for external services (like databases) can stay with your app. Nagios doesn't need them.
- You don't need nrpe to do local process checks. Your application can do them for itself.
- Your health checks can be testable methods just like all your other code.
- You don't need to duplicate complex queries and other business logic in Nagios.
Installation
Gemfile
gem 'app_status'
config/routes.rb
mount AppStatus::Engine => "/status(.:format)", defaults: {format: 'json'}
This exposes http://localhost:3000/status.json. (Support for HTML output was dropped in v2.0. See PR #3.)
config/initializers/app_status.rb
This is where you set up the checks which you want to be run when someone hits the URL above. Set up some calls which evaluate the health of your application and call add_check for each one.
add_check
add_check expects a service name, plus a block to be evaluated to determine the health of that service. The block should return either a status value or a 2-element array containing a status and some details.
Rails.application.config.to_prepare do
  AppStatus::CheckCollection.configure do |c|
    c.add_check('some_service') do
      details = do_something_to_check_your_service
      status = (details != "FAIL") ? :ok : :critical
      [status, details]
    end
    c.add_check('failing_service') do
      :critical # you can return just a status if desired.
    end
  end
end
The details string should be concise. app_status does its best to provide readable output, and Nagios does its best to make this impossible to actually do well.
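Because a check block only has to return a status (or a status plus details), the decision logic can live in an ordinary method that is easy to unit test. The following is a minimal sketch and not part of app_status: `QueueDepthCheck`, its thresholds, and the idea of a job queue are hypothetical, chosen only for illustration.

```ruby
# Hypothetical check logic, kept free of Rails so it can be tested in isolation.
module QueueDepthCheck
  WARNING_AT  = 100    # assumed thresholds; tune these for your application
  CRITICAL_AT = 1_000

  # Returns the 2-element [status, details] array that add_check accepts.
  def self.call(depth)
    status =
      if depth >= CRITICAL_AT
        :critical
      elsif depth >= WARNING_AT
        :warning
      else
        :ok
      end
    [status, "#{depth} jobs queued"]
  end
end

p QueueDepthCheck.call(5)     # => [:ok, "5 jobs queued"]
p QueueDepthCheck.call(2_500) # => [:critical, "2500 jobs queued"]
```

In the initializer, the block passed to add_check would then simply delegate, for example `c.add_check('queue_depth') { QueueDepthCheck.call(current_depth) }`, where `current_depth` is a placeholder for however your application measures the queue.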
Valid status values (in ascending order of seriousness) are:
- :ok
- :warning
- :critical
- :unknown
These are set up to be compatible with Nagios.
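The status names correspond to the standard Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). As a rough sketch, the mapping and the handling of the two allowed block return shapes might look like the following. The names `NAGIOS_CODES` and `normalize_result` are illustrative and not taken from app_status, and treating unrecognized values as :unknown is an assumption.

```ruby
# Standard Nagios plugin return codes, keyed by the status symbols listed above.
NAGIOS_CODES = { ok: 0, warning: 1, critical: 2, unknown: 3 }.freeze

# Hypothetical helper: accepts either a bare status or [status, details]
# and returns a hash shaped like one entry of a per-check status report.
def normalize_result(result)
  status, details = result.is_a?(Array) ? result : [result, ""]
  status = status.respond_to?(:to_sym) ? status.to_sym : :unknown
  # Assumption: anything that is not a recognized status is reported as :unknown.
  status = :unknown unless NAGIOS_CODES.key?(status)
  { status: status.to_s, status_code: NAGIOS_CODES[status], details: details.to_s }
end

p normalize_result(:critical)
p normalize_result([:ok, "Looks good!"])
```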
add_description
add_description allows you to specify an extended description and troubleshooting information for any check which has been added via add_check.
These descriptions are not included in the status.json payload by default. Add ?descriptions=1 to include them.
AppStatus::CheckCollection.configure do |c|
  c.add_check('some_service') do
    [:critical, 'what is going on']
  end
  c.add_description 'some_service', <<~EOF
    some_service failures indicate that some_service is going wrong.
    this is handy since nagios really requires brief output, but sometimes you need
    more space to explain what a check is.
    think of it as the answer to the problem of "That guy is on vacation, but his
    app is raising alarms. WTF do I do?"
  EOF
end
Keep in mind that anyone who can reach your status URL can cause your checks to run. If the checks expose sensitive data or are a potential denial-of-service (DoS) vector, you should protect the endpoint with some kind of authentication.
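One way to add such protection is HTTP basic auth in front of the endpoint. The sketch below is a framework-free, Rack-style middleware written only for illustration: `BasicAuthGate` and the credentials are hypothetical, it is not part of app_status, and whether it can wrap the mounted engine unchanged in your routes has not been verified here. Most applications would reuse whatever authentication layer they already have.

```ruby
# Hypothetical Rack-style middleware: rejects requests without valid HTTP basic auth.
class BasicAuthGate
  def initialize(app, user:, password:)
    @app = app
    @expected = "#{user}:#{password}"
  end

  def call(env)
    return @app.call(env) if authorized?(env["HTTP_AUTHORIZATION"])

    headers = { "www-authenticate" => 'Basic realm="status"', "content-type" => "text/plain" }
    [401, headers, ["unauthorized\n"]]
  end

  private

  # Expects a header of the form "Basic <base64 of user:password>".
  def authorized?(header)
    scheme, encoded = header.to_s.split(" ", 2)
    return false unless scheme.to_s.casecmp?("basic") && encoded

    secure_equal?(encoded.unpack1("m"), @expected)
  end

  # Byte-wise comparison whose duration does not depend on where the strings differ.
  def secure_equal?(a, b)
    return false unless a.bytesize == b.bytesize

    diff = 0
    a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
    diff.zero?
  end
end

# Stand-in for the status endpoint, used only to exercise the gate.
app   = ->(_env) { [200, { "content-type" => "application/json" }, ['{"status":"ok"}']] }
gated = BasicAuthGate.new(app, user: "nagios", password: "s3cret")
token = ["nagios:s3cret"].pack("m0")

p gated.call({}).first                                           # => 401
p gated.call({ "HTTP_AUTHORIZATION" => "Basic #{token}" }).first # => 200
```

HTTP basic auth pairs naturally with the `--auth 'user:password'` option of check_app_status.rb described under Nagios Integration.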
Built-In Checks
As of version 2.0.0, app_status includes a set of built-in checks which can be installed. Have a look in lib/app_status/checks for a full list.
ruby_version check
Verifies that the running version of Ruby is as expected. By default, the expected version is read from a .ruby-version file in the Rails root directory.
# config/initializers/app_status.rb
require 'app_status/checks/ruby_version'
AppStatus::Checks::RubyVersion.install!
If you wish to specify the expected version string by another method, that's also supported.
# config/initializers/app_status.rb
require 'app_status/checks/ruby_version'
AppStatus::Checks::RubyVersion.install!(expected_version: '2.5.0')
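To make the idea concrete, here is a rough sketch of the kind of comparison such a check performs. It is not the gem's implementation: `ruby_version_status` is a hypothetical helper, and the statuses it returns for a mismatch or a missing expectation are assumptions.

```ruby
# Hypothetical stand-in for the comparison behind a ruby_version check.
# expected: version string, e.g. the contents of a .ruby-version file
# actual:   version of the running interpreter
def ruby_version_status(expected, actual = RUBY_VERSION)
  # Some .ruby-version files carry a "ruby-" prefix and a trailing newline.
  wanted = expected.to_s.strip.sub(/\Aruby-/, "")
  if wanted.empty?
    [:unknown, "no expected version configured"] # assumption
  elsif wanted == actual
    [:ok, "running #{actual}"]
  else
    [:critical, "expected #{wanted}, running #{actual}"] # assumption
  end
end

p ruby_version_status("2.5.0\n", "2.5.0") # => [:ok, "running 2.5.0"]
p ruby_version_status("2.5.0", "2.4.1")   # => [:critical, "expected 2.5.0, running 2.4.1"]
```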
Usage
$ curl -H 'Accept: application/json' http://localhost:3000/status
Output will look something like this:
{
"status": "critical",
"status_code": 2,
"ms": 52,
"finished": "2013-10-03T21:28:10Z",
"checks": {
"some_service": {
"status": "ok",
"status_code": 0,
"details": "Looks good!",
"ms": 30
},
"failing_service": {
"status": "critical",
"status_code": 2,
"details": "",
"ms": 20
}
}
}
The overall status will be the worst value observed in your individual checks.
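As an illustration of the worst-value rule, the sketch below picks the most serious status from a set of check results, using the ordering ok, warning, critical, unknown. `SERIOUSNESS` and `overall_status` are hypothetical names, and reporting "unknown" when there are no checks at all is an assumption, not documented behavior.

```ruby
# Statuses in ascending order of seriousness.
SERIOUSNESS = %w[ok warning critical unknown].freeze

# Hypothetical helper: returns the most serious status found in a hash of
# check results shaped like the "checks" object in the JSON payload.
def overall_status(checks)
  statuses = checks.values.map do |check|
    SERIOUSNESS.include?(check["status"]) ? check["status"] : "unknown"
  end
  # Assumption: with no checks defined, report "unknown".
  statuses.max_by { |status| SERIOUSNESS.index(status) } || "unknown"
end

checks = {
  "some_service"    => { "status" => "ok" },
  "failing_service" => { "status" => "critical" }
}
p overall_status(checks) # => "critical"
```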
Nagios Integration
check_app_status.rb is a Nagios check script which can be used to monitor the output from app_status.
$ ./check_app_status.rb --help
Nagios check script for app_status. See https://github.com/alexdean/app_status
-v, --verbose Output more information
-V, --version Output version information
-h, --help Display this screen
-u, --url VAL Url to monitor
-a, --auth VAL HTTP basic auth in the form 'user:password'
-t, --timeout VAL Timeout after waiting this long for a response.
The script's exit status is derived from the overall status returned by the server. Individual detail items will be grouped by status for display. (Unknowns are displayed together, then criticals, then warnings, then OKs.)
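For readers writing their own consumer, the behavior described above can be approximated as follows. This is a simplified sketch, not the code of check_app_status.rb: the helper names are hypothetical, the sample payload is abbreviated, and the output format is not meant to match the real script.

```ruby
require "json"

# Display order described above: unknowns first, then criticals, warnings, OKs.
DISPLAY_ORDER = %w[unknown critical warning ok].freeze

# Abbreviated sample payload in the shape served by the status endpoint.
SAMPLE_BODY = <<~JSON
  { "status": "critical", "status_code": 2,
    "checks": {
      "some_service":    { "status": "ok",       "status_code": 0, "details": "Looks good!", "ms": 30 },
      "failing_service": { "status": "critical", "status_code": 2, "details": "",            "ms": 20 }
    } }
JSON

# Returns [[status, [check names]], ...] with the most serious group first.
def group_by_status(payload)
  grouped = payload.fetch("checks").group_by { |_name, check| check["status"] }
  DISPLAY_ORDER.map do |status|
    entries = grouped[status]
    [status, entries.map(&:first)] if entries
  end.compact
end

# The plugin's exit status comes from the overall status reported by the server.
def exit_code_for(payload)
  payload.fetch("status_code")
end

payload = JSON.parse(SAMPLE_BODY)
group_by_status(payload).each do |status, names|
  puts "#{status.upcase} #{names.join(', ')}"
end
puts "exit code: #{exit_code_for(payload)}" # a real plugin would call exit with this value
```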
Sample output
$ ./check_app_status.rb --url http://localhost:3000/status
CRIT failed_service
--- failed_service: shit's on fire yo, 501ms
WARN problematic_service
--- problematic_service: not looking good, 2001ms
OK ok_process, ok_process_2
--- ok_process: these are some details, 0ms
--- ok_process_2: more details on another process, 0ms