Project

log_sense

Generate analytics in HTML, txt, and SQLite format for Rails and Apache/Nginx log files.
 Dependencies

Development

~> 5.24.0
~> 1.9.0

Runtime

~> 5.3.0
~> 1.2.0
~> 2.0.0
 Project Readme

LogSense Readme - Monitor your Rails app easy and fast

Introduction

LogSense generates reports and statistics from Ruby on Rails and Apache/Nginx log files.

Main features:

  • Statistics for Rails app in production and Web server logs (combined format, which can be produced both by Apache and Nginx)
  • Reports on performance, errors, visitors, and devices used to access your websites and web apps. LogSense also parses the data generated by the BrowserInfo gem, providing additional information for Rails apps, including devices, platforms, and number of accesses to methods by device type.
  • Can combine one or more log files
  • No need for cookies or other tracking technologies (but you need access to your log files)
  • Filters let you analyze specific periods and distinguish traffic generated by self polls and crawlers.
  • Reports can be generated in HTML, txt, ufw, and SQLite. HTML reports are responsive and come with dark and light themes.

LogSense is written in Ruby, runs from the command line, and can be installed on any system with a relatively recent version of Ruby. We use it with Ruby 3.1.4 and 3.3.0.

It is fast. On a ThinkPad P16, a 277M log file is parsed in 15 seconds (about 7740 events per second) and a 569M log file is parsed in 50 seconds (about 4700 events per second).

Rails Production Report

./screenshots/rails-screenshot.png

LogSense understands the Rails production log and generates the following reports in TXT and HTML:

  • Daily Distribution
  • Time Distribution
  • Statuses
  • Statuses by Day
  • Rails Performance
  • Controller and Methods by Device
  • Fatal Events
  • Fatal Events (grouped by type)
  • Job Errors
  • Job Errors (grouped)
  • Browsers
  • Platforms
  • IPs
  • Countries
  • IP per hour
  • Sessions

Apache/Nginx Report

./screenshots/combined_log-screenshot.png

LogSense reads the Apache/Nginx combined log format and generates the following reports in TXT and HTML:

  • Time Distribution
  • 20_ and 30_ on HTML pages
  • 20_ and 30_ on other resources
  • 40_ and 50_ on HTML pages
  • 40_ and 50_ on other resources
  • 40_ and 50_ on HTML pages by IP
  • 40_ and 50_ on other resources by IP
  • Statuses
  • Statuses by Day
  • Browsers
  • Platforms
  • IPs
  • Countries
  • IP per hour
  • Combined Platform Data
  • Referers
  • Sessions

UFW Report

The ufw output format generates directives for Uncomplicated Firewall, blacklisting IPs requesting URLs matching a given pattern.

We use it to blacklist IPs requesting WordPress login pages on our websites… since we don’t use WordPress for our websites.

Example

$ log_sense -f apache -t ufw -i apache.log
# /users/sign_in/xmlrpc.php?rsd
ufw deny from 20.212.3.206

# /wp-login.php /wordpress/wp-login.php /blog/wp-login.php /wp/wp-login.php
ufw deny from 185.255.134.18

...
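
The following is a minimal sketch of how directives like the ones above could be derived from combined-format log lines. It is not LogSense's actual implementation: the regular expression, the ufw_directives helper, and the sample entries are assumptions made for illustration.

```ruby
# Sketch only: not LogSense's actual code.
# Captures the client IP and the requested URL of a combined-format entry.
COMBINED_ENTRY = /\A(?<ip>\S+) \S+ \S+ \[[^\]]+\] "\S+ (?<url>\S+)[^"]*" \d{3}/

# Returns one ufw directive (preceded by a comment listing the offending URLs)
# for every IP that requested a URL matching +pattern+.
def ufw_directives(lines, pattern)
  urls_by_ip = Hash.new { |hash, ip| hash[ip] = [] }

  lines.each do |line|
    entry = COMBINED_ENTRY.match(line)
    next unless entry && entry[:url].match?(pattern)

    urls_by_ip[entry[:ip]] << entry[:url]
  end

  urls_by_ip.map do |ip, urls|
    "# #{urls.uniq.join(' ')}\nufw deny from #{ip}"
  end
end

# Made-up entries using documentation-only IP addresses.
sample = [
  '203.0.113.7 - - [10/Jan/2024:10:00:00 +0000] "GET /wp-login.php HTTP/1.1" 404 153 "-" "curl/8.0"',
  '203.0.113.7 - - [10/Jan/2024:10:00:01 +0000] "GET /blog/wp-login.php HTTP/1.1" 404 153 "-" "curl/8.0"',
  '198.51.100.2 - - [10/Jan/2024:10:00:02 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"'
]

puts ufw_directives(sample, /php/)
```

Grouping the URLs by IP before emitting keeps one directive per address, with the matching requests listed in the comment for review before the rules are applied.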

Installation

gem install log_sense

If you want to collect information about browsers, platforms, and devices when generating Rails reports, add the browser gem to your bundle and the following code to application_controller.rb:

# Gemfile
gem "browser"
# application_controller.rb
class ApplicationController < ActionController::Base

  # [...]

  before_action do |controller|
    user_agent = request.env['HTTP_USER_AGENT']
    ip = request.env['REMOTE_ADDR']

    hashed_ip = Digest::SHA256.hexdigest ip
    b = Browser.new(user_agent)
    now = DateTime.now

    logger = Rails.logger
    browser_data = [
      b.name, b.platform, b.device.name,
      controller.class.name, controller.action_name,
      request.format.symbol,
      hashed_ip,
      now
    ]

    browser_data_str = browser_data.map { |x| "\"#{x}\"" }.join(',')
    logger.info "BrowserInfo: #{browser_data_str}"
  end

  # [...]
end
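
For reference, a line written by the callback above can be split back into its fields as sketched below. The parse_browser_info helper, the field names, and the sample line are hypothetical and only mirror the order of the browser_data array; LogSense's own parser may work differently. The sketch assumes no field contains a double quote.

```ruby
# Hypothetical field names, in the same order as the browser_data array.
BROWSER_INFO_FIELDS = %i[browser platform device controller action format hashed_ip time].freeze

# Returns a Hash of fields for a "BrowserInfo: ..." log line, or nil if the
# line is not a BrowserInfo entry or has an unexpected number of fields.
def parse_browser_info(line)
  payload = line[/BrowserInfo: (.*)/, 1]
  return nil unless payload

  values = payload.scan(/"([^"]*)"/).flatten
  return nil unless values.size == BROWSER_INFO_FIELDS.size

  BROWSER_INFO_FIELDS.zip(values).to_h
end

# Made-up sample line; the prefix depends on your logger configuration.
line = 'INFO -- : BrowserInfo: "Firefox","linux","Unknown","PagesController","index","html","ab12cd34","2024-01-10T10:00:00+00:00"'
info = parse_browser_info(line)
puts "#{info[:controller]}##{info[:action]} from #{info[:browser]} on #{info[:platform]}"
```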

Usage

log_sense --help
Usage: log_sense [options] [logfile ...]
        --title=TITLE                Title to use in the report
    -f, --input-format=FORMAT        Log format (stored in log or sqlite3): rails or apache (DEFAULT: apache)
    -i, --input-files=file,file,     Input file(s), log file or sqlite3 (can also be passed as arguments)
    -t, --output-format=FORMAT       Output format: html, txt, sqlite, ufw (DEFAULT: html)
    -o, --output-file=OUTPUT_FILE    Output file. (DEFAULT: STDOUT)
    -b, --begin=DATE                 Consider only entries after or on DATE
    -e, --end=DATE                   Consider only entries before or on DATE
    -l, --limit=N                    Limit to the N most requested resources (DEFAULT: 100)
    -w, --width=WIDTH                Maximum width of long columns in textual reports
    -r, --rows=ROWS                  Maximum number of rows for columns with multiple entries in textual reports
    -p, --pattern=PATTERN            Pattern to use with ufw report to select IP to blacklist (DEFAULT: php)
    -c, --crawlers=POLICY            Decide what to do with crawlers (applies to Apache Logs)
        --no-selfpoll                Ignore self poll entries (requests from ::1; applies to Apache Logs) (DEFAULT: false)
        --no-geo                     Do not geolocate entries (DEFAULT: true)
        --verbose                    Inform about progress (output to STDERR) (DEFAULT: false)
    -v, --version                    Prints version information
    -h, --help                       Prints this help

This is version 2.0.0

Output formats:

- rails: txt, html, sqlite3, ufw
- apache: txt, html, sqlite3, ufw

Examples:

log_sense -f apache -i access.log -t txt > access-data.txt
log_sense -f rails -i production.log -t html -o performance.html
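
For unattended runs, such as a nightly job, the invocation can be assembled from a small script. The sketch below uses only options listed in the help text above; the log_sense_command helper, the file paths, and the output naming scheme are assumptions.

```ruby
require "date"

# Builds the argument list for one HTML report run.
# Passing the arguments as an Array avoids shell quoting issues.
def log_sense_command(format:, input:, output_dir:, day: Date.today)
  output = File.join(output_dir, "#{format}-#{day.strftime('%Y-%m-%d')}.html")
  [
    "log_sense",
    "-f", format,
    "-i", input,
    "-t", "html",
    "-o", output,
    "--title=#{format} report for #{day}"
  ]
end

# Hypothetical paths; adjust them to your deployment.
command = log_sense_command(format: "rails", input: "log/production.log", output_dir: "reports")
puts command.join(" ")

# To actually run it (requires log_sense to be installed):
# system(*command)
```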

Motivation

LogSense focuses on privacy, data-ownership, and simplicity: no need to install JavaScript snippets, no tracking cookies, just plain and simple log analysis.

LogSense is also inspired by static website generators: statistics are generated from the command line and accessed as static HTML files. This significantly reduces the attack surface of your web server and avoids installation headaches. We have a cron job running on our servers, generating statistics at night. The generated files are then made available in a private area on the web and rotated monthly.

An important word of warning on SQLite3 output

Log poisoning is a technique whereby attackers send requests with unvalidated user input to forge log entries or inject malicious content into the logs.

log_sense sanitizes entries of HTML reports, to try to protect against log poisoning. Log entries and URLs in SQLite3 tables, however, are not sanitized: they are read from the log and stored as they are. This is not, in general, an issue, unless you use the unsanitized data from SQLite as is in environments where URLs can be opened or code can be executed using the URLs as arguments.
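
If you do reuse values from the SQLite output, escape them for the context in which they end up. The sketch below uses Ruby's standard library; the for_html and for_shell helpers and the sample URL are illustrative assumptions, not part of LogSense.

```ruby
require "erb"
require "shellwords"

# Escapes a stored value before embedding it in an HTML page.
def for_html(value)
  ERB::Util.html_escape(value)
end

# Quotes a stored value so that a shell treats it as one literal argument.
def for_shell(value)
  Shellwords.escape(value)
end

# A made-up URL, as an attacker might have planted it in a log.
url = "/index.php?q=<script>alert(1)</script>;id"

puts for_html(url)
puts for_shell(url)
```

Escaping at the point of use, rather than when the data is stored, keeps the database a faithful copy of the log while protecting each consumer.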

Change Log

See the CHANGELOG file.

Compatibility

LogSense should run on any system on which a recent version of Ruby runs. We tested it with Ruby 2.6.9, Ruby 3.0.x, and Ruby 3.3.x.

Author and Contributors

Shair.Tech

Credits

  • HTML reports use Zurb Foundation, Data Tables, and Apache ECharts
  • The textual format is compatible with Org Mode and can be further processed to any format Org Mode can export to, including HTML and PDF. The warning about log poisoning in the section above applies here as well.

Code Structure

The code implements a pipeline, with the following steps:

  1. Parser: parses a log into a SQLite3 database. The database contains a table with the list of events and, for Rails reports, a table with the errors.
  2. Aggregator: takes a SQLite database as input and aggregates data, typically performing “group by” operations, which are simpler to write in Ruby than in SQL. The module outputs a Hash with the different reporting data.
  3. GeoLocator: adds country information to all the reporting data that has an IP as one of its fields.
  4. Shaper: turns the (geolocated) aggregated data (e.g. Hashes and such) into Arrays of Arrays, simplifying the code that builds the reports.
  5. Emitter: generates reports from the shaped data using ERB.
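
The five steps can be pictured with the toy pipeline below. Every name in it is hypothetical, and it works on an in-memory Array with a hard-coded country lookup instead of SQLite and a geolocation database, so it only illustrates the flow of data between stages, not LogSense's actual code.

```ruby
require "erb"

# 1. Parser: log lines -> list of events (toy "ip status" format).
def parse(lines)
  lines.map(&:split).select { |fields| fields.size == 2 }
       .map { |ip, status| { ip: ip, status: status } }
end

# 2. Aggregator: "group by" done in Ruby, producing a Hash of report data.
def aggregate(events)
  {
    statuses: events.group_by { |e| e[:status] }.transform_values(&:size),
    ips: events.group_by { |e| e[:ip] }.transform_values(&:size)
  }
end

# 3. GeoLocator: attach a country to every entry keyed by IP.
def geolocate(data, countries)
  located = data[:ips].map do |ip, hits|
    [ip, { hits: hits, country: countries.fetch(ip, "unknown") }]
  end
  data.merge(ips: located.to_h)
end

# 4. Shaper: Hashes -> Arrays of Arrays, ready for tabular output.
def shape(data)
  {
    statuses: data[:statuses].sort,
    ips: data[:ips].map { |ip, info| [ip, info[:hits], info[:country]] }
  }
end

# 5. Emitter: render the shaped data with ERB.
REPORT_TEMPLATE = ERB.new(<<~TXT, trim_mode: "-")
  Statuses
  <%- statuses.each do |row| -%>
  <%= row.join(" | ") %>
  <%- end -%>
  IPs
  <%- ips.each do |row| -%>
  <%= row.join(" | ") %>
  <%- end -%>
TXT

def emit(shaped)
  REPORT_TEMPLATE.result_with_hash(shaped)
end

lines = ["203.0.113.7 404", "203.0.113.7 200", "198.51.100.2 200"]
countries = { "203.0.113.7" => "NL" } # made-up lookup table
puts emit(shape(geolocate(aggregate(parse(lines)), countries)))
```

Because each stage only consumes the output of the previous one, a stage can be replaced or tested in isolation.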

Todo

See todo.org

Known Bugs

We have been running LogSense for quite a few years with no particular issues. There are no known bugs; there is an unknown number of unknown bugs.

You are most welcome to report issues and missing features, using the Issue tracker.

Licenses

LogSense is distributed under the terms of the MIT License.

Geolocation is made possible by dbip’s IP to City database, released under a CC license.

The world map is distributed under the terms of the MIT License by Pareto Software, Simplemaps.com. It is used in LogSense with some changes to the class names and ids.