A command line parser for web server logs

WebLog Parser - Readme

Installing

gem install weblog-parser

Usage

WebLog Parser reads a webserver logfile and counts page visits and unique page views. It uses a command-line interface.

Getting Started

An example log file can be found at https://tinyurl.com/ve8x3qs. Download this file and save it as 'weblog.log', then run the command:

wlparser -i -f weblog.log

This reads the log file and validates each address as either ip4 or ip6. The file contains both, so if you don't use the -i option, some errors will be shown because the default is ip4 validation.

Alternatively, the app can be run from the cloned repo by running the bin\wlparser file.

Options

wlparser -h
wlparser --help

Shows a list of options

wlparser -f logfile.log
wlparser --file logfile.log

Reads a log file and displays results.

wlparser -m 'logfile1.log logfile2.log'
wlparser --multiple_files 'logfile1.log logfile2.log'

Reads a list of log files given in quotes and displays results. All files given using the -f or -m options will be read and the output combined.

If no files are specified, the default file 'webserver.log' will be read.
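The file handling described above could be sketched roughly as follows. This is an illustration only: `read_all_lines` is a hypothetical helper, not the gem's actual code, and it assumes file names contain no spaces because the quoted list is split on whitespace.

```ruby
# Hypothetical helper (not part of the gem): splits a quoted, space-separated
# list of file names and returns the lines of every file, combined in order.
# Falls back to the default file when no names are given.
def read_all_lines(file_list, default: 'webserver.log')
  files = file_list.to_s.split
  files = [default] if files.empty?
  files.flat_map { |name| File.readlines(name, chomp: true) }
end

# Example call (requires the files to exist):
#   read_all_lines('logfile1.log logfile2.log')
```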

wlparser -c
wlparser --color

Displays colored text output. Colors can be changed in Constants.rb.

wlparser -C
wlparser --no_color

Disables colored text output.

wlparser -v
wlparser --verbose

Shows extra information, including all validation warnings.

wlparser -q
wlparser --quiet

Displays minimal information, i.e. only important warnings. Information is still written to a file when this option is selected. Disables verbose.

wlparser -o
wlparser --output_file info.txt

Writes output to a file. If no file name is given, the default 'log_info.txt' is used, although this only works when the option is the last argument given.

wlparser -t
wlparser --timestamp

Adds a timestamp to the output file. If an output file is given that already exists, this is turned on automatically.
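One way a timestamp could keep an existing output file from being overwritten is to build it into the file name. The sketch below assumes that reading of the option; the helper name and the timestamp format are hypothetical and may differ from what the gem does.

```ruby
# Hypothetical helper: inserts a timestamp before the file extension,
# e.g. 'log_info.txt' becomes 'log_info_20200102_030405.txt'.
def timestamped_name(filename, time = Time.now)
  ext = File.extname(filename)
  base = filename.delete_suffix(ext) # needs Ruby 2.5 or later
  "#{base}_#{time.strftime('%Y%m%d_%H%M%S')}#{ext}"
end

puts timestamped_name('log_info.txt')
```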

wlparser -x
wlparser --text

Sets file output format to text, similar to that displayed (default).

wlparser -j
wlparser --json

Sets file output format to json.
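As a rough idea of what json output might look like, the sketch below serializes two count tables with Ruby's standard json library. The key names 'page_visits' and 'unique_page_views' and the helper itself are assumptions for illustration; the README does not specify the gem's actual JSON layout.

```ruby
require 'json'

# Hypothetical helper: the key names are illustrative, not the gem's format.
def json_report(visits, unique_views)
  JSON.pretty_generate('page_visits' => visits, 'unique_page_views' => unique_views)
end

puts json_report({ '\webpage\index' => 3 }, { '\webpage\index' => 2 })
```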

wlparser -4
wlparser --ip4_validation

Validates ip addresses using ip4 format (default).

wlparser -6
wlparser --ip6_validation

Validates ip addresses using ip6 format.

wlparser -i
wlparser --ip4ip6_validation

Validates ip addresses if they match either ip4 or ip6 format.

wlparser -I
wlparser --no_ip_validation

Does not validate ip addresses, assumes they are all valid.
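The four validation modes above can be illustrated with the regular expressions in Ruby's standard resolv library. This is a sketch, not the gem's ipValidator: the `valid_ip?` helper and the mode symbols are hypothetical, and the gem may apply stricter or looser rules.

```ruby
require 'resolv'

# Hypothetical helper: mode symbols mirror the -4, -6, -i and -I options.
def valid_ip?(address, mode = :ip4)
  case mode
  when :ip4    then Resolv::IPv4::Regex.match?(address)
  when :ip6    then Resolv::IPv6::Regex.match?(address)
  when :ip4ip6 then Resolv::IPv4::Regex.match?(address) || Resolv::IPv6::Regex.match?(address)
  when :none   then true # no validation: every address is accepted
  else raise ArgumentError, "unknown mode: #{mode}"
  end
end

puts valid_ip?('123.123.123.123')          # checked as ip4 (the default)
puts valid_ip?('1234:1234::1234', :ip4)    # an ip6 address fails ip4 validation
puts valid_ip?('1234:1234::1234', :ip4ip6) # accepted when either format is allowed
```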

wlparser -p
wlparser --path_validation

Validates webpage path (default).

wlparser -P
wlparser --no_path_validation

Does not validate webpage path, assumes they are all valid.

wlparser -r
wlparser --remove_invalid

Ignores logs if either the ip address or the path is invalid.

wlparser -R
wlparser --warn_invalid

Warns about logs with an invalid ip address or path, but still reads them (default).
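The difference between removing and warning about invalid logs can be sketched as below. The helper and its warning text are hypothetical; the validity check is passed in as a block so the sketch makes no claim about how the gem validates.

```ruby
# Hypothetical helper: the block decides whether a log is valid.
# Returns the logs to keep and the warnings collected along the way.
def apply_invalid_policy(logs, remove_invalid: false)
  warnings = []
  kept = logs.select do |log|
    next true if yield(log) # valid logs are always kept

    warnings << "Invalid log: #{log}"
    !remove_invalid # invalid logs are kept only in warn mode
  end
  [kept, warnings]
end

logs = ['\webpage\index 123.123.123.123', 'broken-line']
kept, warnings = apply_invalid_policy(logs, remove_invalid: true) { |log| log.include?(' ') }
puts kept.size
puts warnings
```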

wlparser -g
wlparser --page_visits

Displays page visits in results and in text file output (default).

wlparser -G
wlparser --no_page_visits

Does not display page visits in results or text file output.

wlparser -u
wlparser --unique_page_views

Displays unique page views in results and in text file output (default).

wlparser -U
wlparser --no_unique_page_views

Does not display unique page views in results or text file output.

Log Format

Logs should be on separate lines. There should be a space separator between the webpage path and the ip address.

Example log with ip4 address: \webpage\index 123.123.123.123

Logs can use either ip4 or ip6 addresses.

ip4 addresses should be valid i.e. between 0.0.0.0 and 255.255.255.255, although you can skip this check.

Example log with ip6 address: \webpage\index 1234:1234:1234:1234:1234:1234:1234:1234

ip6 addresses can be compressed e.g.

\webpage\index 1234:1234::1234
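Given the format above, counting page visits and unique page views comes down to splitting each line on the first space and tallying per path. The following is a minimal sketch of that idea; `count_views` is a hypothetical helper, not the gem's Parser class, and it does no validation.

```ruby
require 'set'

# Hypothetical helper: tallies total visits and distinct visitor
# addresses for each webpage path.
def count_views(lines)
  visits = Hash.new(0)
  unique = Hash.new { |hash, key| hash[key] = Set.new }

  lines.each do |line|
    path, ip = line.split(' ', 2)
    next if path.nil? || ip.nil? # skip blank or malformed lines

    visits[path] += 1
    unique[path] << ip.strip
  end

  { visits: visits, unique_views: unique.transform_values(&:size) }
end

sample = [
  '\webpage\index 123.123.123.123',
  '\webpage\index 123.123.123.123',
  '\webpage\index 1234:1234::1234',
  '\webpage\about 123.123.123.123'
]

result = count_views(sample)
puts result[:visits]
puts result[:unique_views]
```

In this sample, \webpage\index has three visits from two distinct addresses, so its unique page view count is lower than its visit count.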

Testing

Tests can be run from the cloned repository using:

rake test

The git repository is: https://github.com/davidmorton0/WebLogParser

Tests have been separated into:

  • unit tests - test methods in each class
  • integration tests - test the whole app
  • performance tests - parse a log file with 10,000 logs, and a log file with 100 logs 100 times. They calculate the time taken and the logs parsed per second. The log files contain a mixture of ip4 and ip6 addresses.
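
A logs-per-second figure like the one the performance tests report can be measured with Ruby's standard benchmark library. The sketch below uses synthetic data and a stand-in counting step; the helper name and the generated logs are assumptions, and it does not reproduce the repository's actual test code.

```ruby
require 'benchmark'

# Hypothetical helper: times the given block and returns logs parsed per second.
def logs_per_second(lines)
  elapsed = Benchmark.realtime { yield lines }
  lines.size / elapsed
end

# Synthetic data: 10,000 logs alternating between ip4 and ip6 addresses.
lines = Array.new(10_000) do |i|
  ip = i.even? ? "10.0.#{i % 256}.1" : "1234::#{(i % 65_536).to_s(16)}"
  "\\webpage\\page#{i % 50} #{ip}"
end

# Stand-in for the parser: counts visits per path.
rate = logs_per_second(lines) do |logs|
  logs.each_with_object(Hash.new(0)) { |log, visits| visits[log.split(' ', 2).first] += 1 }
end
puts format('%.0f logs parsed/second', rate)
```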

App structure

A class diagram can be found here: https://tinyurl.com/tky2f74. Note that the dependencies on Constants are not shown.

Executables

  • wlparser - Starts the app

Classes

  • Parser - Holds the log information and changes the format
  • LogReader - Loads files then reads logs. Validates logs, ip addresses and paths
  • ipValidator - Validates ip addresses
  • PathValidator - Validates the path for the webpage
  • OptionHandler - Sets the options from the command line arguments given
  • Formatter - Formats information for text or display output
  • OutputProcessor - Assembles the information for output
  • WarningHandler - Handles the warnings found when parsing the logs

Modules

  • LogParser - Calls the methods in order
  • Constants - Contains default options and other constants used in the app
  • TestData - Contains the data used in the tests
  • ColorText - Adds color to text
  • Version - Gives the current version number

Logs

  • test_logs contains log files used in testing