Release 1.1.1 [2013-02-11 13:14]
- Update gemspec
- Extract the translator into its own class
- Add option to recalculate the ‘206 Partial Content’ issue on S3
  (see forums.aws.amazon.com/thread.jspa?threadID=54214 for more details)
Release 1.1.0 [2011-05-08]
- Switched to FileUtils for Ruby 1.9 compatibility
Synopsis
Download, merge and convert Amazon S3 bucket log files for a specified date or date range.
Ralf is an acronym for Retrieve Amazon Log Files. Ralf does the following things:
- Download S3 bucket log files produced by Amazon S3. Log files are downloaded once and cached locally.
- Merge those log files into a single log file per bucket (sorted on ascending timestamp).
- Convert the log file from Amazon Server Access Log Format to Apache Common Log Format.
Usage
    Usage: ./bin/ralf [options]

    Download and merge Amazon S3 bucket log files for a specified date range and
    output a Common Log File. Ralf is an acronym for Retrieve Amazon Log Files.

    Ralf downloads bucket log files to local cache directories, merges the Amazon Log
    Files and converts them to Common Log Format.

    Example: ./bin/ralf --range month --now yesterday --output-file '/var/log/amazon/:year/:month/:bucket.log'

    AWS credentials (Access Key Id and Secret Access Key) are required to access
    S3 buckets. For security reasons these credentials can only be specified in a
    configuration file (see --config-file) or through the environment using the
    AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.

    Log selection options:
        -l, --[no-]list                  List buckets that have logging enabled. Does not process log files.
        -b, --buckets x,y,z              Buckets for which to process log files. Defaults to all log-enabled buckets.
        -r, --range BEGIN[,END]          Date or date range to process. Defaults to 'today'.
        -t, --now TIME                   Date to use as base for range. Defaults to 'today'.

    You can use Chronic expressions for '--range' and '--now'. See http://chronic.rubyforge.org.

    Example: --range 'last week'
      All days of the previous week.
    Example: --range 'this week'
      Beginning of this week (Sunday) up to and including today.
    Example: --range '2010-01-01','2010-04-30'
      First four months of 2010.
    Example: --range 'this month' --now yesterday
      This will select log files from the beginning of yesterday's month up to and including yesterday.

    The --buckets, --range and --now options are optional. If unspecified, (incomplete)
    logging for today will be processed for all buckets (that have logging enabled).
    This is equivalent to specifying "--range 'today'" and "--now 'today'".

    Output options:
        -o, --output-file FORMAT         Output file, e.g. '/var/log/s3/:year/:month/:bucket.log'. Required.

    The --output-file format uses the last day of the range specified by --range
    to determine the filename. E.g. when the format contains ':year/:month/:day' and
    the range is 2010-01-15..2010-02-14, then the output file will be '2010/02/14'.

        -x, --cache-dir FORMAT           Directory name(s) in which to cache downloaded log files. Optional.

    The --cache-dir format expands to as many directory names as needed for the
    range specified by --range. E.g. "/var/run/s3_cache/:year/:month/:day/:bucket"
    expands to 31 directories for range 2010-01-01..2010-01-31.
    Defaults to '~/.ralf/:bucket' or '/var/log/ralf/:bucket' (when running as root).

    Config file options:
        -c, --config-file [FILE]         Path to file with configuration settings (in YAML format).

    Configuration settings are read from the configuration file specified with -c,
    or from ~/.ralf.conf, or from /etc/ralf.conf (when running as root).
    Command-line options override settings read from the configuration file.

    The configuration file must be in YAML format. Each command-line option has an
    equivalent setting in the configuration file, replacing dash (-) by underscore (_).
    The Amazon Access Key Id and Secret Access Key can only be specified in the
    configuration file or through the environment.

    Example:
      output_file:           /var/log/amazon_s3/:year:month/:bucket.log
      aws_access_key_id:     my_access_key_id
      aws_secret_access_key: my_secret_access_key

    To only use command-line options simply specify -c or --config-file without an argument.

    Debug options:
        -d, --[no-]debug [aws]           Show debug messages.

    Common options:
        -h, --help                       Show this message.
        -v, --version                    Show version.
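The placeholder expansion described for --output-file and --cache-dir can be sketched as follows. The helper `expand_format` and the bucket name `my-bucket` are hypothetical and only illustrate the documented behaviour; Ralf's own implementation may differ.

```ruby
require 'date'

# Hypothetical helper: replaces :year, :month, :day and :bucket in a format
# string with values taken from the given date and bucket name.
def expand_format(format, date, bucket)
  format.gsub(/:(year|month|day|bucket)/) do
    case Regexp.last_match(1)
    when 'year'   then date.strftime('%Y')
    when 'month'  then date.strftime('%m')
    when 'day'    then date.strftime('%d')
    when 'bucket' then bucket
    end
  end
end

# The output file name is derived from the last day of the range.
range = Date.new(2010, 1, 15)..Date.new(2010, 2, 14)
output_file = expand_format(':year/:month/:day', range.end, 'my-bucket')
# output_file is "2010/02/14"

# The cache directory format is expanded once per day in the range.
january = Date.new(2010, 1, 1)..Date.new(2010, 1, 31)
cache_dirs = january.map do |day|
  expand_format('/var/run/s3_cache/:year/:month/:day/:bucket', day, 'my-bucket')
end.uniq
# cache_dirs holds 31 directory names
```

A format without :day, such as '/var/run/s3_cache/:year/:month/:bucket', would collapse to a single directory for the same range because of the `uniq` call.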
Library
You can also use Ralf from within your own Ruby code. Each command-line option has a corresponding option in the options hash passed to Ralf.new and Ralf.run. Replace each dash (-) by an underscore (_) in the names:
    require 'rubygems'
    require 'ralf'

    options = { :output_file => '/var/log/s3/:bucket.log' }
    r = Ralf.new({ :config_file => '/Users/me/ralf.yaml' }.merge(options))
    r.run
Or run it in one go:
    Ralf.run({ :config_file => '/Users/me/ralf.yaml' }.merge(options))
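The naming rule (dash becomes underscore) and the documented precedence of explicitly given options over configuration-file settings can be sketched without Ralf itself. The helpers `option_key` and `merge_options` are hypothetical names introduced here for illustration; they are not part of Ralf's API.

```ruby
require 'yaml'

# Hypothetical helper: maps a long command-line flag to its options-hash key,
# e.g. '--output-file' becomes :output_file.
def option_key(flag)
  flag.sub(/\A--/, '').tr('-', '_').to_sym
end

# Hypothetical helper: reads settings from YAML text and lets explicitly
# given options override them, mirroring the documented precedence.
def merge_options(config_yaml, explicit_options)
  settings = YAML.safe_load(config_yaml) || {}
  file_options = settings.each_with_object({}) { |(key, value), h| h[key.to_sym] = value }
  file_options.merge(explicit_options)
end

config = <<~YAML
  range: today
  output_file: /var/log/s3/:bucket.log
YAML

merged = merge_options(config, option_key('--range') => 'yesterday')
# merged[:range] is "yesterday", merged[:output_file] comes from the YAML text
```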
Requirements
- Credentials for an Amazon S3 account
- Logging enabled on the S3 buckets. You can use Cyberduck to enable it, for example.
Gem dependencies
Ralf depends on the following gems, which will be installed automatically when you install the ralf gem:
- chronic
- right_aws
- logmerge
Authors
Authors: Leon Berenschot and K.J. Wierenga
Contributors: Victor Castell
This program is used to process Amazon S3 log files for kerkdienstgemist.nl.