http_archive
Interaction with HTTP Archives, loosely following the interface of the Archive::HAR Perl module. This Ruby library provides an API to HTTP Archive (.har) files. .har files are JSON-formatted files which contain data about browser interaction with a web page, such as load data, cookie data, load times and URLs of artifacts.
More on HTTP Archives:
Wikipedia
HAR Specification
Installation
$ gem install http_archive
to use in your script:
require 'http_archive'
Usage
http_archive encapsulates all parts of a .har-file as an 'archive' object and fields of that object. To start, you create an HttpArchive::Archive object with a String or File as parameter like so:
archive = HttpArchive::Archive.new(File.open('/myfile.har', 'r'))
The 'archive' object now holds data from the archive as attributes, namely the following fields which are objects themselves:
- A 'browser' object with 'name' and 'version' fields
- A 'creator' object with 'name' and 'version' fields
- A 'pages'-array with 'page' objects
- An 'entries'-array with 'entry' objects
A 'page' object holds data about the page interaction as a whole: When the interaction started, page title, overall load time and so forth. An 'entry' object holds information about every interaction made for that page, such as: duration, request and response data, duration, server information...
To interact with that data, you query the 'archive' object. Examples:
puts archive.browser.name # => "Firefox"
puts archive.creator.name # => "Firebug"
all_pages = archive.pages # normally just one
puts all_pages.first.on_load # => 6745, 6.7 seconds load time
The 'entry' objects hold data of the requests and responses as objects as well, with the header data as a Hash:
first_entry = archive.entries.first
puts first_entry.request.http_method # => "GET"
puts first_entry.request.headers['Connection'] # => "keep-alive"
# More interesting stuff is here:
puts first_entry.response.content # => {"mimeType"=>"text/html", "size"=>0}
puts first_entry.response.status # => 200
puts page.entry.time # => 54
For more info, see the docs for each class. To summarize the data, http_archive can print out a table representation like the one you are used to from Firebug et cetera:
archive.print_table
gives you this:
Metrics for: 'Software is hard', 26 Requests, 0.36MB downloaded. Load time: 6.745s
GET http://www.janodvarko.cz/ 302 Moved Temporarily 0.0 KB 0.054s
GET /index.php 301 Moved Permanently 0.0 KB 0.469s
GET http://www.janodvarko.cz/blog/ 200 OK 52.95 KB 2.593s
GET /l10n.js?ver=20101110 200 OK 0.31 KB 0.065s
GET /prototype.js?ver=1.6.1 200 OK 139.85 KB 1.06s
GET /wp-scriptaculous.js?ver=1.8.3 200 OK 2.94 KB 0.136s
GET /effects.js?ver=1.8.3 200 OK 38.47 KB 0.665s
GET /geshi.css 200 OK 1.03 KB 0.261s
GET /lightbox.css 200 OK 1.42 KB 0.269s
GET /lightbox.js 200 OK 23.84 KB 0.55s
GET /rss.gif 200 OK 0.62 KB 1.114s
GET /x.png 200 OK 1.37 KB 1.151s
GET /useincommandline.png 200 OK 21.55 KB 2.488s
GET /red-text.png 200 OK 18.97 KB 2.778s
GET /simple-log.png 200 OK 27.14 KB 3.135s
GET /start-button.png 200 OK 11.29 KB 2.29s
GET /wordpress.gif 200 OK 0.52 KB 2.316s
GET /creativebits.gif 200 OK 0.34 KB 2.323s
GET /urchin.js 200 OK 22.68 KB 1.476s
GET /style.css 200 OK 9.24 KB 0.43s
GET /quote.gif 200 OK 1.62 KB 1.716s
GET /sidebar_top.gif 200 OK 0.11 KB 1.767s
GET /sidebar_bottom.gif 200 OK 0.11 KB 1.797s
GET /&utmac=UA-3586722-1&utmcc=__u 200 OK 0.04 KB 1.318s
GET /loading.gif 200 OK 2.77 KB 0.071s
GET /closelabel.gif 200 OK 0.98 KB 0.089s
Contributing
- Fork it
- Create your feature branch (
git checkout -b my-new-feature
) - Commit your changes (
git commit -am 'Add some feature'
) - Push to the branch (
git push origin my-new-feature
) - Create new Pull Request