Spider bot¶ ↑
a non-threaded spider bot that spiders a site with response time stats. easily extendable
Installation¶ ↑
sudo gem install ssoroka-spider_bot
Usage ¶ ↑
Usable in code or from terminal:
$ spider_bot http://www.example.com $ spider_bot http://0.0.0.0:3000
Code example of a custom spider script in script/spider:
#!/usr/bin/env ruby require 'rubygems' require 'spider_bot' class MySpider < SpiderBot # override these for handling events def on_page(page) end def on_404(link) end def on_500(link) end # override these for changing how urls are classified as links def off_site?(url) url !~ /^\// # urls not starting with a / end def ignorable?(url) url =~ /\/.*\..+/ && # files with extensions url !~ /\.html$/ # but not html files end end spider = MySpider.new(:quiet => false) spider.start(ARGV[1])
Gem Requirements¶ ↑
-
ssoroka-ansi
-
mechanize