Project

redback

0.0
No commit activity in last 3 years
No release in over 3 years
Fetches a URL you give it and recursively searches for all URLs it can find, building up a list of unique URLs on the same hostname.
Dependencies

Runtime

>= 0.8.6
>= 0.6.4
Project Readme

redback

Redback is a Ruby spider (geddit?). Pass it a website, and it will begin its many-legged crawl, scurrying across the site to pull out all the unique URLs it can find.
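
To make "unique URLs on the same site" concrete, here is a minimal sketch of how links on one page could be filtered. It is not redback's actual code: same_host_links is a hypothetical helper, the regex-based href extraction is a simplification, and only Ruby's standard library is assumed.

```ruby
require 'uri'
require 'set'

# Hypothetical helper, not part of redback's API: returns the unique
# absolute URLs found in `html` that share a hostname with `base_url`.
def same_host_links(base_url, html)
  base_host = URI.parse(base_url).host
  links = Set.new

  # Simplified extraction: grab the value of every quoted href attribute.
  html.scan(/href\s*=\s*["']([^"']+)["']/i) do |(href)|
    begin
      uri = URI.join(base_url, href) # resolve relative links against the page URL
    rescue URI::Error
      next # skip hrefs that are not valid URIs
    end

    # Keep only http(s) links on the same hostname.
    next unless uri.is_a?(URI::HTTP) && uri.host == base_host

    uri.fragment = nil # "/about" and "/about#team" are the same page
    links << uri.to_s
  end

  links.to_a
end
```

Using a Set means a link that appears several times on the page is only reported once.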

Just like a terrifying real-life spider, redback aims to be fast: in particular, it sends requests in parallel so one slow page won't slow down your crawl.
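
The idea of crawling in parallel can be sketched with a small pool of worker threads sharing a queue. This illustrates the general technique, not redback's implementation: parallel_crawl, the fetch_links callable and the workers option are all hypothetical names, and the page fetching itself is left to the caller.

```ruby
require 'set'

# Hypothetical sketch, not redback's implementation. `fetch_links` is any
# callable that takes a URL and returns the URLs linked from that page.
def parallel_crawl(start_url, fetch_links, workers: 4)
  seen    = Set.new([start_url]) # every URL discovered so far
  lock    = Mutex.new            # guards `seen` and `pending`
  queue   = Queue.new            # URLs waiting to be fetched
  pending = 1                    # URLs queued or currently being fetched
  queue << start_url

  threads = Array.new(workers) do
    Thread.new do
      # Queue#pop returns nil once the queue is closed and empty.
      while (url = queue.pop)
        links = begin
          fetch_links.call(url)
        rescue StandardError
          [] # a failed page contributes no links
        end

        lock.synchronize do
          links.each do |link|
            next unless seen.add?(link) # add? returns nil for duplicates
            pending += 1
            queue << link
          end
          pending -= 1
          queue.close if pending.zero? # nothing queued, nothing in flight
        end
      end
    end
  end

  threads.each(&:join)
  seen.to_a
end
```

Because each worker pulls its next URL as soon as it finishes the previous one, a single slow page ties up one thread while the others keep going.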

Installation

$ gem install redback

Usage

Command line

$ redback http://example.com/

This prints all the URLs it finds within the site http://example.com/.

You can output the results to a file like this:

$ redback http://example.com > output.txt

Or feed them to another command line tool like this:

$ redback http://xkcd.com | grep xml

Within Ruby

It can also be used as a library:

require 'redback'

Redback.new("http://example.com") { |url| puts url }

The Redback.new method accepts a URL and a block; the block will be executed for each URL found.