# ruby-web-crawler
A simple web crawler that recursively follows links and collects all child URLs. It also handles URL forwarding via HTTP redirects.
### Installation

```sh
gem install ruby-web-crawler
```
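If you manage dependencies with Bundler, the gem can also be declared in a Gemfile (a minimal sketch, assuming the gem is published under the same name used with `gem install`):

```ruby
# Gemfile — assumes the published gem name matches the install command above
gem 'ruby-web-crawler'
```

Then run `bundle install`.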
### Using the Gem

```ruby
crawler = RubyWebCrawler.new 'http://example.com'
urls = crawler.start_crawl # Returns an Array of all URLs
```
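For context, a minimal end-to-end sketch of working with the result (the `require` name is an assumption based on the gem name; adjust it if your setup differs):

```ruby
# Assumption: the gem is required under its published name.
require 'ruby-web-crawler'

crawler = RubyWebCrawler.new 'http://example.com'
urls = crawler.start_crawl

# `urls` is a plain Array of URL strings, so normal Enumerable methods apply.
puts "Found #{urls.length} URLs"
urls.each { |url| puts url }
```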
### Configuration
Change the number of URLs the crawler will visit before it stops:
```ruby
crawler = RubyWebCrawler.new 'http://example.com', 100
```

or

```ruby
crawler.url_limit = 100 # Default 50 (applicable before calling `start_crawl`)
```
Change the crawl's execution time limit, in seconds:
```ruby
crawler = RubyWebCrawler.new 'http://example.com', 100, 120
```

or

```ruby
crawler.time_limit = 120 # Default 60 (applicable before calling `start_crawl`)
```
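Putting both limits together, a hedged sketch of a fully configured crawl (the `require` name is assumed; the limit values are illustrative):

```ruby
require 'ruby-web-crawler' # assumed require name

# Start from a seed URL; both limits can also be passed as positional
# arguments: RubyWebCrawler.new 'http://example.com', 100, 120
crawler = RubyWebCrawler.new 'http://example.com'

# Both setters must be called before `start_crawl`.
crawler.url_limit  = 100 # stop after visiting 100 URLs (default 50)
crawler.time_limit = 120 # stop after 120 seconds (default 60)

urls = crawler.start_crawl
puts urls
```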