Cosmicrawler
Cosmicrawler is crawler library for Ruby. It provides scalable asynchronous crawling by http, file, etc using EventMachine.
Installation
Add this line to your application's Gemfile:
gem 'cosmicrawler'
And then execute:
$ bundle
Or install it yourself as:
$ gem install cosmicrawler
Usage
http
require 'cosmicrawler'
Cosmicrawler.http_crawl(%w(http://example.com/1 http://example.com/2)) {|request|
get = request.get
puts get.response if get.response_header.status == 200
}
require 'cosmicrawler'
require 'em-http-request'
Cosmicrawler.each(%w(http://example.com/1 http://example.com/2)) {|item|
request = EM::HttpRequest.new(item)
get = request.get
puts get.response if get.response_header.status == 200
}
Contributing
- Fork it
- Create your feature branch (
git checkout -b my-new-feature
) - Commit your changes (
git commit -am 'Add some feature'
) - Push to the branch (
git push origin my-new-feature
) - Create new Pull Request
License
Ruby's License