SiteMapper

Map all links on a given site.
SiteMapper will try to respect /robots.txt.
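
As a rough illustration of what respecting robots.txt involves, here is a minimal, hypothetical sketch (not SiteMapper's actual implementation, which lives in lib/robots.rb): fetch /robots.txt and check a path against its Disallow rules, ignoring user-agent groups and wildcards for brevity.

require 'net/http'
require 'uri'

# Hypothetical sketch: fetch /robots.txt and check whether a path
# is disallowed. Ignores user-agent groups and wildcard patterns.
def path_allowed?(site, path)
  body = Net::HTTP.get(URI.join(site, '/robots.txt'))
  disallowed = body.each_line
                   .map { |line| line[/\ADisallow:\s*(\S+)/i, 1] }
                   .compact
  disallowed.none? { |prefix| path.start_with?(prefix) }
end

puts path_allowed?('https://example.com', '/some/page')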

Works great with Wayback Archiver, a gem that crawls your site and submits each URL to the Internet Archive (Wayback Machine).

Installation

Install the gem:

gem install site_mapper
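
Alternatively, if you use Bundler (standard Bundler usage, not specific to this gem), add it to your Gemfile and run bundle install:

# Gemfile
gem 'site_mapper'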

Usage

Command line usage:

# Crawl all links found on pages
# under the example.com domain
site_mapper example.com

Ruby usage:

# Crawl all links found on pages
# under the example.com domain
require 'site_mapper'
SiteMapper.map('example.com') do |new_url|
  puts "New URL found: #{new_url}"
end
# Log to STDOUT
SiteMapper.map('example.com', logger: :system) do |new_url|
  puts "New URL found: #{new_url}"
end
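
Since the block is invoked once per discovered URL (as in the examples above), you could, for instance, collect every mapped URL into an array for later processing:

require 'site_mapper'

# Collect every discovered URL instead of printing it
urls = []
SiteMapper.map('example.com') { |new_url| urls << new_url }
puts "Mapped #{urls.length} URLs"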

Docs

You can find the docs online on RubyDoc.

This gem is documented using yard (run the command below from the root of this repository).

yard # Generates documentation to doc/

Contributing

Contributions, feedback and suggestions are very welcome.

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Notes

  • Special thanks to the robots gem, which provided the bulk of the code in lib/robots.rb

Alternatives

There are a couple of great alternatives that are more mature and have more features than this gem. Please feel free to check them out:

License

MIT License