No release in over 3 years
Low commit activity in last 3 years
There are a lot of open issues
Ruby Gem to parse sitemaps.org compliant sitemaps.
 Dependencies

Development

Runtime

~> 1.6
>= 0.6, < 2.0
 Project Readme

Sitemap Parser

Ruby Gem to parse sitemaps.org compliant sitemaps


Usage

Create a new instance of the Parser:

sitemap = SitemapParser.new "http://ben.balter.com/sitemap.xml"

Extract the URLs of the sitemap

sitemap.urls # => Array of Nokogiri::XML::Node objects
sitemap.to_a # => Array of url strings
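
To make the array of URL strings more concrete, here is a minimal offline sketch of pulling the `<loc>` values out of a sitemaps.org `urlset`. It uses REXML from Ruby's standard distribution rather than this gem, so it is not how SitemapParser is implemented. The sample XML and the `extract_locs` helper are assumptions made up for illustration.

```ruby
require 'rexml/document'

# Sample sitemaps.org document (contents are made up for illustration).
sample_sitemap = <<~XML
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url><loc>http://example.com/</loc></url>
    <url><loc>http://example.com/about/</loc></url>
  </urlset>
XML

# Hypothetical helper: returns the text of every <loc> found inside a <url> entry.
def extract_locs(xml)
  doc = REXML::Document.new(xml)
  locs = []
  doc.root.each_element do |entry|
    next unless entry.name == 'url'
    entry.each_element do |child|
      locs << child.text.strip if child.name == 'loc'
    end
  end
  locs
end

puts extract_locs(sample_sitemap)
```

The result is a plain array of strings, which is roughly the shape `to_a` is described as returning above.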

Options

Recurse nested sitemaps

sitemap = SitemapParser.new('http://ben.balter.com/sitemap.xml', {recurse: true})
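
The idea behind recursion can be sketched without any network access. In the sketch below, a hash stands in for HTTP fetches, and each entry is either a sitemap index (listing child sitemaps) or a regular sitemap (listing pages). The `fake_web` data and the `collect_urls` helper are hypothetical. The sketch also assumes that an index yields no page URLs unless recursion is enabled; the gem's own behavior, including how deep it follows nested indexes, may differ.

```ruby
# In-memory stand-in for fetched and parsed sitemaps (URLs are made up).
fake_web = {
  'http://example.com/sitemap.xml' => {
    index: true,
    locs: ['http://example.com/posts.xml', 'http://example.com/pages.xml']
  },
  'http://example.com/posts.xml' => { index: false, locs: ['http://example.com/2014/hello/'] },
  'http://example.com/pages.xml' => { index: false, locs: ['http://example.com/about/'] }
}

# Hypothetical helper: returns page URLs, following index entries only when recurse is true.
def collect_urls(web, location, recurse: false)
  doc = web.fetch(location)
  return doc[:locs] unless doc[:index] # regular sitemap: its locs are pages
  return [] unless recurse             # index without recursion: nothing to report
  doc[:locs].flat_map { |child| collect_urls(web, child, recurse: true) }
end

p collect_urls(fake_web, 'http://example.com/sitemap.xml')
# => []
p collect_urls(fake_web, 'http://example.com/sitemap.xml', recurse: true)
# => ["http://example.com/2014/hello/", "http://example.com/about/"]
```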

Or, if you want to extract only the sitemap URLs matching a given pattern, you can provide a regex that will be used to match each page.

sitemap = SitemapParser.new('http://ben.balter.com/sitemap.xml', {recurse: true, url_regex: /sitemapregex/})
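
Pattern-based filtering amounts to keeping only the locations that match the regex. The snippet below is a rough sketch of that idea on plain strings. The `filter_locs` helper and the example URLs are hypothetical, and exactly which entries the gem applies `url_regex` to is not spelled out here.

```ruby
# Hypothetical helper: keeps only locations matching the pattern; no pattern means no filtering.
def filter_locs(locs, url_regex = nil)
  return locs if url_regex.nil?
  locs.select { |loc| loc.match?(url_regex) }
end

locs = [
  'http://example.com/blog/one/',
  'http://example.com/about/',
  'http://example.com/blog/two/'
]

p filter_locs(locs, %r{/blog/})
# => ["http://example.com/blog/one/", "http://example.com/blog/two/"]
```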

Typhoeus Options

sitemap = SitemapParser.new('http://ben.balter.com/sitemap.xml', { userpwd: "username:password" })