Category: HTML parsing - The Ruby Toolbox

96%

73%

2024-09-10

7,602

ox (ohler55/ox)

0.88

A long-lived project that still receives updates

A fast XML parser and object serializer that uses only the standard C library. Optimized XML (Ox), as the name implies, was written to provide speed-optimized XML handling. It was designed as an alternative to Nokogiri and other Ruby XML parsers for generic XML parsing, and as an alternative to Marshal for object serialization.

Popularity

26,115,001

905

Releases

2.14.18

153

2011-06-30

2024-03-21

Activity

95%

84%

2023-07-01

170

oga

0.83

No release in over a year

Oga is an XML/HTML parser written in Ruby.

Popularity

20,000,005

Releases

3.4

2014-09-11

2022-08-02

Activity

119

libxml-ruby (xml4r/libxml-ruby)

0.65

A long-lived project that still receives updates

The Libxml-Ruby project provides Ruby language bindings for the GNOME Libxml2 XML toolkit. It is free software, released under the MIT License. Libxml-Ruby's primary advantage over REXML is performance: if speed is what you need, it is a good library to consider.

Popularity

25,368,298

129

Releases

5.0.3

126

2006-02-23

2024-03-11

Activity

99%

81%

2023-11-04

274

hpricot

0.58

No release in over 3 years

A swift, liberal HTML parser with a fantastic library.

Popularity

14,066,128

Releases

0.8.6

2006-08-11

2012-01-17

Activity

674

nikkou (tombenner/nikkou)

0.03

No commit activity in last 3 years

No release in over 3 years

Extract useful data from HTML and XML with ease!

Popularity

362,486

Releases

0.0.5

2013-04-23

2016-08-27

Activity

50%

50%

2014-05-07

rubyful_soup

0.0

No release in over 3 years

Rubyful Soup is a *ML parser that makes screen-scraping easy. It won't choke on bad markup, and it's easy to locate the part of a document you want.

Popularity

20,588

Releases

1.0.4

2005-10-21

2006-03-01

Activity

scrubyt

0.0

No release in over 3 years

scRUBYt! is a web scraping framework that is easy to learn and use, yet powerful and effective. Its most interesting part is a web-scraping DSL built on Hpricot and WWW::Mechanize, which lets you navigate to the page of interest, then extract and query data records with a few lines of code. It is hard to describe scRUBYt! in a few sentences; you have to see it for yourself!

Popularity

47,659

Releases

0.4.06

2007-01-14

2008-12-09

Activity