TagRemover

TagRemover lets you remove all elements with the specified tags from extremely large XML documents without parsing the document or loading all of it into memory. This is useful for processing unreasonably large documents without making your server fall over.

Installation

Add this line to your application's Gemfile:

gem 'tag_remover'

And then execute:

$ bundle

Or install it yourself as:

$ gem install tag_remover

Usage

The following line reads XML from input_stream and writes it to output_stream with all div and img elements removed.

TagRemover.process input_stream, output_stream, remove_tags: ['div', 'img']

Options include:

  • remove_tags: List of tags to remove from the XML file.
  • close_streams: (true|false) If set, TagRemover will close input_stream and output_stream once processing is finished.
  • [NOT IMPLEMENTED] format: (true|false) If set, then the contents of output_stream will be formatted.

TagRemover can be used from the command line with the rmtags command. The following is an example that reads input.xml and writes the output to output.xml, removing all div and img elements:

$ rmtags input.xml output.xml div img

Limitations

TagRemover currently works correctly only if the XML is formatted with at most one tag per line.
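
To show why the one-tag-per-line restriction matters, here is a minimal sketch of line-by-line tag removal. It is not TagRemover's actual implementation; the helper name remove_tags_by_line and its regular expressions are assumptions made for illustration. It reads one line at a time, so memory use stays constant, but it can only recognise a tag that sits alone on its line.

```ruby
require 'stringio'

# Hypothetical helper, NOT part of the tag_remover gem.
# Copies `input` to `output` line by line, dropping every element whose tag
# name is listed in `tags`. Assumes at most one tag per line.
def remove_tags_by_line(input, output, tags)
  names        = tags.map { |t| Regexp.escape(t) }.join('|')
  self_closing = /\A\s*<(?:#{names})(?:\s[^>]*)?\/>\s*\z/
  open_tag     = /\A\s*<(?:#{names})(?:\s[^>]*)?>\s*\z/
  close_tag    = /\A\s*<\/(?:#{names})\s*>\s*\z/

  depth = 0 # how many removed elements we are currently inside
  input.each_line do |line|
    if line.match?(self_closing)
      next                      # e.g. <img src="a.png"/>: drop the line
    elsif line.match?(open_tag)
      depth += 1                # entering a removed element
    elsif line.match?(close_tag)
      depth -= 1 if depth > 0   # leaving a removed element
    elsif depth.zero?
      output.write(line)        # outside any removed element: keep the line
    end
  end
  output
end

xml = <<~XML
  <root>
  <div class="ad">
  <p>
  dropped
  </p>
  </div>
  <img src="a.png"/>
  <p>
  kept
  </p>
  </root>
XML

out = StringIO.new
remove_tags_by_line(StringIO.new(xml), out, %w[div img])
puts out.string
```

In this sketch only the root element and the second p element survive. A line such as <div><p>text</p></div> would be copied through unchanged, because the opening and closing tags never appear alone on a line.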

Contributing

  1. Fork it ( https://github.com/[my-github-username]/tag_remover/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request