Ruby gem for surgical changes in HTML fragment strings, using Nokogiri, with optional audit trail in html attributes

HtmlSurgeon

Make specific changes in an HTML string, optionally adding HTML attributes that record an audit trail of the changes. Uses Nokogiri.

Basic Usage

First, create an HtmlSurgeon service instance for the given HTML fragment:

GIVEN_HTML = <<-HTML
<div>
    <h1>Something</h1>
    <div id="1" class="lol to-be-changed">1</div>
    <span>Other</span>
    <div id="2" class="another to-be-changed">
        <ul>
            <li>1</li>
            <li>2</li>
        </ul>
    </div>
</div>
HTML

surgeon = HtmlSurgeon.for(GIVEN_HTML) 

If you want audit attributes added to the HTML tags that are changed, pass the audit option when creating the surgeon service:

surgeon = HtmlSurgeon.for(GIVEN_HTML, audit: true)

With the surgeon service, you can prepare several change sets. A change set is defined by a node set and a list of changes to be applied to each selected node.

change_set = surgeon.css('div.to-be-changed') # => will return a change_set

change_set.node_set # => returns a Nokogiri NodeSet with the selected nodes (here, the divs with ID 1 and ID 2)

# prepare a change that replaces the tag name 'div' with 'article'
change_set.replace_tag_name('article') 
    
# to prepare another change to add a css class in the selected nodes
change_set.add_css_class('added-class')

# we can add a second one
change_set.add_css_class('another-added-class') 

The changes have not been applied yet. To apply them, call run on the change set:

change_set.run

surgeon.html # => html with the changes applied
# =>
# <div>
#     <h1>Something</h1>
#     <article id="1" class="lol to-be-changed added-class another-added-class">1</article>
#     <span>Other</span>
#     <article id="2" class="another to-be-changed added-class another-added-class">
#         <ul>
#             <li>1</li>
#             <li>2</li>
#         </ul>
#     </article>
# </div>


# the original HTML is still available
surgeon.given_html == GIVEN_HTML # => true

# you can review what was changed in the change set
change_set.changes
# =>
# [
#   "replace tag name with article",
#   "add css class added-class",
#   "add css class another-added-class",
# ]
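
The deferred behaviour described above (changes are queued, then applied on run) can be illustrated with a small stand-alone sketch. This is not the gem's implementation: SketchChangeSet is a hypothetical class, and plain hashes stand in for Nokogiri nodes. It only shows the idea of queueing described changes and applying them later.

```ruby
# Hypothetical stand-in for a change set; NOT the gem's implementation.
class SketchChangeSet
  attr_reader :changes

  def initialize(nodes)
    @nodes = nodes   # plain hashes standing in for Nokogiri nodes
    @steps = []      # queued blocks, applied only when #run is called
    @changes = []    # human-readable descriptions of the queued changes
  end

  def replace_tag_name(name)
    queue("replace tag name with #{name}") { |node| node[:name] = name }
  end

  def add_css_class(css_class)
    queue("add css class #{css_class}") do |node|
      node[:classes] << css_class unless node[:classes].include?(css_class)
    end
  end

  # Apply every queued step to every node, in the order they were queued.
  def run
    @nodes.each { |node| @steps.each { |step| step.call(node) } }
    self
  end

  private

  # Record the description and the step; return self so calls can be chained.
  def queue(description, &step)
    @changes << description
    @steps << step
    self
  end
end

nodes = [{ name: 'div', classes: %w[lol to-be-changed] }]
change_set = SketchChangeSet.new(nodes)
change_set.replace_tag_name('article').add_css_class('added-class')

name_before_run = nodes.first[:name] # nothing has been applied yet
change_set.run
name_after_run = nodes.first[:name]
```

Returning self from each preparing method is what makes chained calls possible in this sketch.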

You can also review which nodes were changed, or how many:

change_set.run

change_set.changed_nodes # => array with nodes changed (without the skipped nodes)
change_set.changed_nodes_size # => same as change_set.changed_nodes.size

We can also chain the calls that prepare changes on a change set:

surgeon = HtmlSurgeon.for(GIVEN_HTML)
surgeon.css('.lol').replace_tag_name('span').add_css_class('hey').run
surgeon.html # =>
# <div>
#     <h1>Something</h1>
#     <span id="1" class="lol to-be-changed hey">1</span>
#     <span>Other</span>
#     <div id="2" class="another to-be-changed">
#         <ul>
#             <li>1</li>
#             <li>2</li>
#         </ul>
#     </div>
# </div>

If audit is enabled, the changes applied to an element are stored on that element in a data attribute. Its value is a JSON array with all the changes.

surgeon = HtmlSurgeon.for(GIVEN_HTML, audit: true)
surgeon.css('.lol').replace_tag_name('span').add_css_class('hey').run
surgeon.html # =>
# <div>
#     <h1>Something</h1>
#     <span id="1" class="lol to-be-changed hey" data-surgeon-audit='[{"change_set":"830e96dc-fa07-40ce-8968-ea5c55ec4b84","changed_at":"2015-07-02T12:52:43.874Z","type":"replace_tag_name","old":"div","new":"span"},{"change_set":"830e96dc-fa07-40ce-8968-ea5c55ec4b84","changed_at":"2015-07-02T12:52:43.874Z","type":"add_css_class","class":"hey"}]'>1</span>
#     <span>Other</span>
#     <div id="2" class="another to-be-changed">
#         <ul>
#             <li>1</li>
#             <li>2</li>
#         </ul>
#     </div>
# </div>

The attribute's value (formatted) is:

[
  {
    "change_set":"830e96dc-fa07-40ce-8968-ea5c55ec4b84",
    "changed_at":"2015-07-02T12:52:43.874Z",
    "type":"replace_tag_name",
    "old":"div",
    "new":"span"
  },
  {
    "change_set":"830e96dc-fa07-40ce-8968-ea5c55ec4b84",
    "changed_at":"2015-07-02T12:52:43.874Z",
    "type":"add_css_class",
    "class":"hey"
  }
]

Each entry has a change_set key with the ID of the change set and a changed_at key with the moment the change was applied; the remaining keys describe the change itself.
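
Since the audit value is plain JSON, it can be inspected with Ruby's standard library. The following is a minimal sketch that reads the example value shown above; the variable names are illustrative, and only the json and time libraries are used.

```ruby
require 'json'
require 'time'

# Value of the data-surgeon-audit attribute, copied from the example above.
audit_json = <<~JSON
  [
    {
      "change_set":"830e96dc-fa07-40ce-8968-ea5c55ec4b84",
      "changed_at":"2015-07-02T12:52:43.874Z",
      "type":"replace_tag_name",
      "old":"div",
      "new":"span"
    },
    {
      "change_set":"830e96dc-fa07-40ce-8968-ea5c55ec4b84",
      "changed_at":"2015-07-02T12:52:43.874Z",
      "type":"add_css_class",
      "class":"hey"
    }
  ]
JSON

entries = JSON.parse(audit_json)

# Keys shared by every entry; everything else describes the change itself.
meta_keys = %w[change_set changed_at type]

entries.each do |entry|
  details = entry.reject { |key, _value| meta_keys.include?(key) }
  changed_at = Time.iso8601(entry['changed_at'])
  puts "#{entry['type']} (#{changed_at.year}): #{details.keys.join(', ')}"
end

types = entries.map { |entry| entry['type'] }
change_set_ids = entries.map { |entry| entry['change_set'] }.uniq
```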

Selecting the Node Set

Nodes are selected with Nokogiri's CSS and XPath searches.

Using CSS

change_set = surgeon.css('div.to-be-changed')

Using XPath

change_set = surgeon.xpath("span") # note that we use Nokogiri's HTML Fragment and the use of self is special.

Refining the selection

We can skip some nodes by adding callbacks to the change set with the select and reject methods.

change_set = surgeon.css('.to-be-changed')
change_set.reject { |node| node.name == 'div' }.select { |node| node.get_attribute('class').to_s.split(' ').include? 'yeah-do-it' }
change_set.run # => a node is skipped if the reject callback returns a truthy value or if the select callback returns a falsey value
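
The skip rule can be sketched in plain Ruby. SketchNode is a hypothetical stand-in for a Nokogiri node, and the way several callbacks combine (any reject truthy, or any select falsey) is an assumption made for this sketch, not something the README states.

```ruby
# Hypothetical node stand-in; the gem passes Nokogiri nodes to the callbacks.
SketchNode = Struct.new(:name, :css_classes)

nodes = [
  SketchNode.new('div',  %w[to-be-changed yeah-do-it]),
  SketchNode.new('span', %w[to-be-changed yeah-do-it]),
  SketchNode.new('span', %w[to-be-changed])
]

reject_callbacks = [->(node) { node.name == 'div' }]
select_callbacks = [->(node) { node.css_classes.include?('yeah-do-it') }]

# Assumption: a node is skipped when any reject callback is truthy
# or when any select callback is falsey.
kept = nodes.reject do |node|
  reject_callbacks.any? { |callback| callback.call(node) } ||
    !select_callbacks.all? { |callback| callback.call(node) }
end
```

Here only the second node survives: the first is rejected for being a div, and the third fails the select callback.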

Available Changes

Replace Tag Name

surgeon.css('div.to-be-changed').replace_tag_name('article')

Add CSS Class

surgeon.css('div.to-be-changed').add_css_class('applied-some-stuff')

Remove Attribute

surgeon.css('div.to-be-changed').remove_attribute('style')

Rollback

The surgeon can be used to revert any audited change. We can select which changes to roll back based on:

  • change_set: The change_set UUID
  • changed_at: The change timestamp
  • changed_from: All changes whose timestamp is more recent than the given time

We can also revert all audited changes.

surgeon = HtmlSurgeon.for(GIVEN_HTML) 

surgeon.rollback # => Integer with number of changes reverted
surgeon.html # => returns the html with all audited changes reverted

surgeon.rollback(change_set: uuid) # => Integer with number of changes reverted
surgeon.html # => returns the html with only the given change set reverted

surgeon.rollback(changed_at: changed_at) # => Integer with number of changes reverted
surgeon.html # => returns the html with only the changes made at the given timestamp reverted

surgeon.rollback(changed_from: changed_from) # => Integer with number of changes reverted
surgeon.html # => returns the html with any change sets with a timestamp more recent than `changed_from` reverted 
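
How the three criteria could pick audit entries is sketched below in plain Ruby. The entries, the change set IDs and the entries_to_revert helper are hypothetical. Treating "more recent" as strictly later, and combining several criteria with AND, are assumptions of this sketch rather than documented behaviour of the gem.

```ruby
require 'time'

# Hypothetical audit entries in the shape of the data-surgeon-audit JSON.
entries = [
  { 'change_set' => 'set-a', 'changed_at' => '2015-07-02T12:52:43.874Z', 'type' => 'replace_tag_name' },
  { 'change_set' => 'set-b', 'changed_at' => '2015-07-03T09:00:00.000Z', 'type' => 'add_css_class' },
  { 'change_set' => 'set-c', 'changed_at' => '2015-07-04T09:00:00.000Z', 'type' => 'add_css_class' }
]

# Hypothetical helper: returns the entries matching every criterion given.
# With no criteria, every audited entry is selected.
def entries_to_revert(entries, change_set: nil, changed_at: nil, changed_from: nil)
  entries.select do |entry|
    at = Time.iso8601(entry['changed_at'])
    next false if change_set && entry['change_set'] != change_set
    next false if changed_at && at != changed_at
    next false if changed_from && at <= changed_from # assumption: strictly later
    true
  end
end

all_entries = entries_to_revert(entries)
by_set      = entries_to_revert(entries, change_set: 'set-b')
by_time     = entries_to_revert(entries, changed_at: Time.iso8601('2015-07-02T12:52:43.874Z'))
recent      = entries_to_revert(entries, changed_from: Time.iso8601('2015-07-03T00:00:00Z'))
```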

Clear Audit trail

We can clear the whole audit trail from the given html with the clear_audit method.

surgeon = HtmlSurgeon.for(GIVEN_HTML)
surgeon.clear_audit # => returns an Integer with the number of changes performed
surgeon.html # => returns the html with all audit html attributes removed

Helper Methods

HtmlSurgeon.node_has_css_class?(nokogiri_node, css_class)

It returns true if the given Nokogiri node has that CSS class.
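
The check amounts to looking for a whole token in the class attribute. A minimal sketch of that idea, operating on the attribute string rather than on a Nokogiri node, might look like this; class_attribute_includes? is a hypothetical name, not part of the gem.

```ruby
# Hypothetical helper: works on the raw class attribute value (or nil),
# mirroring the split-and-include? idiom used in the select example earlier.
def class_attribute_includes?(class_attribute, css_class)
  class_attribute.to_s.split.include?(css_class)
end

has_lol     = class_attribute_includes?('lol to-be-changed', 'lol')
has_partial = class_attribute_includes?('lol to-be-changed', 'to-be')
has_on_nil  = class_attribute_includes?(nil, 'lol')
```

Splitting on whitespace means a partial match such as to-be does not count as having the class to-be-changed, and a missing attribute is treated as having no classes.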

Installation

Add this line to your application's Gemfile:

gem 'html_surgeon'

And then execute:

$ bundle

Or install it yourself as:

$ gem install html_surgeon

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake rspec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/eturino/html_surgeon.

CHANGESET

v0.7.0

  • added remove_attribute change.
  • changes will be skipped if not needed on that node (will not do anything, nor be added to the audit).

v0.6.0

  • WARNING: BREAKING API CHANGE: now rollback and clear_audit return the number of changes performed

v0.5.2

  • added changed_nodes and changed_nodes_size to Change Set

v0.5.1

  • works with nil html: calls to_s on the given html on initialization.

v0.5.0

  • added node_has_css_class? helper method to HtmlSurgeon
  • added clear_audit to surgeon

v0.4.0

  • added select and reject callbacks to Change Set, based on blocks with node as single argument

v0.3.0

  • added fluent ChangeSet ID setter
  • added change_set xpath support

v0.2.0

  • added rollback support