No commit activity in last 3 years
No release in over 3 years
PertinentParser helps you compose HTML tags across existing tag boundaries.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
 Dependencies

Runtime

= 0.8.6
 Project Readme

PertinentParser¶ ↑

PertinentParser is a library for text transformation combination. It is of particular use when dealing with HTML. For the time being, it is incomplete.

Basically, PertinentParser describes a set of rules for the composition of text transformations.

Usage¶ ↑

PertinentParser applies HTML markup to text. It will break tags across existing tag boundaries if you use ‘wrap_in`.

require "pertinent_parser"
h = PertinentParser::html("A <i>sentence with</i> some markup.")
#=> "A sentence with some markup."
h.apply
#=> "A <i>sentence with</i> some markup."
h.wrap_in("<b>", "with some")
h.apply
#=> "A <i>sentence <b>with</b></i><b> some</b> markup."

Breaking existing tags across new boundaries using ‘wrap_out`.

h = PertinentParser::html("A <i>sentence with</i> some markup.")
h.wrap_out("<b>", "with some")
h.apply
#=> "A <i>sentence </i><b><i>with</i> some</b> markup."

Using the replace operator.

h = PertinentParser::html("A <i>sentence with</i> some markup.")
h.replace("haha totally", "with some")
h.apply
#=> "A <i>sentence haha</i> totally markup."

Combining with TactfulTokenizer to markup sentences in HTML¶ ↑

Say you were to generate from HTML using Markdown, and then you want to add annotations denoting the occurences of sentences.

require "pertinent_parser"
require "tactful_tokenizer"

m = TactfulTokenizer::Model.new; nil
h = PertinentParser::html("Here in the U.S. Senate we prefer to eat our friends. Is it easier that way? <i>Yes.</i> <i>Maybe</i>!")
# TactfulTokenizer plays best with text without HTML, which PertinentParser conveniently handles.
sentences = m.tokenize_text(h)
#=> ["Here in the U.S. Senate we prefer to eat our friends.", "Is it easier that way?", "Yes.", "Maybe!"]
sentences.each_with_index do |sentence, i|
    h.wrap_out("<div class='sentence' id='#{i+1}'>", sentence)
end
h.apply
#=> "<div class='sentence' id='1'>Here in the U.S. Senate we prefer to eat our friends.</div> <div class='sentence' id='2'>Is it easier that way?</div> <div class='sentence' id='3'><i>Yes.</i></div> <div class='sentence' id='4'><i>Maybe</i>!</div>"

Installation¶ ↑

gem install pertinent_parser

Author¶ ↑

Copyright © 2014 Matthew Bunday. All rights reserved. Released under the MIT/X11 license.