Project

libxml4r

0.01
No commit activity in last 3 years
No release in over 3 years
Libxml4r is a light set of methods and bolt-ons which aren't maintained by the core libxml ruby library. These methods aim to provide a more easy to use xml API. All libxml4r methods are mixed into the original LibXML::classes. (This gem was previously called libxml4r).
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
 Dependencies

Development

Runtime

>= 1.1.3
 Project Readme

libxml4r¶ ↑

libxml4r is a light set of methods and bolt-ons which aren’t maintained by the core libxml ruby library. These methods aim to provide a more convenient API interface which is provided and documented separately, but actually mixed in to extend the original LibXML::classes. Using these methods should significantly reduce the lines of code needed to perform the most common operations of accessing and manipulating an xml document structure.

For a full list of methods, please refer to the RDoc Documentation and also the Libxml-Ruby RDocs

For benkchmarks / performance comparison see cfis.savagexi.com/2008/07/16/resurrecting-libxml-ruby

Installation¶ ↑

gem sources -a gemcutter.org gem install libxml4r

Getting started¶ ↑

You can call to_xmldoc on any xml string to convert it into an XML::Document

>> s = ‘<foo id=“1”><author>p. bogle</author><bar>content</bar><bar>cont2</bar></foo>’ >> doc = s.to_xmldoc

The +node+ method returns the first Node matching the given xpath

>> doc.node => <author>p. bogle</author>

The +nodes+ method returns an array of Nodes matching the given xpath

>> doc.nodes => [<bar>content</bar>, <bar>content2</bar>]

nodes[] can be called with a block to iterate through each of the matching nodes

>> doc.nodes do |bar| puts bar.xpath; end /foo/bar /foo/bar

You can call node[] on another node to iterate and search within a smaller context of the document

>> foo = doc.node >> foo.node => <author>p. bogle</author> >> foo.nodes.each {|node| puts node.inner_xml} => “content2” => “cont2”

For more information about XPath syntax, please see www.w3schools.com/XPath/xpath_syntax.asp

Removing the whitespace nodes amongst our data nodes¶ ↑

::LibXML::XML.default_keep_blanks = false

Put this line at the beginning, before parsing the xml document. Othewise libxml interpolates all the ‘real’ data nodes with string nodes of the whitespace found between them. So ‘node.next` will point to the next data node, and not something like `“n ”`. This is the “right” thing to do, whenever expecting to walk along or iterate over the parsed doc tree.

Namespace Stripping¶ ↑

The handling of default namespaces in libxml-ruby is extremely awkward and cumbersome as it requires passing along an array of namespace strings with every find() method call. It also represents ambiguity concerning the href of the default namespace.

Suppose you had a namespaced XML source with xmlns:= directives like this >> document.to_xml <?xml version=“1.0” encoding=“UTF-8”?> <feed xmlns=“www.w3.org/2005/Atom” xmlns:openSearch=“a9.com/-/spec/opensearchrss/1.0/” xmlns:gContact=“schemas.google.com/contact/2008" xmlns:gd=”schemas.google.com/g/2005“> <title type="text">Phil Bogle’s Contacts</title> … </feed>

With libxml4r its possible to do:

>> document.strip!

And be left with plain xml which can be parsed more easily and without the need for specifying any confusing namespace directives

>> document.to_xml <?xml version=“1.0” encoding=“UTF-8”?> <feed> <title type="text">Phil Bogle’s Contacts</title> … </feed> >> document.node.inner_xml => “Phil Bogle’s Contacts”

After manipulating your data model, a default namespace can later be re-applied on the top-level node. However its generally not recommended to use more than one namespace within the same xml document.

A Complete example¶ ↑

Here we combine our usage of both the ‘libxml-ruby`, and `libxml4r` functions, which results in much smaller and more compact code. Its just a case of using the right tool for the right job.

::LibXML::XML.default_keep_blanks = false @string = File.read(@launchd_plist) @doc = @string.to_xmldoc.strip! @doc = doc.node @nodes_plist_keys = @doc.nodes @plist_keys_values = @nodes_xml_keys.collect {|n| [n.inner_xml, n.next] }

Final words¶ ↑

Its well worth it spending some time to understand the libxml-ruby api and xpath. Its well worth it to go back and revisit all of the links dotted around this page. Those documentation are absoloutely invaluable. 100% the best.

Notes on Patches/Pull Requests¶ ↑

  • Fork the project, and create a topic branch as per these instructions.

  • Make your feature addition or bug fix.

  • Include documentation for it.

  • Include a regression test for it. So I don’t break it in a future version unintentionally.

Copyright © 2009 Dreamcat4. See LICENSE for details.

Contribution, Credits¶ ↑

This project was started on the top of Phil Bogle’s libxml_helper.rb. Most methods have been renamed to represent the api styling of the libxml project. Some coding examples here are also adapted from Phil’s original explanatory texts.

Copyright © 2008 Phil Bogle. Copyright © 2009-2010 Dreamcat4.