RPath
"Don't use this." —flavorjones [1]
Overview
RPath lets you query graphs, such as XML documents, with just Ruby.
RPath can operate on Nokogiri documents, REXML documents, Oga documents, and the filesystem. Building adapters for other graphs is simple.
Leading members of the Ruby community have warned against RPath's approach. They're probably right! RPath is as much an experiment as a useful tool.
Documentation
This README provides an overview of RPath. Full documentation is available at rubydoc.info.
Installation
gem install rpath
Example
Suppose we want the value of the name attribute in the following XML document:
xml = Nokogiri::XML <<end
<places>
<place name="Green-Wood"/>
</places>
end
First we tell RPath we'll be using Nokogiri:
RPath.use :nokogiri
Then we create an RPath expression ...
exp = RPath { places.place[:name] }
... and evaluate it on the document:
exp.eval(xml) # => "Green-Wood"
If we only plan to use the expression once, we can pass the graph to RPath. RPath evaluates the expression and returns the result:
RPath(xml) { places.place[:name] } # => "Green-Wood"
Some adapters, such as the built-in Nokogiri adapter, may add convenience methods that make the syntax even prettier:
xml.rpath { places.place[:name] } # => "Green-Wood"
The Graph Model
In an RPath graph,
- There is an initial vertex (a "root"),
- Each vertex has a name,
- Each vertex has zero or more adjacent vertices,
- Each vertex has zero or more named attributes, and
- Each vertex may have associated data called "content."
Adapters implement this abstraction for a particular type of graph. RPath can operate on any graph for which there is an adapter.
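The model above can be sketched in plain Ruby. This is only an illustration, not RPath's internal representation: the Vertex class and its field names are hypothetical, and the root vertex stands in for the document from the earlier example.

```ruby
# Hypothetical stand-in for the graph model; not part of RPath.
class Vertex
  attr_reader :name, :attributes, :content, :adjacent

  def initialize(name, attributes: {}, content: nil, adjacent: [])
    @name = name
    @attributes = attributes
    @content = content
    @adjacent = adjacent
  end
end

# The <places> document from the example, expressed in this model.
place  = Vertex.new('place', attributes: { 'name' => 'Green-Wood' })
places = Vertex.new('places', adjacent: [place])
root   = Vertex.new('root', adjacent: [places])

# Walking root -> places -> place -> name by hand:
first_places = root.adjacent.find { |v| v.name == 'places' }
first_place  = first_places.adjacent.find { |v| v.name == 'place' }
puts first_place.attributes['name'] # prints "Green-Wood"
```

An adapter's job is to answer these same questions (name, adjacent vertices, attributes, content) for a real graph type.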
Expressions
An RPath expression, given a graph, selects a value: a vertex, a vertex array, the value of an attribute, or a vertex's content. RPath expressions are constructed by chaining methods inside the block passed to RPath.
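For readers curious how a block of chained method calls can become an expression at all, here is a toy sketch of the general Ruby technique (instance_eval plus method_missing). It is not RPath's implementation; ChainRecorder and record are hypothetical names, and the recorder only logs the calls rather than evaluating them on a graph.

```ruby
# Toy recorder, NOT RPath's implementation: it only logs the chain of calls.
class ChainRecorder < BasicObject
  def initialize(steps = [])
    @steps = steps
  end

  # Every unknown call (including #[]) becomes a recorded step. A new
  # recorder is returned so that further calls can be chained onto it.
  def method_missing(name, *args)
    ::ChainRecorder.new(@steps + [[name, args]])
  end

  def __steps__
    @steps
  end
end

# Evaluates the block with a fresh recorder as self and returns the steps.
def record(&block)
  ChainRecorder.new.instance_eval(&block).__steps__
end

p record { places.place[:name] }
# prints [[:places, []], [:place, []], [:[], [:name]]]
```

A real expression object would keep a list like this and walk the graph with it when evaluated.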
Selecting Vertices
All vertices named "foo" adjacent to the root:
RPath { foo }
The first "foo" adjacent to the root:
RPath { foo[0] }
All vertices named "bar" adjacent to the first "foo":
RPath { foo[0].bar }
Or, more succinctly (the first "foo" is assumed if the indexer is omitted):
RPath { foo.bar }
All vertices adjacent to the first "foo":
RPath { foo.adjacent }
All vertices named "adjacent" that are adjacent to the first "foo" (#named lets us avoid collisions with built-in methods):
RPath { foo.adjacent.named("adjacent") }
All "foos" with attribute "baz" equal to "qux":
RPath { foo.where(baz: 'qux') }
Or simply:
RPath { foo[baz: 'qux'] }
And finally, all "foos" meeting arbitrary criteria:
RPath { foo.where { |vertex| some_predicate?(vertex) } }
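To make the selection rules concrete, the sketch below mimics a few of them with ordinary array operations. It assumes vertices are plain Hashes; the helpers named and where are hypothetical and are not RPath's code.

```ruby
# Hypothetical vertices as plain Hashes; these helpers mimic the selections
# above and are not RPath's implementation.
def named(vertices, name)
  vertices.select { |v| v[:name] == name }
end

def where(vertices, conditions)
  vertices.select do |v|
    conditions.all? { |key, value| v[:attrs][key.to_s] == value }
  end
end

bar  = { name: 'bar', attrs: {}, adjacent: [] }
foo1 = { name: 'foo', attrs: { 'baz' => 'qux' }, adjacent: [bar] }
foo2 = { name: 'foo', attrs: { 'baz' => 'other' }, adjacent: [] }
root = { name: 'root', attrs: {}, adjacent: [foo1, foo2] }

foos      = named(root[:adjacent], 'foo')      # like RPath { foo }
first_foo = foos[0]                            # like RPath { foo[0] }
bars      = named(first_foo[:adjacent], 'bar') # like RPath { foo.bar }
quxes     = where(foos, baz: 'qux')            # like RPath { foo[baz: 'qux'] }

puts foos.length  # prints 2
puts bars.length  # prints 1
puts quxes.length # prints 1
```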
Selecting Attributes
Attribute values are selected by passing a string to #[]:
# The "baz" attribute of the first vertex named "foo" adjacent to the root
RPath { foo['baz'] }
Selecting Content
A vertex's content is selected with #content:
# The content of the first vertex named "foo" adjacent to the root
RPath { foo.content }
Adapters
Nokogiri
The Nokogiri adapter exposes XML elements as vertices and child elements as adjacent vertices:
RPath.use :nokogiri
xml = Nokogiri::XML <<end
<foo>
<bar baz="qux">Hello, RPath</bar>
</foo>
end
RPath(xml) { foo.bar[0] } # => #<Nokogiri::XML::Element ... >
XML attributes become RPath attributes:
RPath(xml) { foo.bar['baz'] } # => "qux"
And text content is accessible with #content:
RPath(xml) { foo.bar.content } # => "Hello, RPath"
An expression may be evaluated not just on an XML document but on any Nokogiri::XML::Node. Non-element nodes such as processing instructions, alas, are not accessible.
Finally, the convenience method #rpath, added to Nokogiri::XML::Node, allows for more compact syntax:
xml.rpath { foo.bar.content } # => "Hello, RPath"
REXML
The REXML adapter is similar to the Nokogiri one. Expressions may be evaluated on any REXML::Element.
RPath.use :rexml
xml = REXML::Document.new('<foo bar="baz"/>')
xml.rpath { foo['bar'] } # => "baz"
Oga
RPath expressions may be evaluated on an Oga::XML::Document or an Oga::XML::Element.
RPath.use :oga
xml = Oga.parse_xml('<foo bar="baz"/>')
xml.rpath { foo['bar'] } # => "baz"
Filesystem
The filesystem adapter exposes files and directories as vertices. Directory entries are adjacent to their directory. Expressions may be evaluated on any directory:
RPath.use :filesystem
# Note that we must specify the adapter because RPath can't infer it from '~'
RPath('~', :filesystem) { where { |f| f =~ /bash/ } } # => ["~/.bash_history", "~/.bash_profile"]
Many file properties become RPath attributes:
RPath('/', :filesystem) { etc.hostname[:mtime] } # => 2014-12-17 14:43:24 -0500
And file contents are accessible with #content:
RPath('/', :filesystem) { etc.hostname.content } # => "jbook"
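The mapping from files to the graph model can be approximated with Ruby's standard library. The helpers below (fs_name, fs_adjacent, fs_attribute, fs_content) are hypothetical and are not the adapter's actual code; in particular, reading attributes from File::Stat and limiting them to a small whitelist are assumptions made for this sketch.

```ruby
require 'tmpdir'

# Hypothetical helpers illustrating the mapping; not the adapter's actual code.
FS_ATTRIBUTES = %i[mtime size mode].freeze # assumed subset of file properties

def fs_name(path)
  File.basename(path)
end

# Directory entries are adjacent to their directory; files have no neighbors.
def fs_adjacent(path)
  return [] unless File.directory?(path)

  Dir.children(path).sort.map { |entry| File.join(path, entry) }
end

def fs_attribute(path, name)
  FS_ATTRIBUTES.include?(name) ? File.stat(path).public_send(name) : nil
end

def fs_content(path)
  File.file?(path) ? File.read(path) : nil
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'hostname'), 'jbook')
  file = fs_adjacent(dir).find { |path| fs_name(path) == 'hostname' }
  puts fs_content(file)          # prints "jbook"
  puts fs_attribute(file, :size) # prints 5
end
```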
Custom Adapters
Custom adapters allow RPath expressions to operate on new types of graphs. To create a custom adapter, subclass RPath::Adapter and implement the abstract methods #adjacent, #attribute, #content, and #name. See the implementations in RPath::Adapters for examples.
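As a rough sketch of what those four methods might look like, consider a hypothetical graph built from nested Hashes. The class is written as a plain Ruby class so it runs without the gem; a real adapter would subclass RPath::Adapter. The method signatures (each taking a vertex, and #attribute also taking a name) are assumptions, so check them against the RPath::Adapter documentation.

```ruby
# Hypothetical graph: each vertex is a Hash with :name, :attrs, :content and
# :children keys. The method signatures below are assumptions; check them
# against RPath::Adapter before relying on them.
class HashGraphAdapter # a real adapter would subclass RPath::Adapter
  def name(vertex)
    vertex[:name]
  end

  def adjacent(vertex)
    vertex.fetch(:children, [])
  end

  def attribute(vertex, name)
    vertex.fetch(:attrs, {})[name.to_s]
  end

  def content(vertex)
    vertex[:content]
  end
end

graph = {
  name: 'foo',
  children: [
    { name: 'bar', attrs: { 'baz' => 'qux' }, content: 'Hello, RPath' }
  ]
}

adapter = HashGraphAdapter.new
bar = adapter.adjacent(graph).first
puts adapter.name(bar)            # prints "bar"
puts adapter.attribute(bar, :baz) # prints "qux"
puts adapter.content(bar)         # prints "Hello, RPath"
```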
Once you've implemented a custom adapter, pass an instance to #RPath:
RPath(graph, CustomAdapter.new) { foo.bar }
To avoid creating an instance for every evaluation, register the adapter and pass the underscored, symbolized class name to RPath:
RPath.use CustomAdapter.new
RPath(graph, :custom_adapter) { foo.bar }
If that's too long, pass a custom ID to RPath.use:
RPath.use CustomAdapter.new, :custom
RPath(graph, :custom) { foo.bar }
Or, to avoid specifying the adapter altogether, as the built-in XML adapters do, implement #adapts? in your adapter class:
class CustomAdapter < RPath::Adapter
  def adapts?(graph)
    graph.is_a? CustomGraph
  end
  # ...
end
Now RPath will select a registered CustomAdapter when an expression is evaluated on a CustomGraph:
RPath.use CustomAdapter.new
RPath(CustomGraph.new) { foo.bar }
Contributing
Please submit issues and pull requests to jonahb/rpath on GitHub.