EPUB Parser
INSTALLATION
gem install epub-parser
USAGE
As a library
require 'epub/parser'
book = EPUB::Parser.parse('book.epub')
book.metadata.titles # => Array of EPUB::Publication::Package::Metadata::Title. Main title, subtitle, etc...
book.metadata.title # => Title string including all titles
book.metadata.creators # => Creators(authors)
book.each_page_on_spine do |page|
  page.media_type # => "application/xhtml+xml"
  page.entry_name # => "OPS/nav.xhtml" entry name in EPUB package (zip archive)
  page.read # => raw content document
  page.content_document.nokogiri # => Nokogiri::XML::Document. The same as Nokogiri.XML(page.read)
  # do something more
  # :
end
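As a minimal sketch of building on the loop above, the helper below collects the entry name and media type of every page on the spine. The name `spine_summary` is hypothetical and not part of EPUB Parser; it assumes only the `each_page_on_spine`, `entry_name` and `media_type` methods shown in the example.

```ruby
# Hypothetical helper (not part of EPUB Parser): summarizes the spine of a book.
# `book` is assumed to respond to #each_page_on_spine, like the object
# returned by EPUB::Parser.parse in the example above.
def spine_summary(book)
  summary = []
  book.each_page_on_spine do |page|
    summary << { entry_name: page.entry_name, media_type: page.media_type }
  end
  summary
end

# Possible usage with a real book (requires the epub-parser gem):
#   require 'epub/parser'
#   book = EPUB::Parser.parse('book.epub')
#   spine_summary(book).each do |row|
#     puts "#{row[:entry_name]} (#{row[:media_type]})"
#   end
```

Returning plain hashes keeps the result independent of the open EPUB file, which may be convenient for reporting or serialization.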
See the documentation in {file:docs/Home.markdown} or the API Documentation for more info.
epubinfo command-line tool
The epubinfo tool extracts and shows the metadata of a specified EPUB book.
$ epubinfo ~/Documents/Books/build_awesome_command_line_applications_in_ruby.epub
Title: Build Awesome Command-Line Applications in Ruby (for KITAITI MAKOTO)
Identifiers: 978-1-934356-91-3
Titles: Build Awesome Command-Line Applications in Ruby (for KITAITI MAKOTO)
Languages: en
Contributors:
Coverages:
Creators: David Bryant Copeland
Dates:
Descriptions:
Formats:
Publishers: The Pragmatic Bookshelf, LLC (338304)
Relations:
Rights: Copyright © 2012 Pragmatic Programmers, LLC
Sources:
Subjects: Pragmatic Bookshelf
Types:
Unique identifier: 978-1-934356-91-3
Epub version: 2.0
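Because the output above is a series of `Key: value` lines, a script can turn it into a Hash. The following is only a sketch under that assumption: `parse_epubinfo` is a hypothetical helper, not something shipped with EPUB Parser, and the sample text is abbreviated from the output above.

```ruby
# Hypothetical helper (not part of EPUB Parser): parses "Key: value" lines,
# such as the epubinfo output shown above, into a Hash of strings.
def parse_epubinfo(output)
  output.each_line.each_with_object({}) do |line, info|
    # Split on the first colon only, so values may contain colons themselves.
    key, value = line.chomp.split(':', 2)
    next if key.nil? || value.nil?
    info[key.strip] = value.strip
  end
end

# Abbreviated sample of the output above. In practice the text could come
# from running the command, e.g. `epubinfo path/to/book.epub`.
sample = [
  'Languages: en',
  'Contributors:',
  'Creators: David Bryant Copeland',
  'Epub version: 2.0'
].join("\n")

info = parse_epubinfo(sample)
puts info['Creators']     # prints "David Bryant Copeland"
puts info['Epub version'] # prints "2.0"
```

Fields without a value, such as `Contributors:`, end up as empty strings rather than being dropped.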
See {file:docs/Epubinfo} for more info.
epub-open command-line tool
The epub-open tool provides an interactive shell (IRB) that helps you investigate an EPUB book.
epub-open path/to/book.epub
IRB starts. self becomes the EPUB book, so you can call the methods of EPUB directly.
title
=> "Title of the book"
metadata.creators
=> [Author 1, Author2, ...]
resources.first.properties
=> #<Set: {"nav"}> # This shows that the first resource of this book is the nav document
nav = resources.first
=> ...
nav.href
=> #<Addressable::URI:0x15ce350 URI:nav.xhtml>
nav.media_type
=> "application/xhtml+xml"
puts nav.read
<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
:
:
:
</html>
=> nil
exit # Enter "exit" to end the session
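The session above finds the navigation document by checking the `properties` of the first resource. When the nav document is not the first resource, a search over all resources may help. This is a sketch only: `find_nav` is a hypothetical helper, and it assumes each resource responds to `properties` with a collection of strings, as in the session above.

```ruby
# Hypothetical helper (not part of EPUB Parser): returns the first resource
# whose properties include "nav", or nil when no such resource exists.
def find_nav(resources)
  resources.find { |resource| resource.properties.include?('nav') }
end

# Inside an epub-open session this could be used as:
#   nav = find_nav(resources)
#   nav.href
```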
See {file:docs/EpubOpen} for more info.
REQUIREMENTS
- Ruby 1.9.3 or later
- C compiler to compile Zip/Ruby and Nokogiri
Related Gems
- gepub - a generic EPUB library for Ruby
- epubinfo - Extracts metadata information from EPUB files. Supports EPUB2 and EPUB3 formats.
- ReVIEW - ReVIEW is an easy-to-use digital publishing system for books and ebooks.
- epzip - epzip is an EPUB packing tool. It just does 'zip.' :)
- eeepub - EeePub is a Ruby ePub generator
- epub-maker - This library supports making and editing EPUB books based on this EPUB Parser library
If you find other gems, please tell me or send a pull request.
RECENT CHANGES
0.1.6
- Remove EPUB.parse method
- Remove EPUB::Publication::Package::Metadata#to_hash
- Add EPUB::Publication::Package::Metadata::Identifier
- Remove MethodDecorators::Deprecated
- Make EPUB::Parser::OCF::CONTAINER_FILE and other constants deprecated
- Make EPUB::Publication::Package::Metadata::Link#rel a Set
- Add exception class EPUB::Constants::MediaType::UnsupportedMediaType
- Make EPUB::Constants::MediaType::UnsupportedError deprecated
- Add EPUB::Publication::Package::Item#find_item_by_relative_iri
- Add EPUB::Publication::Package::Item#cover_image?
- Add EPUB::Book::Features module and move methods of EPUB module to it. (Thanks, takahashim!)
- Make including EPUB deprecated
- Parse hidden attribute of nav elements
- [Experimental] Add EPUB::ContentDocument::Navigation::Item#traverse
0.1.5
- Add ContentDocument::XHTML#title
- Add Manifest::Item#xhtml?
- Add --words and --char options to epubinfo command
- API change: OCF::Container::Rootfile#full_path became Addressable::URI object rather than String
- Add ContentDocument::XHTML#rexml and #nokogiri
- Inspect more readably
0.1.4
- Fixed-Layout Documents support
- Define ContentDocument::XHTML#top_level?
- Define Spine::Itemref#page_spread and #page_spread=
- Define some utility methods around Manifest::Item and Spine::Itemref
See {file:CHANGELOG.markdown} for older changelogs and details.
TODOS
- EPUB 3.0.1
- Multiple rootfiles
- Help features for epub-open tool
- Vocabulary Association Mechanisms
- Implementing navigation document and so on
- Media Overlays
- Content Document
- Digital Signature
- Using SAX on parsing
- Extracting and organizing common behavior from some classes to modules
- Abstraction of XML parser (making it possible to use REXML, the standard XML library bundled with Ruby)
- Handle encodings other than UTF-8
DONE
- Simple inspect for epub-open tool
- Using zip library instead of unzip command, which has security issue
- Modify methods around fallback to see bindings element in the package
- Content Document (only for Navigation Documents)
- Fixed Layout
- Vocabulary Association Mechanisms (only for itemref)
LICENSE
This library is distributed under the terms of the MIT License. See the MIT-LICENSE file for more info.