isodoc: Processor to generate HTML/Word from Metanorma XML
Purpose
This Gem converts documents in the Metanorma document model into HTML and Microsoft Word.
Usage
The Gem contains the subclasses IsoDoc::HtmlConvert (for HTML output) and IsoDoc::WordConvert (for Word output). They are initialised with the following rendering parameters:
- i18nyaml: YAML file giving internationalisation equivalents for keywords in rendering output; see https://github.com/metanorma/metanorma-iso#document-attributes for further documentation
- bodyfont: Font for body text
- headerfont: Font for header text
- monospacefont: Font for monospace text
- titlefont: Font for document title text (currently used only in GB)
- script: The ISO 15924 code for the main script that the standard document is in; used to pick the default fonts for the document
- alt: Generate alternate rendering (currently used only in ISO)
- compliance: Generate alternate rendering (currently used only in GB)
- htmlstylesheet: Stylesheet for HTML output
- htmlcoverpage: Cover page for HTML output
- htmlintropage: Introductory page for HTML output
- scripts: Scripts page for HTML output
- scripts-pdf: Scripts page for HTML > PDF output
- wordstylesheet: Stylesheet for Word output
- standardstylesheet: Secondary stylesheet for Word output
- header: Header file for Word output
- wordcoverpage: Cover page for Word output
- wordintropage: Introductory page for Word output
- ulstyle: Style identifier in Word stylesheet for unordered lists
- olstyle: Style identifier in Word stylesheet for ordered lists
- suppressheadingnumbers: Suppress heading numbers for clauses (does not apply to annexes)
The IsoDoc gem classes themselves are abstract (though their current implementation contains rendering specific to the ISO standard). Subclasses of the IsoDoc gem classes are specific to different standards, and are associated with templates and stylesheets specific to the rendering of those standards. Subclasses also provide the default values for the rendering parameters above, so the parameters should be passed only to override those defaults.
For example:
IsoDoc::Iso::HtmlConvert.new(
  bodyfont: "Zapf Chancery",
  headerfont: "Comic Sans",
  monospacefont: "Andale Mono",
  alt: true,
  script: "Hans",
  i18nyaml: "i18n-en.yaml"
)
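The override behaviour described above can be pictured as a hash merge. The following is a minimal sketch in plain Ruby, not the gem's actual implementation: the constant FLAVOUR_DEFAULTS, the method effective_options, and the default font names are all hypothetical and chosen only for illustration.

```ruby
# Hypothetical defaults; a real standard-specific subclass defines its own.
FLAVOUR_DEFAULTS = {
  bodyfont: "Cambria",
  headerfont: "Cambria",
  monospacefont: "Courier New",
  script: "Latn",
}.freeze

# Hypothetical helper: caller-supplied parameters win, and every
# parameter the caller omits falls back to the subclass default.
def effective_options(overrides = {})
  FLAVOUR_DEFAULTS.merge(overrides)
end

opts = effective_options(bodyfont: "Zapf Chancery", script: "Hans")
puts opts[:bodyfont]      # value supplied by the caller
puts opts[:monospacefont] # value inherited from the defaults
```

Passing an empty hash, as in the conversion examples in this document, would leave every default in place under this model.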
Conversion is performed by the convert method, which takes three arguments: the filename to be used for the output (once its file type suffix is stripped), the XML document string to be converted (optional), and a "debug" flag (optional), which stops execution before the output file is generated. If the document string is nil, the document is read from the file named by the first argument. So:
# generates test.html
IsoDoc::Iso::HtmlConvert.new({}).convert("test.xml")
# generates test.doc, with Chinese font defaults rather than Roman
IsoDoc::Iso::WordConvert.new({script: "Hans"}).convert("test.xml")
# generates test.html, based on file1.xml
IsoDoc::Iso::HtmlConvert.new({}).convert("test", File.read("file1.xml"))
# generates HTML output for the given input string, but does not save it to disk.
IsoDoc::Iso::HtmlConvert.new({}).convert("test", <<~"INPUT", true)
<iso-standard xmlns="http://riboseinc.com/isoxml">
<preface><foreword>
<note>
<p id="_f06fd0d1-a203-4f3d-a515-0bdba0f8d83f">These results are based on a
study carried out on three different types of kernel.</p>
</note>
</foreword></preface>
</iso-standard>
INPUT
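The filename handling described above can be sketched in plain Ruby. This is an illustration under stated assumptions, not the gem's code: output_path and document_source are hypothetical helper names, and the suffix-stripping pattern is one plausible reading of "once its file type suffix is stripped".

```ruby
# Hypothetical helper: drop the file type suffix (if any) from the
# filename and append the extension of the target format.
def output_path(filename, extension)
  filename.sub(/\.[^.\/\\]+\z/, "") + "." + extension
end

# Hypothetical helper: use the supplied document string, or fall back
# to reading the file named by the first argument when it is nil.
def document_source(filename, xml = nil)
  xml.nil? ? File.read(filename, encoding: "UTF-8") : xml
end

puts output_path("test.xml", "html")
puts output_path("test", "doc")
```

Under this sketch, "test.xml" and "test" map to the same output name, which matches the examples above where both forms produce test.html.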
Note: In the HTML stylesheets specific to standards, the Cover page and Intro page must be XHTML fragments, not HTML fragments. In particular, unlike Word HTML, all HTML attributes need to be quoted: <p class="MsoToc2">, not <p class=MsoToc2>.
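A quick way to catch unquoted attributes before a cover page or intro page is used is a simple scan of its tags. The following is a heuristic sketch in plain Ruby, not a full XHTML well-formedness check and not part of the gem: the method name unquoted_attributes is hypothetical, and the pattern assumes attribute values do not themselves contain angle brackets.

```ruby
# Matches opening (and self-closing) tags; closing tags are skipped.
OPENING_TAG = /<[A-Za-z][^<>]*>/

# Hypothetical helper: return every tag in the fragment that still has
# an attribute value left over once all quoted values are removed.
def unquoted_attributes(fragment)
  fragment.scan(OPENING_TAG).select do |tag|
    without_quoted = tag.gsub(/"[^"]*"|'[^']*'/, "")
    without_quoted.match?(/=\s*[^\s>\/]/)
  end
end

good = '<p class="MsoToc2">Contents</p>'
bad  = '<p class=MsoToc2>Contents</p>'

puts unquoted_attributes(good).inspect
puts unquoted_attributes(bad).inspect
```

Parsing the fragment with an XML parser would be the more thorough check; the scan above only targets the specific quoting problem mentioned in the note.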
Converting Word output into “Native Word” (.docx)
This gem relies on html2doc to generate Microsoft Word documents.
Please see this post-processing procedure to convert output into a native-docx document.