TeXLogParser
This small Ruby gem eases many pains around digesting logs from (La)TeX engines. Used as a command-line program or library, it converts (La)TeX logs into human- or machine-readable forms.
Disclaimer: Due to the nature of (La)TeX logs, parsing is inherently heuristic.
Installation
On any system with working Ruby (≥ 2.3), installation is as simple as this:
[sudo] gem install tex_log_parser
The usual options and, later, update mechanisms of RubyGems apply; please refer to its documentation for details.
Usage
There are two ways to parse logs: with the command-line program and via the underlying Ruby API.
Command-line Interface
By default, texlogparser reads from stdin and writes to stdout. That is, you can use it like so:
pdflatex -interaction=nonstopmode example.tex | texlogparser
This adds so little runtime overhead that there are few reasons not to use it.
Note that the original log file will still be written to example.log, so no information is lost.
Important: Without nonstopmode, pdflatex et al. stop on errors to interact with the user; texlogparser is not prepared to play the middle man for that and will block.
You can also read from and/or write to files:
texlogparser -i example.log # From file, to stdout
texlogparser -i example.log -o example.simple.log # From and to file
cat example.log | texlogparser -o example.simple.log # From stdin, to file
If you want to use the output programmatically, you may want to add option -f json. It does just what it sounds like.
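As a rough sketch of consuming that output from Ruby, the helper below tallies messages by level. It assumes that the JSON document is an array of objects and that each object has a "level" field; this README does not spell out the schema, so check the real output of your installed version first. The name count_by_level is made up for this illustration and is not part of the gem.

```ruby
require 'json'

# Sketch only. ASSUMPTION: `texlogparser -f json` emits a JSON array of
# objects, each with a "level" field. Verify against real output.
# count_by_level is a hypothetical helper, not part of tex_log_parser.
def count_by_level(json_text)
  counts = Hash.new(0)
  JSON.parse(json_text).each { |message| counts[message['level']] += 1 }
  counts
end

# Possible usage, reading the converted log from stdin:
#   texlogparser -i example.log -f json | ruby count_levels.rb
# with count_levels.rb ending in:
#   puts count_by_level($stdin.read)
```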
Ruby API
The interface is rather narrow; your main entry point is class TexLogParser. Calling parse on it will yield a list of Message objects.
Here is a minimal yet complete example:
require 'tex_log_parser'
log = File.readlines('example.log')
parser = TexLogParser.new(log)
puts parser.parse[0]
Recommendations
Here are some tips on how to generate logs that do not trip up parsing unnecessarily:
- Use the -file-line-error option of pdflatex et al. to get higher accuracy regarding source files and lines.
- Increase the maximum line length as much as possible to improve overall efficacy. Bad linebreaks are bad.
- Avoid parentheses and whitespace in file paths.
- The shell output of the initial run of pdflatex et al. on a new file can contain output of subprograms, and be complicated in other ways as well. It is therefore more robust to use the log file as written to disk, and/or the output or log file produced by a subsequent run. (Don't worry, real errors will stick around!)
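The tips above could be bundled into a few small helpers like the following. This is only a sketch: the helper names are hypothetical and not part of tex_log_parser, and it assumes a TeX Live-style distribution that honours the max_print_line setting when passed as an environment variable.

```ruby
# Sketch only: these helpers are hypothetical, not part of tex_log_parser.

# Environment for the engine run. ASSUMPTION: the distribution reads
# max_print_line (the log line-wrapping width) from the environment,
# as TeX Live-style setups do.
def tex_env(max_line_length = 1000)
  { 'max_print_line' => max_line_length.to_s }
end

# Command line with the options recommended in this README.
def tex_command(source, engine: 'pdflatex')
  [engine, '-interaction=nonstopmode', '-file-line-error', source]
end

# True if the path contains parentheses or whitespace.
def risky_path?(path)
  !(path =~ /[()\s]/).nil?
end

# Possible usage (needs a TeX installation, hence commented out):
#   warn 'Consider renaming the file' if risky_path?('example.tex')
#   system(tex_env, *tex_command('example.tex'))
```

Passing the command as an array to system avoids shell quoting issues; the resulting example.log can then be fed to texlogparser -i.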
Contributing
For bug reports and feature requests, the usual rules apply: search for existing issues; join the discussion or create a new one; be specific and nice; expect nothing.
That aside, there are two groups of experts whose help would be much appreciated: (La)TeX gourmets and Ruby developers.
TeXians
Please report any logs that get parsed wrong, be it because whole messages are not found, or because not all details are correctly extracted.
Reports that provide the following information will be the most useful:
- Full failing log of a minimal example (ideally with source document).
- The engine(s) you use, e.g. pdflatex, xelatex, or lualatex.
- Expected number of error, warning, and info messages (the latter optional).
- Expected message with
  - log line numbers (where the message starts and ends),
  - level of the message (error, warning, or info), and
  - which source file (and lines) it references.
- Advanced: In case of wrong source files, run texlogparser -d on the log and note on which lines it changes file scopes in wrong ways.
If you also know a little Ruby, please consider translating those data into a (failing) test and opening a pull request.
Some preemptive notes:
- Issues around messages below warning level have low priority.
- Problems caused by inopportune linebreaks are probably out of scope.
Bonus: Convince as many package maintainers as possible to use the same standardized, robust way of writing to the log.
Rubyists
Any feedback about the code quality and the usefulness of the documentation would be much appreciated. Particular areas of interest include:
- Is the API designed in useful ways?
- Does the documentation cover all your questions?
- Is the Gem structured properly?
- What can be improved to encourage code contributions?
- Does the CLI script have problems on any platform?
Contributors
- egreg and David Carlisle provided helpful test cases and insight in LaTeX Stack Exchange chat.