Conversion from (La)TeX to plain text#
The aim here is to emulate the Unix nroff
, which formats text as best it can for the screen, from the same input as the Unix typesetting program troff
.
Converting DVI to plain text is the basis of many of these techniques; sometimes the simple conversion provides a good enough response. Options are :
dvi2tty
(one of the earliest),crudetype
andcatdvi
, which is capable of generating Latin-1 (ISO 8859-1) or UTF-8 encoded output.Catdvi
was conceived as a replacement fordvi2tty
, but development seems to have stopped before the authors were willing to declare the work complete.
A common problem is the hyphenation that TeX inserts when typesetting something : since the output is inevitably viewed using fonts that don’t match the original, the hyphenation usually looks silly.
Ralph Droms provides a txt bundle of things in support of ASCII generation, but it doesn’t do a good job with tables and mathematics.
Another possibility is to use the LaTeX-to-ASCII conversion program, l2a, although this is really more of a dε‑TeXing program.
The canonical dε‑TeXing program is detex
, which removes all comments and control sequences from its input before writing it to its output. Its original purpose was to prepare input for a dumb spelling checker, and it’s only usable for preparing useful ASCII versions of a document in highly restricted circumstances.
Tex2mail
is slightly more than a dε‑TeXer — it’s a Perl script that converts TeX files into plain text files, expanding various mathematical symbols (sums, products, integrals, sub/superscripts, fractions, square roots, …) into « ASCII art that spreads over multiple lines if necessary. The result is more readable to human beings than the flat-style TeX code.
Another significant possibility is to use one of the HTML-generation solutions, and then to use a browser such as lynx
to dump the resulting HTML as plain text.