A shell script to help translate HTML into LaTeX
...and another that extracts bare text from HTML

Version francaise / Deutsche Übersetzung


Description

  Html2LaTeX is a very simple translation script that uses awk and sed. It should run on any Unix system. It knows only a fairly restricted set of HTML tags, but its abilities can be extended simply and quickly. Its only aim is to spare people who have such a translation to do some repetitive and boring work, by producing a first translation that is then corrected by hand. It has the advantage of being very small and very easy to modify, since it is less than 5,000 bytes long and does not need to be compiled. It has been placed on the web, freely available, in case it can help someone...
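
  As a rough illustration of how a script of this kind can be extended, the sketch below shows the sort of sed substitution rules such a translator might use. It is not the actual html2latex source: the function name html_to_latex_sketch and the particular tag mappings are assumptions chosen for the example. Supporting one more tag amounts to adding one more -e rule.

```shell
#!/bin/sh
# Illustrative sketch only, not the real html2latex source.
# Each -e rule maps one HTML construct to a LaTeX equivalent;
# supporting a new tag means adding one more rule.
html_to_latex_sketch() {
    sed -e 's|<[bB]>|\\textbf{|g'   -e 's|</[bB]>|}|g' \
        -e 's|<[iI]>|\\textit{|g'   -e 's|</[iI]>|}|g' \
        -e 's|<[hH]1>|\\section{|g' -e 's|</[hH]1>|}|g' \
        -e 's|<[pP]>||g'            -e 's|</[pP]>||g' \
        "$@"
}

# Small demonstration on two lines of HTML.
printf '<h1>Title</h1>\n<p>Some <b>bold</b> and <i>italic</i> text.</p>\n' |
    html_to_latex_sketch
```

  Rules this simple only handle tags without attributes that open and close on the same line; anything else is left for the correction by hand.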

   Warning! Don't confuse it with the very well-known LaTeX2HTML, which is more elaborate and performs the inverse operation! You will be able to find that one anywhere on the web.

Download html2latex (3765 bytes)

There is also...

   a shell script, very useful in my humble opinion, from which html2latex was derived: html2txt (892 bytes). It extracts bare text from HTML (in particular, it removes all tags), for instance so that HTML documentation pages can be read with any text editor.
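
   To give an idea of what removing all tags can look like in sed, here is a minimal sketch. It is not the real html2txt source: the function name strip_tags_sketch and the short list of decoded entities are assumptions made for the example.

```shell
#!/bin/sh
# Illustrative sketch only, not the real html2txt source.
# Deletes anything that looks like a tag, then decodes a few common entities.
# Limitation: a tag that spans several lines is not removed.
strip_tags_sketch() {
    sed -e 's/<[^>]*>//g' \
        -e 's/&lt;/</g' -e 's/&gt;/>/g' \
        -e 's/&nbsp;/ /g' \
        -e 's/&amp;/\&/g' \
        "$@"
}

# Small demonstration on one line of HTML.
printf '<p>Read <a href="doc.html">the docs</a> &amp; enjoy.</p>\n' |
    strip_tags_sketch
```

   The &amp; entity is decoded last so that text such as &amp;lt; becomes &lt; rather than being decoded twice.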

Usage

  html2latex simply takes the file given on the command line, or reads standard input (stdin) if none is given, and sends the result to standard output (stdout).
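
  The two ways of calling such a filter can be sketched as follows. So that the example runs on its own, it uses a stand-in function, html2latex_demo, which is hypothetical and only converts <em> tags; the temporary file names are likewise placeholders.

```shell
#!/bin/sh
# Stand-in for the real script so that this example is self-contained.
# The name html2latex_demo is hypothetical; it only converts <em>...</em>.
html2latex_demo() {
    # Passing "$@" to sed gives the file-or-stdin behaviour:
    # sed reads the named files, or standard input when none are given.
    sed -e 's|<em>|\\emph{|g' -e 's|</em>|}|g' "$@"
}

tmp=$(mktemp) || exit 1
printf 'An <em>important</em> word.\n' > "$tmp"

# Form 1: input file named on the command line, result redirected to a file.
html2latex_demo "$tmp" > "$tmp.tex"

# Form 2: input read from standard input, result printed to standard output.
html2latex_demo < "$tmp"

rm -f "$tmp" "$tmp.tex"
```

  With the real script, the same two forms would be html2latex page.html > page.tex and html2latex < page.html > page.tex, where page.html and page.tex are placeholder names.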

Sample

   Here is the result of translating this HTML page itself into LaTeX. The LaTeX source files are available in this directory.

English page #1 English page #2 English page #3 English page #4

  N.B. This software is distributed in the hope that it will be useful, but without any warranty.

Feedback

   If you improve the software, please don't hesitate to send me your modifications; it would be kind of you!


  You can write to me by e-mail at the following address: Christophe.Deroulers @ ens.fr (especially if you find English mistakes in this page; it will be a pleasure for me to correct them).
