html2latex

Summary

html2latex is a Perl script designed to convert a properly formatted HTML file into a properly formatted LaTeX file.

News

Version 1.0 is out. It is essentially an installation fix for 0.9, in which 'make test' failed, which could be a major headache for some people. It also adds the 'kill' tag type, which lets you do things such as remove any JavaScript. Version 0.9 was a minor release that added support for international characters and quote expansion, plus a few bug fixes. You can download the latest tar.gz here. If you already have 0.9 installed and aren't bothered by JavaScript, you don't need to upgrade to 1.0; it is otherwise the same.

Supports

  1. It can handle URLs on the command line and in the IMG tag.
  2. Converts pictures from JPEG or GIF to PNG, which pdflatex can include.
  3. Renders nested tables correctly.
  4. Supports most international characters (umlauts, accents, etc.).
  5. Converts all headers into sections. This can be easily customized.
  6. Handles lists of any form.
  7. Extensive configuration through command-line options or an XML config file.
  8. It is also very easy to extend by writing your own handlers.

Feedback

If you try out the software, please go to the feedback site and take the survey. Or you can put comments in the forum, or email me. I'd like your suggestions.

Site Directions

  1. Home - Here
  2. Package - Link to the unpacked files of the latest release. Look here for ChangeLog, TODO, README, etc.
  3. Documentation - Right now, a man page.
  4. Download - Site listing all releases.
  5. Screenshots - Take a look at what html2latex can do.
  6. Feedback - Please, fill out a survey and tell me what you think.

Requirements

All of the modules listed below, and all of their dependencies, can be found here.

html2latex requires the following modules for basic operation:

  1. HTML::Tree - It requires HTML::Parser.
  2. XML::Simple - It requires XML::Parser.

html2latex can use the following modules for advanced operation:

  1. LWP::Simple - Used to download URLs. Requires lots of things; look for Bundle::LWP or libwww.
  2. URI - Comes with libwww or Bundle::LWP. Also required to grab URLs.
  3. Image::Magick - If you want to convert images to PNGs.

The easiest way to get these modules is to use the CPAN module. Try man CPAN.
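
As a rough sketch of the CPAN route, the script below checks which of the modules listed above Perl can already load, and prints a CPAN command for any that are missing. The function name check_modules is made up for this illustration and is not part of html2latex; nothing is installed automatically.

```shell
#!/bin/sh
# Report which Perl modules can be loaded and suggest an install
# command for each one that cannot. check_modules is a hypothetical
# helper written for this example only.
check_modules() {
    for mod in "$@"; do
        # "perl -MModule -e 1" exits non-zero if the module fails to load.
        if perl -M"$mod" -e 1 2>/dev/null; then
            echo "ok       $mod"
        else
            echo "missing  $mod  (try: perl -MCPAN -e 'install $mod')"
        fi
    done
}

# Modules required for basic operation.
check_modules HTML::Tree XML::Simple

# Optional modules for URL download and image conversion.
check_modules LWP::Simple URI Image::Magick
```

Running the suggested command for a missing module should also pull in its dependencies, such as HTML::Parser for HTML::Tree, assuming CPAN has already been configured on your machine.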