Thursday, July 31, 2008

HTML to PDF Roundup

The task at hand is to find a way to take an HTML page containing images and tables and convert it to PDF for basic reports on the web. The solution must work entirely in memory, with no filesystem writes, and cannot call out to other executables (e.g. Acrobat).

First order of business was to find a library that could convert HTML to XHTML. I tried NTidy, but it kept throwing a divide-by-zero exception. Eventually I settled on SgmlReader.
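To make the HTML-to-XHTML step concrete, here is a minimal sketch of the idea in Python, using only the standard library. It is not SgmlReader and does not mirror its API. The `XhtmlWriter` class and `html_to_xhtml` function are names made up for this illustration, and the list of void elements is deliberately short. The point is only to show what "tidying" means: self-close void tags, quote attributes, escape text, and close anything left open, all without touching the filesystem.

```python
import io
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that never have a closing tag in HTML (partial list, for the sketch).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class XhtmlWriter(HTMLParser):
    """Hypothetical helper: re-emits loose HTML as well-formed XHTML in memory."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = io.StringIO()   # in-memory buffer, no filesystem writes
        self.stack = []            # currently open (non-void) elements

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "checked") get their own name as value.
        attr_text = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        if tag in VOID:
            self.out.write(f"<{tag}{attr_text} />")
        else:
            self.out.write(f"<{tag}{attr_text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Ignore stray closing tags; otherwise close everything down to `tag`.
        if tag not in self.stack:
            return
        while self.stack:
            open_tag = self.stack.pop()
            self.out.write(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.write(escape(data))


def html_to_xhtml(html: str) -> str:
    writer = XhtmlWriter()
    writer.feed(html)
    writer.close()
    # Close whatever the source left open.
    while writer.stack:
        writer.out.write(f"</{writer.stack.pop()}>")
    return writer.out.getvalue()


if __name__ == "__main__":
    print(html_to_xhtml("<p>Total<br><img src=a.png><table><tr><td>1 & 2"))
```

A real tidy library also handles doctypes, comments, implied elements, and character encodings, which this sketch skips.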

Next I needed to take the XHTML and convert it to PDF. I found seven options, and here are my results.

PDFizer:
Cost: Open Source
Pros: Free, source code available.
Cons: Couldn't get it to work. Also, after reading some of the forums, I found that even if it did work, it doesn't support tables yet.

Winnovative:
Cost: $350 Single - $750 Redistributable
Pros: Works. Has an accurate representation of the test HTML page. Seems to have a lot of flexibility.
Cons: Costs money. Only supports XHTML as an input.

ExpertPDF:
Cost: $350 Single - $750 Redistributable
Notes: I can't yet find a difference between ExpertPDF and Winnovative. They are made by the same company but marketed differently. The APIs are slightly different, but from what I can tell they are identical in functionality.

PDF Metamorphosis:
Cost: $239 Single - $1140 Site - $2490 Source Code
Pros: Worked. Source available.
Cons: Did not give an accurate representation of the HTML. Fonts were too big and formatting was off. Does not seem to support CSS.

Subsystems:
Cost: $599+
Pros: None
Cons: Couldn't get this one to work. Threw a file not found error.

Alt-Soft:
Cost: $1499 Single - $4499 Site - $7499 Redistributable
Pros: Accurate, Fastest Render, Supports other input formats including Docx, Good tech support
Cons: Only rendered one page during the test. Tech support responded within 12 hours of our email and is currently looking into it.

Homebrew:
Cost: Time, as I wrote it today.
Pros: It only costs the time that it takes to make it. We have the source code.
Cons: Not working at this moment, but it's close. It's using XSL-FO to transform the XHTML to PDF. Probably a day or two away from being completed and tested.
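For readers unfamiliar with XSL-FO, the sketch below shows roughly what the first half of such a pipeline produces: an XSL-FO document built from XHTML. It is not the homebrew code described above. It uses Python's ElementTree as a stand-in for an XSLT stylesheet, the function names (`xhtml_to_fo`, `convert`) are invented for the example, and the page size and heading font size are arbitrary assumptions. A separate FO processor would still be needed to render the result as a PDF.

```python
import xml.etree.ElementTree as ET

FO = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO)

HEADINGS = {"h1", "h2", "h3"}


def fo(tag):
    """Qualified name in the XSL-FO namespace."""
    return f"{{{FO}}}{tag}"


def local(tag):
    """Strip any XML namespace from an element tag."""
    return tag.rsplit("}", 1)[-1]


def text_of(el):
    """All text inside an element, with whitespace collapsed."""
    return " ".join("".join(el.itertext()).split())


def convert(el, parent):
    """Map one top-level XHTML element to FO (nested tables are not handled)."""
    name = local(el.tag)
    if name == "table":
        table = ET.SubElement(parent, fo("table"))
        tbody = ET.SubElement(table, fo("table-body"))
        for tr in el.iter():
            if local(tr.tag) != "tr":
                continue
            row = ET.SubElement(tbody, fo("table-row"))
            for cell in tr:
                if local(cell.tag) in ("td", "th"):
                    fo_cell = ET.SubElement(row, fo("table-cell"))
                    ET.SubElement(fo_cell, fo("block")).text = text_of(cell)
    elif name in HEADINGS:
        attrs = {"font-size": "18pt", "font-weight": "bold"}  # assumed styling
        ET.SubElement(parent, fo("block"), attrs).text = text_of(el)
    else:
        ET.SubElement(parent, fo("block")).text = text_of(el)


def xhtml_to_fo(xhtml: str) -> bytes:
    root_in = ET.fromstring(xhtml)
    # Use <body> if present, otherwise treat the root as the body.
    body = next((e for e in root_in.iter() if local(e.tag) == "body"), root_in)

    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4", "page-height": "29.7cm", "page-width": "21cm"})
    ET.SubElement(page, fo("region-body"))
    seq = ET.SubElement(root, fo("page-sequence"), {"master-reference": "A4"})
    flow = ET.SubElement(seq, fo("flow"), {"flow-name": "xsl-region-body"})
    for el in body:
        convert(el, flow)
    return ET.tostring(root, encoding="utf-8")  # bytes, kept in memory


if __name__ == "__main__":
    page = "<html><body><h1>Report</h1><p>Q3 totals</p></body></html>"
    print(xhtml_to_fo(page).decode("utf-8"))
```

Images, CSS, and column widths are left out entirely. Those are where most of the real work in an approach like this would go.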

So basically the most attractive choice at this juncture is Winnovative/ExpertPDF. They have the best price-to-performance ratio and they're proven to work. If Alt-Soft comes back with a fix for the single-page problem, they would move up immensely, as they support other input formats and seem to be a more professional group.
