
Re: html to pdf conversion

From: Christian Wolfgang Hujer <Christian.Hujer@itcqis.com>
Date: Tue, 12 Feb 2002 11:30:39 +0100
To: phantom kr <me_the_phantom@yahoo.com>, www-html@w3c.org
Message-ID: <16aaG5-2AbiL2C@fmrl02.sul.t-online.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

On Tuesday, 12 February 2002 07:06, phantom kr wrote:
> where can i find source code for converting html files
> to pdf format, source code being in java.

Several tools exist for that purpose.

I searched Google for "HTML Java PDF Conversion".

I found iText, an open source PDF library in Java, hosted on SourceForge,
that can already convert HTML to PDF.
The page
http://www.lowagie.com/iText/links.html
links to several other PDF engines written in Java.

But the modern way to do HTML-to-PDF conversion is the following:
1. Make sure the HTML files are valid XHTML (or at least well-formed XML).
2. Use a transformation stylesheet that transforms HTML to XSL Formatting 
Objects using XSL Transformation. Apply that stylesheet using an XSLT 
processor like xt, xalan, saxon...
3. Run a Formatting Objects engine (like FOP from James Tauber / Apache 
Group) that converts the generated FO-Tree from step 2 to PDF.
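
Steps 1 and 2 above could be sketched roughly as follows, using only the
JAXP classes that ship with the JDK. This is a minimal illustration, not a
complete converter: the class name HtmlToFoSketch, the tiny stylesheet and
the sample document are all made up for this example, and the stylesheet
only handles h1 and p elements. It also assumes the XHTML input has no
DOCTYPE, since the parser may otherwise try to fetch the DTD. Step 3 is not
shown because it needs a Formatting Objects engine such as FOP on the
classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class HtmlToFoSketch {

    // Hypothetical sample input: well-formed XHTML without a DOCTYPE.
    static final String SAMPLE_XHTML =
          "<html xmlns='http://www.w3.org/1999/xhtml'>"
        + "<head><title>Demo</title></head>"
        + "<body><h1>Hello</h1><p>Converted via XSLT.</p></body></html>";

    // Hypothetical, deliberately tiny stylesheet: one A4 page layout,
    // and h1/p mapped to fo:block. Other elements are not handled.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'"
        + " xmlns:h='http://www.w3.org/1999/xhtml'"
        + " exclude-result-prefixes='h'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'>"
        + "<fo:root><fo:layout-master-set>"
        + "<fo:simple-page-master master-name='page'"
        + " page-width='210mm' page-height='297mm'>"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<xsl:apply-templates select='h:html/h:body/*'/>"
        + "</fo:flow></fo:page-sequence></fo:root>"
        + "</xsl:template>"
        + "<xsl:template match='h:h1'>"
        + "<fo:block font-size='18pt' font-weight='bold'>"
        + "<xsl:value-of select='.'/></fo:block>"
        + "</xsl:template>"
        + "<xsl:template match='h:p'>"
        + "<fo:block font-size='11pt'>"
        + "<xsl:value-of select='.'/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Step 1: check that the input is at least well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newDocumentBuilder()
                   .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // parse error: not well-formed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Step 2: apply the stylesheet and return the FO document as a string.
    static String toFo(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xhtml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        if (!isWellFormed(SAMPLE_XHTML)) {
            System.err.println("Input is not well-formed XML");
            return;
        }
        // The printed FO document is what step 3 would feed to the FO engine.
        System.out.println(toFo(SAMPLE_XHTML));
    }
}
```

The string returned by toFo is the FO tree that a Formatting Objects engine
would then render to PDF in step 3. In a real setup the stylesheet would
live in its own file and cover far more of HTML than this one does.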

Most XSLT processors and most Formatting Objects engines, including all the
products mentioned (xt, xalan, saxon, fop), are written in Java and come with
their source code.

Keep the following points in mind:
- - Knowledge
XSLT and XSL:FO mean one to four new languages to learn, depending on your
point of view and prior knowledge. I count four altogether (XML, XPath, XSLT
and XSL:FO), but they are all quite easy languages.
- - Development speed
XSLT stylesheets and XSL Formatting Objects are quite easy to learn, and an
XSLT stylesheet takes only a little time to write.
- - Servlet usage
XSLT and XSL:FO processors can be used from within servlets.
- - Performance
A native Java PDF library might perform considerably better.
- - Configurability
XSLT and XSL:FO are highly configurable. You control nearly every pixel
(or rather, every point) of the resulting PDF.

But I've never used the non-XSL:FO way, so I can't say much about that.


Just my 2 cents.

Greetings

- -- 
Christian Wolfgang Hujer
Managing Partner
ITCQIS GmbH
Phone: +49 (089) 27 37 04 37
Fax: +49 (089) 27 37 04 39
E-Mail: mailto:Christian.Hujer@itcqis.com
WWW: http://www.itcqis.com/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8aO7TGU/Ex9kzkZ4RAp1JAJwMl2KhQtvVxUymU+GR5XUwNIKtPwCfeGU7
6o5QGx7tV8cdpcSJp73fiGk=
=/vwd
-----END PGP SIGNATURE-----
Received on Tuesday, 12 February 2002 05:34:23 GMT
