Re: Converting html to pdf with java

From: Michael A. Peters <mpeters@domblogger.net>
Date: Sat, 11 Feb 2012 14:03:20 -0800
Message-ID: <4F36E5A8.2030800@domblogger.net>
To: w3c-wai-ig@w3.org
I have to agree. I think converting web pages to PDF is on its way out.
The typesetting usually ends up being crap anyway, unless your source is
something like LaTeX and you are using tex4ht to generate the web page.

I believe that for off-line viewing of content, EPUB3 is going to quickly
replace PDF. EPUB3 uses HTML5 internally, supports some JavaScript
(I wouldn't expect Ajax etc. to work), supports the HTML5 multimedia tags,
and is not that difficult to generate from a web page.

There are many readers that can display ePub, though I don't know how
many currently handle ePub3. You can, however, provide fallbacks for many
features so that ePub2 readers can still handle the file.

The Firefox extension for ePub does seem to handle ePub3.

I expect WordPress plugins etc. that do it to show up shortly if they
don't already exist.

Anyway, converting content to ePub3 is what I'm currently coding for my
projects (not quite done yet), including backwards support for ePub2
readers where possible. When done, my class (I'm writing it in PHP) will
be MIT (Expat) licensed and released on phpclasses.org, but it will
largely be intended for content already in a sane HTML5 article
structure. I expect others will have plugins for your favorite CMS
sooner than I'm done.

That's my thought: forget HTML to PDF, go HTML to EPUB. With Java's
libxml2 bindings it shouldn't be that difficult to do in Java.
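
To give a rough idea of the packaging side in Java, here is a minimal
sketch. It is not the PHP class mentioned above, and it assumes the
article has already been cleaned up into well-formed XHTML. The class
name MinimalEpub, the OEBPS/ folder, the file names and the hard-coded
language "en" are all made up for illustration. It uses only
java.util.zip from the standard library and builds the smallest layout
an EPUB 3 reader should accept: an uncompressed "mimetype" entry first,
META-INF/container.xml, a package document, a navigation document, and
the article itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.UUID;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only; names and layout are assumptions.
final class MinimalEpub {

    // Escape the characters that would break XML text content.
    static String esc(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Add one UTF-8 text file to the archive with the default (deflate) method.
    static void add(ZipOutputStream zip, String path, String text) throws IOException {
        zip.putNextEntry(new ZipEntry(path));
        zip.write(text.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    // Pack a single XHTML article into an EPUB 3 container and return its bytes.
    static byte[] build(String title, String articleXhtml) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buf)) {
            // "mimetype" has to be the first entry and must be stored, not deflated.
            // STORED entries need size and CRC set before putNextEntry().
            byte[] mt = "application/epub+zip".getBytes(StandardCharsets.US_ASCII);
            CRC32 crc = new CRC32();
            crc.update(mt);
            ZipEntry first = new ZipEntry("mimetype");
            first.setMethod(ZipEntry.STORED);
            first.setSize(mt.length);
            first.setCompressedSize(mt.length);
            first.setCrc(crc.getValue());
            zip.putNextEntry(first);
            zip.write(mt);
            zip.closeEntry();

            // container.xml tells the reader where the package document lives.
            add(zip, "META-INF/container.xml",
                "<?xml version=\"1.0\"?>\n"
              + "<container version=\"1.0\" xmlns=\"urn:oasis:names:tc:opendocument:xmlns:container\">\n"
              + "  <rootfiles>\n"
              + "    <rootfile full-path=\"OEBPS/package.opf\" media-type=\"application/oebps-package+xml\"/>\n"
              + "  </rootfiles>\n"
              + "</container>\n");

            // Package document: metadata, manifest (all files), spine (reading order).
            String modified = Instant.now().truncatedTo(ChronoUnit.SECONDS).toString();
            add(zip, "OEBPS/package.opf",
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
              + "<package xmlns=\"http://www.idpf.org/2007/opf\" version=\"3.0\" unique-identifier=\"uid\">\n"
              + "  <metadata xmlns:dc=\"http://purl.org/dc/elements/1.1/\">\n"
              + "    <dc:identifier id=\"uid\">urn:uuid:" + UUID.randomUUID() + "</dc:identifier>\n"
              + "    <dc:title>" + esc(title) + "</dc:title>\n"
              + "    <dc:language>en</dc:language>\n"  // assumption: English content
              + "    <meta property=\"dcterms:modified\">" + modified + "</meta>\n"
              + "  </metadata>\n"
              + "  <manifest>\n"
              + "    <item id=\"nav\" href=\"nav.xhtml\" media-type=\"application/xhtml+xml\" properties=\"nav\"/>\n"
              + "    <item id=\"article\" href=\"article.xhtml\" media-type=\"application/xhtml+xml\"/>\n"
              + "  </manifest>\n"
              + "  <spine>\n"
              + "    <itemref idref=\"article\"/>\n"
              + "  </spine>\n"
              + "</package>\n");

            // EPUB 3 navigation document (table of contents).
            add(zip, "OEBPS/nav.xhtml",
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
              + "<html xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:epub=\"http://www.idpf.org/2007/ops\">\n"
              + "<head><title>Contents</title></head>\n"
              + "<body><nav epub:type=\"toc\"><ol>\n"
              + "  <li><a href=\"article.xhtml\">" + esc(title) + "</a></li>\n"
              + "</ol></nav></body>\n"
              + "</html>\n");

            // The article itself, assumed to be well-formed XHTML already.
            add(zip, "OEBPS/article.xhtml", articleXhtml);
        }
        return buf.toByteArray();
    }
}
```

Writing the result out is then a one-liner, for example
Files.write(Paths.get("article.epub"), MinimalEpub.build(title, xhtml)).
ePub2 fallbacks (such as an NCX table of contents) are left out here to
keep the sketch short, and the output should still be run through a
validator before you rely on it.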

On 02/10/2012 06:36 PM, Adam Cooper wrote:
> Hi George,
>
> The question is not whether it is possible, but why would you bother.
> The majority of PDFs on the web are derived from electronic source
> documents, do not utilise the content security features offered by the
> platform, and creating accessible PDFs is time-consuming, incurs time
> and monetary costs, and is beyond the skill level of most casual content
> creators, so I struggle to find compelling reasons why there is a need
> to use PDF at all, especially when there are tools and methods in
> existing non-proprietary technologies such as (X)HTML, CSS, and JS etc.
> which offer comparative content securing and (print) formatting
> functionality.
>
> Cheers,
>
> Adam
Received on Saturday, 11 February 2012 22:03:47 GMT