Re: final report of the Web-Based Education Commission

From: David Woolley <david@djwhome.demon.co.uk>
Date: Thu, 9 Aug 2001 23:10:42 +0100 (BST)
Message-Id: <200108092210.f79MAgU02902@djwhome.demon.co.uk>
To: w3c-wai-ig@w3.org
> I've never understood the attraction of PDF.

The attraction to authors is that it produces repeatable presentation,
which is what designers want and try to force HTML to achieve (often
making unsafe assumptions in the process).  I'd rather designers were
honest with themselves and used a tool that matched what they were
trying to achieve, than forced HTML to be something other than HTML.
When they are not expending all their effort fighting the medium, they
might just spend some of it on accessibility.

> Sure they print out nicely compared to many other formats, but they
> are a pain to read online and the files are quite often coffee-files
> (A file which gives you time to make a fresh pot of coffee while it
> downloads).

The bloat is not the fault of PDF.  A file optimised for PDF should be
about the same size as a .zip of the equivalent web site (smaller if
commercial art has had to be converted to raster form for the web site).
I find PDF a handy format for large documents that I want to be able to
search in their entirety, and, until SVG (which could be considered a PDF
derivative!) is generally available, it's the only reasonable portable
way of efficiently coding technical diagrams.  It's especially useful
for documents that are accessed offline.  The outline tree (bookmark)
display is also useful.

The bloat comes from two main sources:

- PDFs of old documents are essentially just bitmap scans of each page
  with a small structuring layer on top - if you are lucky there will
  also be an OCRed text underlay;
- PDFs created from word processors reflect the bloat that is in the 
  PostScript from those word processors, e.g. modern versions of MSWord
  place each character individually, whereas optimal PDF would output a
  sentence at a time with a stretch factor - this also improves
  accessibility, as each line is plain text. 
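
To make the second point concrete, here is a rough sketch, not taken from
any real word processor's output, that builds two PDF content-stream
fragments for the same line of text and compares their sizes. The operators
(BT, ET, Tf, Tm, Td, Tw, Tj) are standard PDF text operators; the helper
names, the font resource /F1, the fixed 6 unit advance and the 1.5 unit
word spacing are assumptions made up for illustration. "Stretch factor" is
interpreted here as the Tw word-spacing operator; Tc or Tz would serve the
same purpose.

```python
def escape_pdf_string(text):
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def per_character_stream(line, x, y, advance, font="/F1", size=12):
    """Hypothetical bloated style: every glyph gets its own text matrix."""
    ops = ["BT", f"{font} {size} Tf"]
    for i, ch in enumerate(line):
        # One Tm (set position) plus one Tj (show text) per character.
        ops.append(
            f"1 0 0 1 {x + i * advance:.2f} {y:.2f} Tm "
            f"({escape_pdf_string(ch)}) Tj"
        )
    ops.append("ET")
    return "\n".join(ops)


def per_line_stream(line, x, y, word_spacing, font="/F1", size=12):
    """Compact style: one Tj for the whole line, stretched with Tw."""
    ops = [
        "BT",
        f"{font} {size} Tf",
        f"{word_spacing:.2f} Tw",   # extra space added at each word gap
        f"{x:.2f} {y:.2f} Td",      # move to the start of the line
        f"({escape_pdf_string(line)}) Tj",
        "ET",
    ]
    return "\n".join(ops)


line = "The bloat is not the fault of PDF."
chars = per_character_stream(line, x=72, y=720, advance=6.0)
whole = per_line_stream(line, x=72, y=720, word_spacing=1.5)
print(len(chars), "bytes per-character;", len(whole), "bytes per-line")
```

In the per-line fragment the text survives as one contiguous string, which
is what makes it easy to search and to extract for a screen reader; in the
per-character fragment a reader has to reassemble words from glyph
positions.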
Received on Thursday, 9 August 2001 18:19:36 UTC