Discussion of PDF conversion in latest draft from Jason White on 1998-11-18 (w3c-wai-gl@w3.org from October to December 1998)

From: Jason White <jasonw@ariel.ucs.unimelb.EDU.AU>
Date: Wed, 18 Nov 1998 17:34:23 +1100 (AEDT)
To: WAI Markup Guidelines <w3c-wai-gl@w3.org>
Message-ID: <Pine.SUN.3.95.981118172148.7496D-100000@ariel.ucs.unimelb.EDU.AU>

The treatment of publicly available PDF conversion tools in the latest
draft of the guidelines has disturbing implications, in so far as it
suggests that simply extracting the text from a PDF file and making it
available as HTML, without any further editing, is sufficient to create an
adequately accessible version. It is my understanding that none of the
available PDF conversion systems can actually recognise the inherent
structure of the document; they certainly cannot automatically generate
ALT text and descriptions of images that may be present in the original
file.

PDF conversion yields a minimally accessible document, at best. It would
then need substantial editing to introduce proper structural markup, to
provide textual equivalents for visual content, to mark up tables
appropriately, etc.
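
As a rough illustration of the kind of gaps I mean, the following Python
sketch counts a few structural signals in an HTML file: images lacking an
ALT attribute, headings, and table header cells. The class name, the
function name and the sample input are all hypothetical, and a count of
this sort is only a crude indicator, not a measure of accessibility.

```python
from html.parser import HTMLParser


class AccessibilityAudit(HTMLParser):
    """Collect rough signals of missing structure in converted HTML."""

    def __init__(self):
        super().__init__()
        self.images_without_alt = 0
        self.headings = 0
        self.table_header_cells = 0

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        attr_names = {name for name, _ in attrs}
        if tag == "img" and "alt" not in attr_names:
            self.images_without_alt += 1
        elif tag in {"h1", "h2", "h3", "h4", "h5", "h6"}:
            self.headings += 1
        elif tag == "th":
            self.table_header_cells += 1


def audit(html_text):
    """Return a dictionary of counts for the given HTML string."""
    parser = AccessibilityAudit()
    parser.feed(html_text)
    parser.close()
    return {
        "images_without_alt": parser.images_without_alt,
        "headings": parser.headings,
        "table_header_cells": parser.table_header_cells,
    }


# Hypothetical result of a text-only conversion: bare paragraphs and an
# image with no textual equivalent.
converted = '<p>Annual Report</p><p>Sales rose.</p><img src="chart.gif">'
print(audit(converted))
```

A file that reports no headings and one or more images without ALT text
would still need the manual editing described above.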

The same comment also applies to conversion from PostScript, RTF (except
where style sheets are used carefully), word processor formats, etc.,
where the original format does not preserve the structure, which then
needs to be re-introduced manually. I am concerned that authors may think
they can run a straightforward conversion and thereby overcome the access
problem.

Of course, the real solution is to create PDF, HTML, etc. in parallel,
starting from a well marked-up source file (in XML, for example).
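
A minimal sketch of that approach, in Python, might look like the
following. The element names (doc, title, para, figure) and the desc
attribute are invented for the example and do not belong to any real
document type; the point is only that structure and image descriptions
recorded once in the source carry through to every output format.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical structured source; the vocabulary is made up for this sketch.
SOURCE = """<doc>
  <title>Annual Report</title>
  <para>Sales rose during the year.</para>
  <figure src="chart.gif" desc="Bar chart of quarterly sales"/>
</doc>"""


def to_html(root):
    """Render the source as HTML, keeping headings and ALT text."""
    parts = []
    for node in root:
        if node.tag == "title":
            parts.append(f"<h1>{escape(node.text or '')}</h1>")
        elif node.tag == "para":
            parts.append(f"<p>{escape(node.text or '')}</p>")
        elif node.tag == "figure":
            src = escape(node.get("src", ""), quote=True)
            alt = escape(node.get("desc", ""), quote=True)
            parts.append(f'<img src="{src}" alt="{alt}">')
    return "\n".join(parts)


def to_text(root):
    """Render the same source as plain text, describing figures in words."""
    parts = []
    for node in root:
        if node.tag == "title":
            title = node.text or ""
            parts.append(title + "\n" + "=" * len(title))
        elif node.tag == "para":
            parts.append(node.text or "")
        elif node.tag == "figure":
            parts.append(f"[Figure: {node.get('desc', '')}]")
    return "\n\n".join(parts)


root = ET.fromstring(SOURCE)
print(to_html(root))
print()
print(to_text(root))
```

Both outputs are generated from the same source, so neither has to be
reconstructed from the other after the fact.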

None of these comments is intended to detract from the high value of
Adobe's accessibility efforts and their demonstrated commitment in this
regard. PDF is just being cited as one among many examples.

Received on Wednesday, 18 November 1998 01:34:28 UTC