
Re: PDF to HTML conversion

From: Kelly Ford <kford@teleport.com>
Date: Wed, 19 Aug 1998 21:57:45 -0700
Message-Id: <3.0.5.32.19980819215745.00806d00@mail.teleport.com>
To: w3c-wai-ig@w3.org
This is not meant as a complaint, but rather as a point of clarification.

While it is indeed true that Adobe offers assorted methods for converting
.pdf files into text, folks should not consider this a reliable method of
access in all settings.  I've used these various options since they were
released to the public and the results are quite mixed.  You can usually
get the text of the file in question, but the actual layout and
presentation are sometimes another story.

I used to routinely convert a little 8-page version of the New York
Times, and even something that simple had some odd quirks.  As a simple
example, the headline for the first story ended up in the file after the
end of the second story.

Many complicated documents, such as catalogs, statistical tables and
investment documents, come out almost useless after conversion.

Yes, I am happy there is this option, but in my view it doesn't work
reliably enough to be counted on.
Received on Thursday, 20 August 1998 00:51:17 GMT
