- From: Dave J Woolley <DJW@bts.co.uk>
- Date: Wed, 2 Aug 2000 20:29:47 +0100
- To: "'w3c-wai-ig@w3.org'" <w3c-wai-ig@w3.org>
> From: Waddell, Cynthia [SMTP:cynthia.waddell@ci.sj.ca.us]
>
> The problem with PDF at this time is that screenreaders are unable to read
> the text as well as fill in the forms of PDF documents. This is the basis

[DJW:] I think I really need longer than I can spare to cover this well, but, whilst I accept that a well designed HTML document will generally be better than current generation PDF, I think there is a tendency:

- to compare HTML as Tim Berners-Lee intended it with PDF in real life;
- to confuse tools and formats;
- to not consider the nature of the source material and the clerical procedures associated with it.

On the first, most people, including, I suspect, many people interested in communicating information, treat HTML as a WYSIWYG language. As it is not, the result of trying to force it into that mould can be worse than the result of using a language that is intended for page layout.

On tools versus formats. The PDF format is actually designed to make linear reading of text quite easy, and I doubt that much is needed to handle forms in a screen reader context. However, it is possible that the reading tools don't give adequate interfaces for screen readers, and it is certainly true that most of the authoring tools (i.e. standard word processors) make finding word boundaries difficult, by placing characters individually. (Even so, if you have material in PostScript or on paper, converting the PostScript to PDF will produce a document with 100% correct character identification, whereas using OCR on the paper document will misread many characters.)

As to source material. If you only have hard copy and you have an imperative to reproduce it accurately, you would use GIF with HTML in the contexts where you would use scanned material with PDF. PDF can actually do better, by matching the OCRed text with the image.

Basically, before worrying about HTML versus PDF, you must first convert from paper to machine readable documents and then to electronic submission.
> Adobe has committed to finding ways to incorporate structure into a PDF
> document upon creation and we all welcome that effort. The first website
>

[DJW:] The current PDF specification allows text to be annotated with structure information; however, as well as having the tools to create and use this, you also need people who can think other than WYSIWYG, and they are very rare.

You really only need to undo damage from printer driver microspacing in order to recover the contents of textual documents to the same quality as the average <font face...><br>++ flat HTML that most people write.

[DJW:] Basically, in many contexts, plain text is the most accessible format, in some you need features from HTML or other structural markup languages, and in some, PDF may be the most practical format, without expending a lot of skilled labour on reworking clerical procedures and marking up documents. Government funded bodies tend to have limited amounts of this; commercial bodies look for a return on investment that exceeds what they could get by using those resources in other ways; and people in between (like "agencies" in the UK sense) tend to work like commercial organisations, but minimising costs, rather than maximising profits.

++ Even with fully revisable WP documents, it is not unknown for professional people to tab round the end of the line to get a new line.

--
--------------------------- DISCLAIMER ---------------------------------
Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of BTS.
Received on Wednesday, 2 August 2000 15:29:54 UTC