- From: T. V. Raman <raman@Adobe.COM>
- Date: Wed, 2 Sep 1998 10:32:28 -0700 (PDT)
- To: Robert Neff <rcn@fenix2.dol-esa.gov>
- Cc: "w3c-wai-ig@w3.org" <w3c-wai-ig@w3.org>, raman@Adobe.COM
Robert Neff writes:

Robert --

In general PDF forms convert okay to the extent that you get the
content in the conversion -- but the form fields are not made active
in the HTML.

The biggest nightmare scenario is one where a paper form is scanned in
to create an electronic version where portions are subsequently made
"active form fields". If the creator does not take the effort to
ensure that the piece of paper is being OCR-ed correctly, there is
little one can do later to make the form accessible.

From the various error reports we have received over the last 18
months, the majority of PDF documents that convert poorly on some
Federal WWW sites appear to have originated from an OCR-based workflow
-- and things vary all over the map -- from PDF files where in fact no
OCR was done, i.e. the page just has a scanned image -- to PDF files
that were probably produced from scanned images that were not
necessarily very clean. In these cases you either see nothing in the
output of the convertor (the convertor *does* *not* do OCR) or fairly
poor quality text -- the text is of course a direct function of the
OCR results.

These kinds of problems are particularly hard to explain to someone
who is making a check-off decision on "is it accessible" -- in general
this is a problem that the accessibility field faces all over the map.

For a talk I put together on how to design accessible WWW publishing
solutions using all of today's technologies including HTML, CSS and
PDF (given the constraints of time, cost and user benefits) see
http://cs.cornell.edu/home/raman/publications/talks/gsa-98/

The main point is to recognize what you point out below -- not all PDF
files are created equal. PDF documents get created today via a number
of creation paths -- ranging from drawing applications to
word-processors to document scanners -- and the mileage from the
access conversion varies accordingly.
On the one hand it's frustrating to run into a PDF document on the WWW
that converts poorly and is consequently inaccessible -- but it is
important to examine how those documents originated -- rather than
simply blaming it on the file format. If we walked up to the average
Webmaster and said "don't publish those scanned PDFs -- but HTML is
accessible to the blind" -- the average Webmaster would simply create
a bunch of scanned GIFs and hook them into an HTML page.

On the other hand, documents created from word processing applications
do perform reliably when processed by the access convertor -- the
amount of usage the service gets bears witness to this fact.

Finally, the purpose of the access service is not, as some on this
list have alleged, an "attempt to pay lip service to accessibility" --
it's a service that makes information accessible that would otherwise
remain out of reach.

For those with long memories, I originally raised the question of
accessibility to final form content such as PDF in 1993 -- at the time
I was laughed off of several blindness related lists by blind users
who made statements of the form "We called WordPerfect and they said
they would support it -- so it will be accessible to us" -- and other
similar foolish assertions.

I came to Adobe in 1995 and worked on the access problem because I
felt that even if I achieved 30% of what I achieved in the world of
LaTeX documents (see http://cs.cornell.edu/home/raman for the work on
audio formatting of structured documents) with PDF documents, I would
have a greater impact on the amount of information that is accessible
-- simply because I would be making a 30% improvement to 80% of the
world's documents -- as opposed to a 90% improvement to about 1% of
the world's documents. I still believe in this decision -- otherwise I
would not be doing what I am currently doing.
At the time I started working on better access to PDF, I had also
gotten completely disillusioned with the visual presentation slope
down which HTML was sliding -- look at the mess on the WWW today and
you will see the realization of the nightmare I foresaw.

All is not gloom and doom though -- the advent of XML and CSS and ACSS
should help us recover some of the ground we lost.

> This is a reply to Kelly:
>
> I agree. Not all files are seamlessly converted. You need to be aware
> what you are converting. If it is a document with text, then it should be
> ok. However if it is a form, it may not convert well. You should test
> before you post!
>
> Does anyone know of PDF Form (where you can enter data)? Would like to
> see how these convert.
>
>
> -----Original Message-----
> From: Kelly Ford [SMTP:kford@teleport.com]
> Sent: Tuesday, September 01, 1998 7:57 PM
> To: w3c-wai-ig@w3.org
> Subject: RE: Re: Adobe And TRACE Launch Enhanced PDF Access Via Email
>
> I don't want to start a debate here but my personal opinion is that all
> these converters do as much harm as good. If you ask me pdf is a
> problematic file format at best for people that are blind. I recently
> tried to convert a batch of IRS documents and the results were disastrous.
> Yet the person at the IRS knew all about these converters and pointed me
> directly to them and seemed oh so very pleased that the documents would be
> made accessible by this convert technology.

--
Best Regards,
--raman

      Adobe Systems                 Tel: 1 (408) 536 3945   (W14-612)
      Advanced Technology Group     Fax: 1 (408) 537 4042
      (W14 129)    345 Park Avenue  Email: raman@adobe.com
      San Jose, CA 95110-2704       Email: raman@cs.cornell.edu
      http://labrador.corp.adobe.com/~raman/        (Adobe Intranet)
      http://cs.cornell.edu/home/raman/raman.html   (Cornell)
----------------------------------------------------------------------
Disclaimer: The opinions expressed are my own and in no way should be
taken as representative of my employer, Adobe Systems Inc.
____________________________________________________________
Received on Wednesday, 2 September 1998 13:31:33 UTC