
Re: Removing PDFs and accessibility

From: David Woolley <forums@david-woolley.me.uk>
Date: Mon, 26 Mar 2012 23:17:12 +0100
Message-ID: <4F70EAE8.1040704@david-woolley.me.uk>
To: "w3c-wai-ig@w3.org" <w3c-wai-ig@w3.org>
Ozi, Selim wrote:

> Can Adobe allow the user to choose which PDF format to create/save/view?
> Something like below:
> 1- PDF/I  = image PDF
> 2- PDF/O  = OCR
> 3- PDF/TR = Tagged/touchup reading ordered
> 

Generally I would only expect image and OCR to be used when an important 
document did not exist in machine readable form.  OCR requires more 
expensive tooling and more man hours than image.  Denying the use of 
these formats is unlikely to result in more accessible forms of the 
documents, but rather in their not being available online at all.

A precondition for tagged PDF is that the original document was written 
by someone who understood how to do proper semantic markup (a skill 
lacking in many HTML authors).  In practice it also requires that the 
revisable form be an MS Word or HTML document and that the person 
creating the PDF was prepared to pay for the Adobe PDF authoring tools, 
rather than using free alternatives.

Any heuristic tagging of PDF could also be done by the assistive technology.

(Image might sometimes be used to try to protect intellectual property.)

In my experience it is very rare to find an image or OCR document when 
the revisable form is machine readable and available.  (What is rather 
more common is for graphics to be embedded in DCT (~JPEG) form when they 
would be better in vector format, but that error happens before the 
document is ever presented to the PDF authoring tools.)

-- 
David Woolley
Emails are not formal business letters, whatever businesses may want.
RFC1855 says there should be an address here, but, in a world of spam,
that is no longer good advice, as archive address hiding may not work.