
Re: PDF accessibility and complex script languages.

From: Olaf Drümmer <olaflist@callassoftware.com>
Date: Wed, 6 Jan 2016 01:58:46 +0100
Cc: Olaf Drümmer <olaflist@callassoftware.com>, Andrew Cunningham <andj.cunningham@gmail.com>, w3c WAI List <w3c-wai-ig@w3.org>
Message-Id: <BB1F873D-6FAD-4D27-A04E-30CAB11B5174@callassoftware.com>
To: Andrew Kirkpatrick <akirkpat@adobe.com>
Hi Andrew,

> On 05.01.2016, at 02:41, Andrew Kirkpatrick <akirkpat@adobe.com> wrote:
> ActualText is NEVER EVER EVER used to represent the results of OCR – that would be a violation of the standard.

I’d love to learn how you arrived at this statement. AFAICT it can’t be derived from any of the PDF standards I know.

Just to make an extreme point: if each and every (recognized) character in an OCR-ed document were represented by an ActualText attribute, then on the formal level of the applicable standards (PDF per ISO 32000-1 or PDF/UA per ISO 14289-1) nothing would be in violation. Whether such an approach makes any sense is a completely different story.
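To make the extreme case concrete, here is a rough sketch of what "one ActualText per recognized character" could look like at the content-stream level. It assumes the marked-content form of ActualText (a /Span property list between BDC and EMC) and writes the replacement text as a UTF-16BE hex string with a byte order mark. The function names and the placeholder drawing operators are hypothetical and only serve as illustration; this is not how any particular OCR tool writes its output.

```python
def pdf_text_hex(text: str) -> str:
    """Encode text as a PDF hex string in UTF-16BE with a leading BOM (FEFF)."""
    return "<FEFF" + text.encode("utf-16-be").hex().upper() + ">"


def actualtext_span(recognized: str, drawing_ops: str) -> str:
    """Wrap drawing operators in a marked-content sequence carrying ActualText.

    `drawing_ops` is a placeholder for whatever paints the scanned character
    (an image XObject, invisible text, ...); it is passed through unchanged.
    """
    return (
        f"/Span << /ActualText {pdf_text_hex(recognized)} >> BDC\n"
        f"{drawing_ops}\n"
        "EMC"
    )


def per_character_spans(recognized_chars, drawing_ops="/Im0 Do"):
    """Extreme case: one ActualText span for each recognized character."""
    return "\n".join(actualtext_span(ch, drawing_ops) for ch in recognized_chars)


if __name__ == "__main__":
    # Two recognized characters, each painted by a hypothetical image XObject.
    print(per_character_spans("OK"))
```

Nothing in this sketch says that such output is a good idea; it only shows that the construct itself is ordinary marked content.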

Received on Wednesday, 6 January 2016 00:59:12 UTC
