Re: PDF accessibility and complex script languages. from Andrew Cunningham on 2016-01-04 (w3c-wai-ig@w3.org from January to March 2016)

From: Andrew Cunningham <andj.cunningham@gmail.com>
Date: Tue, 5 Jan 2016 08:10:05 +1100
To: Duff Johnson <duff@duff-johnson.com>
Cc: WAI Interest Group <w3c-wai-ig@w3.org>
Message-ID: <CAOUP6KnrPHhwYfq-gtVtBTQo84RXg_dbjfY7Eyfb+nAFMON0hg@mail.gmail.com>

On 5 Jan 2016 7:25 am, "Duff Johnson" <duff@duff-johnson.com> wrote:
>
> It’s typical for PDF consuming tools that extract or process text to use
> ActualText. Software that cannot process ActualText cannot be claimed to
> support accessible (i.e., tagged) PDF.
>

I suspect I need to do more testing.

Image with ActualText seems to work fine.

The combination of text AND ActualText seems to need more testing,
particularly for text extraction and searching. Even something as simple as
cutting and pasting from a PDF seems to be problematic.

When I have cut and pasted from a test PDF, for instance, the operation
seems to occur at the glyph/text level, and not at the ActualText level.
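
To illustrate the difference, here is a minimal Python sketch. It is a toy
model and not a real PDF parser: the content-stream fragment, the regular
expressions and both function names are assumptions made up for this
example, modeled on the familiar hyphenation case where the glyphs shown
are "Druk-ker" but the intended text is "Drucker". Real files use font
encodings, hex strings and nested marked content that this ignores.

```python
import re

# Hypothetical, heavily simplified content-stream fragment.
# The span's glyphs are "k-", but its ActualText says the real text is "c".
STREAM = """
BT
(Dru) Tj
/Span <</ActualText (c)>> BDC
(k-) Tj
EMC
(ker) Tj
ET
"""

# A marked-content span that carries an ActualText entry.
SPAN = re.compile(
    r"/Span\s*<<\s*/ActualText\s*\((.*?)\)\s*>>\s*BDC(.*?)EMC", re.S
)
# A literal string shown with the Tj operator.
SHOW = re.compile(r"\((.*?)\)\s*Tj")


def glyph_level_text(stream):
    """Collect only the strings shown by Tj, ignoring ActualText."""
    return "".join(SHOW.findall(stream))


def actualtext_aware_text(stream):
    """Substitute each tagged span's ActualText for the glyphs inside it."""
    replaced = SPAN.sub(lambda m: "(" + m.group(1) + ") Tj", stream)
    return "".join(SHOW.findall(replaced))


print(glyph_level_text(STREAM))       # Druk-ker
print(actualtext_aware_text(STREAM))  # Drucker
```

The first function corresponds to the behaviour I seem to be seeing on
copy and paste; the second is what I would expect from software that
honours ActualText.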

I guess I need to try alternative software, and map out what would work
best for each language.

Andrew

Received on Monday, 4 January 2016 21:10:32 UTC