Re: PDF math accessibility continues move toward ISO standardization

In PDF, accessibility is mainly achieved via tagging the PDF so that
sections, lists, tables, etc., can be known to assistive technology (AT).
This tagging is done via a "structure tree" that is separate from, but points
to, the actual content tree.  The ISO proposal adds the MathML presentation
tags (being PDF, it is presentation) to the set of predefined tags in PDF.
This means that not only is the math marked as "math" (actually "formula" to
reuse an existing tag), but the substructure is marked so that it can be
synchronously highlighted and navigated.  In PDF, tagging of math or
anything else is optional.
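
To make the structure-tree idea concrete, here is a minimal sketch in Python.
It is not PDF syntax and not taken from the ISO proposal: the class
StructElem, the content dictionary, and the helper functions are all
hypothetical names invented for illustration.  It only shows the general
shape: a "Formula" element whose MathML-tagged children point, by ID, at
separately stored content.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StructElem:
    """One node of a simplified, hypothetical structure tree."""
    tag: str                     # e.g. "Formula", "mfrac", "mi"
    mcid: Optional[int] = None   # pointer into the content (leaves only)
    kids: List["StructElem"] = field(default_factory=list)


# Stand-in for the content: IDs mapped to the text that is actually drawn.
content = {0: "x", 1: "2", 2: "y"}

# Structure for the fraction x^2 / y.  The outer tag is "Formula" (the
# reused existing tag); the substructure uses MathML presentation tags.
formula = StructElem("Formula", kids=[
    StructElem("mfrac", kids=[
        StructElem("msup", kids=[
            StructElem("mi", mcid=0),
            StructElem("mn", mcid=1),
        ]),
        StructElem("mi", mcid=2),
    ]),
])


def content_ids(elem: StructElem) -> List[int]:
    """Collect, in order, the content pointers under a structure element.

    This is the kind of lookup a viewer would need in order to highlight
    the glyphs that belong to one subexpression.
    """
    ids = [] if elem.mcid is None else [elem.mcid]
    for kid in elem.kids:
        ids.extend(content_ids(kid))
    return ids


def to_mathml(elem: StructElem) -> str:
    """Serialize the tagged substructure as a MathML string."""
    name = "math" if elem.tag == "Formula" else elem.tag
    if elem.mcid is not None:
        body = content[elem.mcid]
    else:
        body = "".join(to_mathml(kid) for kid in elem.kids)
    return f"<{name}>{body}</{name}>"


print(content_ids(formula.kids[0].kids[0]))  # content under the superscript
print(to_mathml(formula))
```

Because the substructure is tagged, a reader could in principle highlight
just the superscript (content IDs 0 and 1 here) or hand the whole formula to
AT as MathML.  Without the tags there is only the flat content to work with.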

Separately, the PDF/UA committee is working on a document that describes
what an accessible PDF document looks like.  In that proposal (which will be
presented by the US NISO delegation at an ISO meeting in Hamburg), tagging
is required.  That includes tagging the math.

Neither the PDF spec (ISO 32000-1) nor the PDF/UA proposal makes requirements
on the user agents beyond following the spec.  Actually, that's a bit of a
lie since the PDF/UA proposal has a section on "conforming readers" which
basically says they must do things in an accessible manner, have accessible
user interfaces, etc.  None of this specifies copy/paste abilities, other
than that if such an ability is present, it must be accessible (e.g., have
keyboard methods for doing the selection and copy).

There are several members from Adobe on the PDF/UA committee and Adobe has
publicly stated that they want Acrobat and Adobe Reader to be as
accessible as they can be.  How the PDF/UA proposal will play out with
Preview or other PDF viewers is unknown.  Governments have mandates for
accessibility and it is possible they could adopt whatever spec ISO approves
regarding accessibility... or they may not.  There is no requirement that I
know of that mandates that browsers be accessible, yet most browser makers
have worked hard to make their products accessible.  I certainly hope the same
holds true for PDF viewers.

I'd really love to see pdftex produce accessible PDF.  Postings on that
group have said that tagging (in general) PDF from TeX is hard.  I'm sure it
has something to do with the structure being gone by the time the PDF
generator gets to it.  There are tools that get MathML from the TeX, but
getting them to work with pdftex is not something that anyone has done yet,
as far as I know.

There are lots of tricks one can play with PDF (it is scriptable), but those
don't always mean that the PDF is accessible.  Two years ago, we
demonstrated a plug-in for Acrobat that essentially brings to it MathPlayer
capabilities.  That includes the ability to copy the MathML for pasting
elsewhere.  However, it's not useful because authoring tools such as pdftex
don't tag PDF with MathML (hence, no content for it to work with).

It would be great if wizards such as Ross Moore would apply their incredible
talents towards adding MathML tags to the generated PDF, and more generally,
tag all of the PDF so that it is accessible.  Ross and I exchanged some
thoughts on this in the tug mail list a year ago (
www.tug.org/mail-archives/pdftex/2007-November/007434.html).  There is also
another thread about an experiment for tagged PDF from pdftex (
www.tug.org/pipermail/pdftex/2008-April/007629.html).  I haven't seen follow
up, so I suspect the experiment was not successful :-(

Neil Soiffer
Senior Scientist
Design Science, Inc.
www.dessci.com
~ Makers of Equation Editor, MathType, MathPlayer and MathFlow ~


On Fri, Dec 26, 2008 at 12:21 PM, Paul Libbrecht <paul@activemath.org> wrote:

>
> On Planet MathML (http://www.w3.org/Math/planet/) I found the very
> interesting following article:
>
>> PDF math accessibility continues move toward ISO standardization
>> Planet MathML - W3C 18/11/08 00:13 Neil Soiffer
>> http://accessiblemath.dessci.com/
>> A few weeks ago, I wrote about the first step towards making PDF math
>> accessibility an ISO standard.  I said that the international ISO meeting in
>> Beijing was going to consider a proposal for including MathML tags into PDF
>> (officially known as ISO 32000).  This was a proposal that Design Science
>> made to the PDF/UA committee, who approved it and sent it to the US ISO
>> committee who also approved it.
>>
>> The US ISO committee presented the proposal along with other items to make
>> PDF documents more accessible to the international ISO meeting and I'm very
>> pleased to report that the MathML proposal and most of the other
>> accessibility proposals were accepted.  It will probably be two or three
>> more years before these become part of an official ISO standard.  Objections
>> might be raised later on, but for now (and hopefully forever), it is part of
>> the ISO 32000-2 proposal.  This is a big step forward for math accessibility
>> of PDF documents.
>>
>> To find out more about this and other accessibility work that Design
>> Science is involved in, take a look at our article How is Design Science
>> making math more accessible? and the other accessibility solutions pages at
>> the Design Science website.
>>
>
> I right away asked Ross Moore about it. Ross is a wonderful magician for
> PDF-from-TeX who nowadays has managed to get the Australian society's works
> to even do, in part, copy-and-paste of math-formulæ from PDF, as TeX sources
> or Unicode chars. This is working impressively (see, e.g., a PDF from
> http://www.austms.org.au/Bulletin)
>
> I'm interested to ask Neil Soiffer, author of this blog-post, on this forum
> about the potential impact of putting MathML in PDF within an ISO standard.
> Is there anything planned to dictate the behaviour of selections and
> copy-and-paste? Are most of the PDF "players-implementors" ready to
> implement such? (for my world that would include Adobe for Acrobat, xpdf
> implementors, and Apple for Preview).
>
> thanks in advance
>
> paul

Received on Thursday, 15 January 2009 10:19:55 UTC