Re: proposed ARIA role for math [DRAFT 1] from Simon Pieters on 2008-03-13 (public-pfwg-comments@w3.org from January to March 2008)

From: Simon Pieters <simonp@opera.com>
Date: Thu, 13 Mar 2008 10:17:01 +0100
To: "Neil Soiffer" <Neils@dessci.com>
Cc: "Aaron M Leventhal" <aleventh@us.ibm.com>, brewer@w3.org, "Gregory J. Rosmaita" <oedipus@hicom.net>, public-pfwg-comments@w3.org, unagi69@concentric.net, w3c-wai-pf@w3.org, w3c-wai-pf-request@w3.org
Message-ID: <op.t7x6qnijidj3kv@hp-a0a83fcd39d2.oslo.opera.com>
On Thu, 13 Mar 2008 07:50:19 +0100, Neil Soiffer <Neils@dessci.com> wrote:

>> But interoperability for all cases is desired, not just the common cases.
>
>
> Anyone, including the document an expression comes from can create or
> delete macros.  Math is "just" another part of TeX that can be modified,
> contain embedded non-math TeX, macro definitions, etc.  Interoperability
> of arbitrary (aka all) TeX is simply not a feasible option.

OK, then I guess what I want is a fixed subset of TeX that covers only
math mode, with no way to leave math mode.


> Potentially naming the macro packages used (there can be multiple ones)
> might be of some use, but as there are literally thousands of packages
> (tens of thousands?), that's probably not useful either.

I think it would be useful, because if we don't do it, the implementors  
will have to do it, and they will likely choose different ones, which  
results in interoperability problems.


> Most macro packages don't modify math mode, but since you can go back
> into text mode in math mode, you could make use of them.

It would be possible to define a preprocessing layer that would either
ignore the entire expression if it tried to go into text mode, or
truncate the expression at the point where it tries to go into text mode.
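
A minimal sketch of the two preprocessing options, assuming that "going
into text mode" can be detected by looking for a fixed list of commands.
The list TEXT_MODE_COMMANDS and both function names are hypothetical and
only for illustration; the list is incomplete, and the matching is naive
(for example, it does not treat "\\" followed by "text" specially).

```python
import re
from typing import Optional

# Illustrative, incomplete list of commands that switch to text mode.
# This list is an assumption, not a definition taken from TeX or LaTeX.
TEXT_MODE_COMMANDS = ("text", "mbox", "hbox", "textrm")

# Match a backslash, one of the command names, and then a non-letter,
# so that math commands such as \textstyle are not caught by accident.
_TEXT_MODE_RE = re.compile(
    r"\\(?:%s)(?![A-Za-z])" % "|".join(TEXT_MODE_COMMANDS)
)


def drop_if_text_mode(expr: str) -> Optional[str]:
    """Option 1: ignore the entire expression if it enters text mode."""
    return None if _TEXT_MODE_RE.search(expr) else expr


def truncate_at_text_mode(expr: str) -> str:
    """Option 2: keep only the part before the first text-mode command."""
    match = _TEXT_MODE_RE.search(expr)
    return expr if match is None else expr[: match.start()]
```

Either function would run before the expression is handed to a TeX math
parser; which of the two behaviours is preferable is left open here.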


>> ...and MathML content is inherently math, and hence, doesn't need
>> role='math'.
>
>
> It may not need it, but what is wrong with using it?

If implementations are expected to do different things on role='math' if  
the element is a MathML element, then that complicates the implementation,  
and hence, has a non-zero cost. What's the benefit?


> Ideally, the browser would do the appropriate mapping for <math>, but IE
> doesn't do it and if I read what Aaron said earlier, Firefox doesn't do
> that either.

Doing the appropriate mapping for <math> is, I would presume, simpler to  
implement than special-casing role='math' on <math>, and moreover, there  
is MathML content out there that doesn't have role='math', so doing the  
former would have greater benefit for current content on the Web than  
doing the latter.


>> I haven't seen it proposed before that authors should label which format
>> they use for their role='math' expression. Even though that would make
>> it possible to disambiguate different expression syntaxes, it's not the
>> simplest solution to the problem. The simplest solution I can come up
>> with is to say that the expression must be encoded as and parsed as LaTeX
>> (implying leading and trailing $ or $$).
>
>
> That shuts out other formats, and other TeX macro packages.

Yes, that's the point. :-)

role='math' is, AIUI, intended to solve one simple problem: MathML is not
feasible for most authors today, and the alternative (<img src=formula.gif
alt=expression>) is not very accessible. Saying that an <img> is "math"
and how to interpret its alt text makes it accessible at a low cost for
both authors and implementors. (To this end, we could define it to only
work on <img>, but it would be nice if it worked with <object> and other
elements as well.)


> Perhaps that is
> OK with others, but I'm a little concerned.

Why?


> Also, one needs to be careful when you say it is "LaTeX".  As I mentioned
> earlier, math mode in LaTeX is a small part of LaTeX, which itself
> includes all of TeX, so LaTeX itself can (and is) extensible.  To be truly
> useful, one needs to define precisely what is allowable for "LaTeX" inside
> an element with role="math".

Indeed.


> You might want to look at something like "blahtex"
> (http://www.mediawiki.org/wiki/Extension:Blahtex).  This is basically a
> non-extensible list of what is allowed that corresponds to common math
> usage in LaTeX.

Awesome. This is exactly what I'm looking for.


>> >> But how do you implement it? Should the UA autodetect whether it's TeX
>> >> or LaTeX or something else? How are authors supposed to know what to
>> >> write? How do we achieve interoperability? What's the advantage of
>> >> leaving it open-ended?
>> >>
>> >> > See the thread I started called 'New role="math" in ARIA, how to
>> >> > author and how browser would expose it'
>> >> > In that thread we're discussing some of the remaining issues, and
>> >> > you can see the current definition.
>> >>
>> >> The current definition doesn't seem to handle:
>> >>
>> >>    <object role="math" data="foo">a^2+b^2=c^2</object>
>> >>
>> >> Also, when would it be better to have the expression in another
>> >> element than as text in the element itself (i.e. when is labelledby
>> >> needed for role=math)?
>> >>
>> >> Finally, I don't know (La)TeX very well, but shouldn't $ or $$ be
>> >> implied around the expression?
>>
>> (I'm still not sure about the answers to these questions.)
>
>
> It could be implied if you say that only TeX is allowed.  Alternatively,
> you could *define* that $...$ means it is some TeX-like syntax.
>
> I'm a little reluctant to force a particular syntax on everyone,
> especially one that is as fluid as TeX is.  On the other hand, limiting
> math to either some TeX-like syntax or MathML does make life simpler for
> implementors.

Right. I think that for the first version we should make it as simple as  
possible for implementors so that we can achieve interoperability from the  
start. If we find that we need to extend the format or make it extensible  
in the future, then there will still be room to do so.


Here's a new proposal for a wording of spec text:

    The 'math' role represents a mathematical expression. If the element is
    an HTML "img" element, then the "alt" attribute's value represents the
    expression; otherwise the element's textContent DOM attribute
    represents the expression. [HTML] [DOMCORE] The expression must be in a
    format that would be valid TeX if '$' were prepended and appended to the
    expression. User agents must first act as if they truncated the
    expression at the first '$' character, if there is one, and then
    prepend and append '$' characters to the expression, and then parse the
    result as TeX. [TEX]
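
As a rough sketch of the processing this wording describes (not a real
DOM API): the Element class below is a hypothetical stand-in for a DOM
element, and math_expression_source is a made-up name. The final "parse
the result as TeX" step is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element (illustration only)."""
    tag: str
    text_content: str = ""
    alt: Optional[str] = None
    role: Optional[str] = None


def math_expression_source(el: Element) -> Optional[str]:
    """Return the '$'-wrapped string a user agent would parse as TeX,
    or None if the element does not have role='math'."""
    if el.role != "math":
        return None
    # For <img>, the alt attribute represents the expression;
    # otherwise the element's textContent does.
    if el.tag.lower() == "img":
        raw = el.alt or ""
    else:
        raw = el.text_content
    # Truncate at the first '$' character, if there is one.
    truncated = raw.split("$", 1)[0]
    # Prepend and append '$' so the result is parsed in math mode.
    return "$" + truncated + "$"


img = Element(tag="img", alt="a^2+b^2=c^2", role="math")
obj = Element(tag="object", text_content="a^2+b^2=c^2", role="math")
print(math_expression_source(img))  # $a^2+b^2=c^2$
print(math_expression_source(obj))  # $a^2+b^2=c^2$
```

Note that under this wording an expression that itself starts with '$'
is truncated to the empty string before the '$' characters are added.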


Cheers,
-- 
Simon Pieters
Opera Software
Received on Thursday, 13 March 2008 09:29:51 UTC