Re: sources and outputs from Deyan Ginev on 2021-07-14 (www-math@w3.org from July 2021)

From: Deyan Ginev <deyan.ginev@gmail.com>
Date: Wed, 14 Jul 2021 10:45:11 -0400
To: David Farmer <farmer@aimath.org>
Cc: "www-math@w3.org" <www-math@w3.org>
Message-ID: <CANjPgh8b-cN0VVwhcnKB2p-T9BEOwa75PXpzKVzUFPGufe7J9A@mail.gmail.com>

Hi David,

If I left the impression that I expect the Intent specification to
automatically remediate arXiv, I apologize; that thought never crossed
my mind. In fact, I first want the opposite: a simple presentational
baseline, so that the existing MathML for arXiv can produce AT
(assistive technology) readouts that are not conceptually misleading
but simply convey what was inked on the page. Then I can build on that
with the partial efforts you describe in 3).

I also want the Intent specification to offer the capability of
annotating/remediating the full breadth of mathematical notations
people use today. To that end, using arXiv is quite handy in fishing
out examples to illustrate how rich and diverse the notations have
become in academic practice.

The discussion in the other thread about the role of Unicode still
needs to be settled if we are to "focus on specifying the right way to
encode the intent, for those cases where complete information is
available to the system doing the encoding". The following two
recommendations are not the same:
 - if your colon means ratio, annotate with intent="ratio"
 - if your colon means ratio, use the Unicode U+2236
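
To make the difference concrete, here is a minimal sketch (Python,
standard library only) that serializes the same ratio a:b both ways.
The placement of the intent attribute on the <mo> element and the two
helper names are assumptions made for illustration; nothing in this
thread fixes either of them.

```python
import xml.etree.ElementTree as ET

RATIO = "\u2236"  # U+2236 RATIO, visually close to the ASCII colon


def ratio_with_intent(a: str, b: str) -> ET.Element:
    """Keep the ASCII colon and annotate it (hypothetical placement on <mo>)."""
    row = ET.Element("mrow")
    ET.SubElement(row, "mi").text = a
    op = ET.SubElement(row, "mo", {"intent": "ratio"})
    op.text = ":"
    ET.SubElement(row, "mi").text = b
    return row


def ratio_with_unicode(a: str, b: str) -> ET.Element:
    """Carry the meaning in the character itself, with no annotation."""
    row = ET.Element("mrow")
    ET.SubElement(row, "mi").text = a
    ET.SubElement(row, "mo").text = RATIO
    ET.SubElement(row, "mi").text = b
    return row


# Serialize both variants so they can be compared side by side.
print(ET.tostring(ratio_with_intent("a", "b"), encoding="unicode"))
print(ET.tostring(ratio_with_unicode("a", "b"), encoding="unicode"))
```

In the first variant a consumer has to read the attribute to learn the
meaning; in the second it has to recognize the character. Which of the
two the specification should recommend is exactly the open question.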

As to "default remediation", possibly targeted at K-14 educational
materials, that was something Neil, Sam and Charles were interested
in, and we have yet to investigate it properly. I'm skeptical that
subject areas are the right tool to use there, mostly because I can
fish out counterexamples, already within K-14, that are hard to pin
down to a single subject.

Greetings,
Deyan



On Wed, Jul 14, 2021 at 9:50 AM David Farmer <farmer@aimath.org> wrote:
>
>
> The current braille thread has mentioned the issue of how
> one interprets ambiguous markup.  I am starting this thread to
> ask to what extent ambiguous markup is within our scope.
>
> How will MathML come to us?  I see three ways:
>
> 1) A suitable authoring markup language, or a suitable editing
> environment, will produce content which encodes the meaning of
> the source mathematics.  That source will be transformed
> (somehow, and in a way we don't have to worry about)
> into MathML with intent information, following the rules which we
> will eventually agree on.
>
> Assistive technology, once it is made aware of the new rules,
> will be able to perfectly pronounce the resulting MathML.
>
> I think this is the most important case to support.  Our job is
> to describe rules for the end product:  MathML with intent information.
>
> ------
>
> The other extreme is:
>
> 2) Legacy material existing in the wild, and new material created
> with no thought about intent.  For example, Deyan's millions of equations
> in arXiv.
>
> This is a problem which has been around for a long time.  People like
> Neil S. have heuristics which do well in many common cases.
>
> I don't see that this is our problem.  As individuals many of us want
> to do something about this case, but why is it our job to say what to do
> with every bit of MathML that ignores the rules we are going to devise?
>
> -----
>
> And there is a broad middle case:
>
> 3) Legacy material, or new material, which is ambiguous but which could
> be improved by a small amount of editing.  This could involve adding
> "topic" information, such as "multivariable calculus".  (I don't think
> it matters whether the editing is by a human, or by a machine
> implementing the heuristics from 2) above.)
>
> Such efforts would decrease the number of ambiguous cases, but not
> eliminate them.
>
> I could go either way on whether this is our problem.  It would be
> helpful to provide some general principles.  But I see a difficulty
> avoiding the slippery slope of codifying all the heuristics of 2).
> And even if we did that, we know that misinterpretations would
> still be common.
>
> -----
>
> So my question is:  should we just focus on specifying the right
> way to encode the intent, for those cases where complete information
> is available to the system doing the encoding?
>
> And if we decide that we should also offer some advice on how to deal
> with legacy/ambiguous MathML, how far should we go?
>
> Regards,
>
> David Farmer
>
Received on Wednesday, 14 July 2021 14:45:50 UTC