Re: YASP (yet another semantics proposal) from Neil Soiffer on 2020-06-19 (public-mathml4@w3.org from June 2020)

From: Neil Soiffer <soiffer@alum.mit.edu>
Date: Fri, 19 Jun 2020 15:24:55 -0700
To: Moritz Schubotz <schubotz@uni-wuppertal.de>
Cc: public-mathml4@w3.org
Message-ID: <CAESRWkDKga-GeQ-Ym-+yx=oCOQycEnnTZ+ok63bBYY8_oo5fvg@mail.gmail.com>
Moritz,

Thank you for your feedback. The proposal is very focused on the details of
how semantics would be added to Presentation MathML as that has been our
focus in the general meeting for the last few weeks. It was not meant to
address the broader picture of why, or how this fits into the ecosystem,
etc. I've added a little more at the start to hopefully clarify that. The
end of the proposal does refer to the issue of "naming" the functions as a
significant challenge that lies ahead. At a minimum, it would include those
function names defined by Content MathML, but *I* hope that we go much
further in naming. The predefined names and what they mean would likely be
released as a non-normative note that could be revised (added to) from
time to time. However, that is just my opinion; they may instead end up as
part of a spec or be handled some other way.

We have not discussed the format/content of a 'known names' document. It
might be just a prose definition for each name, as it is for (pragmatic)
Content MathML. At one point, I proposed using a Wikidata link (similar to
what you reference) as a source of definition of semantic markup in the
Presentation MathML, but people on the call didn't like that. Perhaps it
will be used for a definition of what the names mean in a 'known names'
document. Other than producing a much shorter document, it ultimately
wouldn't be any more precise because most Wikidata links are just prose
definitions.

A primary reason for adding semantics to Presentation MathML is that
Content MathML has not caught on, and mixed/parallel markup even less so.
The latter is extremely verbose and very fragile in the sense that making a
change in the presentation or content invalidates the other part. Also,
because ids are used, you can't copy the markup and use it elsewhere in a
document. My proposal and most of the other proposals have tried to be
somewhat hand-authoring friendly so that publishers and others can
remediate documents that are ambiguous and end up getting read
inappropriately by screen readers; mixed markup fails miserably in that
sense. The proposal is also (hopefully) reasonably easy for other software
to generate.

The hope is that MathML creation tools will output semantic MathML when
possible. For TeX translators, this means taking advantage of whatever
semantics there are in the markup, as SRE does, but with standardized
definitions so that different software is all on the same page. David Farmer
has done a fair amount of work with PreTeXt along these lines and written
up macros used in different subject area textbooks
<https://docs.google.com/document/d/1cZnff5_fi_ucNyZ1ex2msmJLE55FAZD-QInkLYe8xiE/edit>.
I believe that Bruce and David have done some experiments with PreTeXt and
LaTeXML. Also, there is nothing in what we are doing that blocks DLMF from
continuing to link to their own definitions of functions; other software
however would not likely make use of those links.

If I failed to address any of your questions or comments, please let me
know.

    Neil


On Fri, Jun 19, 2020 at 9:34 AM Moritz Schubotz <schubotz@uni-wuppertal.de>
wrote:

> Hi Neil,
>
> I am still not sure that I understand what the idea of this
> proposal is. Who should use it, and for which purpose?
>
> Bruce and the DLMF team can express semantics using a concise (and for
> mathematicians relatively easy to understand) set of LaTeX macros.
> For example,
>
> https://dlmf.nist.gov/5.2#E1
>
> has additional semantics for the components. I think it would be great
> if one could use similar mechanisms to achieve that on other platforms
> such as zbMATH or Wikipedia and many other places.
> We could now say that everyone shall use LaTeXML and the semantic LaTeX
> macros. But this is not the idea of standardization efforts. Instead,
> I would imagine a common notation for this kind of semantics.
>
> I am currently aware of three approaches for semantic annotations.
> 1) Use semantic LaTeX as the DLMF does
> 2) Use RDF triples in addition to the LaTeX code as Wikipedia
> currently does
> https://en.wikipedia.org/w/index.php?title=Special:MathWikibase&qid=Q1899432
> 3) Generate semantics from the standard LaTeX as the speech rule
> engine does https://github.com/zorkow/speech-rule-engine
>
> All three can in theory capture that in your first example, $A^T$, A is
> a matrix and T is the transpose. However, what does transpose mean? For
> the DLMF it is https://dlmf.nist.gov/front/introduction#common.p2.t1.r8
> and for Wikipedia it would be https://www.wikidata.org/wiki/Q223683 .
> Both definitions could be modeled using contributed/unofficial content
> dictionaries http://ceur-ws.org/Vol-2307/paper52.pdf and
> http://ceur-ws.org/Vol-2307/paper51.pdf .
>
> Obviously, the generation of the correct Content MathML tree requires
> at least proper definitions for all the symbols involved. In the end,
> I think that in addition to the definition list, we want to have a
> semantic tree as visualized in
>
> http://vmext.wmflabs.org/ast-renderer.html
>
> Generating this tree from only LaTeX and the Wikidata information
> requires additional information on the operators and symbols involved
> to disambiguate different possible content trees. Here LaTeXML's model
> is a bit more straightforward, but still, the generation of correct
> Content MathML output (which would be required for the visualization)
> is, as far as I know, not yet fully implemented.
>
> This is my view on semantics. Given that background, I do not understand
> how
>
> transpose(@matrix)
>
> would be helpful to understand that T denotes transpose. Is there some
> internal library, such as the Math glossary from Abdou Youssef
> https://doi.org/10.1007/978-3-319-62075-6_25 , that defines that
> transpose is associated with T?
> At least this link would be useful for the semantic tree
> visualization and interactive screen readers such as ChromeVox.
> Or am I completely on the wrong track and your semantic annotations
> are not related to the things I was talking about?
>
> All the best
> Moritz
>
>
>
>
>
> http://moritzschubotz.de | +49 1578 047 1397
>
>
>
> On Fri, Jun 19, 2020 at 7:59 AM Neil Soiffer <soiffer@alum.mit.edu> wrote:
> >
> > Based on the call today, I've come up with a different proposal from
> > Bruce's semantics proposal that may or may not be the idea that Deyan was
> > thinking of. I wrote up something that mostly parallels what Bruce wrote
> > and copied most of the examples he had and marked them up with this new
> > proposal.
> >
> > It is at:
> > https://mathml-refresh.github.io/mathml/docs/function-semantics
> >
> > Feedback is welcome. I'm sure there are plenty of typos and maybe some
> > things that aren't clear; hopefully it is understandable. We'll go over
> > this next week along with any other proposals people come up with.
> >
> >    Neil
> >
> >
>
Received on Friday, 19 June 2020 22:25:19 UTC