Re: a minimal core intent proposal from Neil Soiffer on 2022-11-10 (www-math@w3.org from November 2022)

From: Neil Soiffer <soiffer@alum.mit.edu>
Date: Wed, 9 Nov 2022 23:04:52 -0800
To: Bruce R Miller <bruce.miller@nist.gov>, Deyan Ginev <deyan.ginev@gmail.com>
Cc: www-math@w3.org
Message-ID: <CAESRWkBn7vebjA5p02iNQmo+1v3d7m7WbCPLrkcwApk5ajE==Q@mail.gmail.com>
Thanks to both of you for your comments (and I hope to hear from some
others in the group...). This needs to be a group effort.

The proposal definitely needs work, and certainly the names can be
improved. Like you, I think a key for the choice of a name is that it
should be speakable/understandable if the AT just speaks it.

I guess I wasn't clear on the defaults. I did not propose
"power-with-subscript" or any similar name. My proposal applies the msubsup
*element* if no intent is given. One thing I had not considered but I think
merits discussion is allowing the use of the element names as an intent
name. For example, intent="msubsup($base, $sub, $sup)". If an element name
is used, then AT should do exactly what it would do if it encountered the
element and it had no intent. I like this because it gives a way for
authors to use some alternative notation but say it really is like this
other notation. However, this is an idea in search of a useful example.

As to msubsup and largeops, that's a case I missed and I think a special
case for them should be given as is done for munderover. I have updated the
doc with that.

I don't think I follow your (Deyan) ideas about opting in or out of some
defaults. I completely understand that a lot of superscripts in arXiv are
not powers. We can tweak the defaults so that power is not used for certain
groups of characters for superscripts (the defaults already are tweaked to
include pseudo-scripts), but the point of the defaults is that it says
something right in common cases for the math that probably occurs 99% of
the time on the web (and ebooks, etc) and if the default doesn't match, use
intent. Maybe for arXiv, you will almost always put out
intent="superscript@infix($base, $super)" because you are dealing with
higher level math. I don't see a problem with that. Anyway, something to
talk about at the meeting.

Bruce wrote:

> BUT, with my LaTeXML hat on, where I take much abuse for sticking
> InvisibleTimes between things
> that aren't actually multiplied, I very often don't know whether a given
> superscript is a power or what it is.
> So, should I use an intent="superscript" ?
>
> I'd be more inclined to have default speech being more literal,
> meaning-agnostic, so that msup
> without intent would be spoken as "x superscript y" (or whatever the
> preference is).
> Of course, there is still room (and need) for some kind of domain hints.
>

David F probably has the most experience here, but I don't think it is
likely TeX authors want to type \power{x}{2} when they can type x^2. It
seems to me much more likely that some special notation with a superscript
has some name to it and a macro to generate it, and that is where they will
say "superscript" if that's what they want, but I suspect they will want to
voice it without superscript.

A goal I have is "keep it simple". In particular, that means simple things
should be simple to generate.

> With my DLMF hat on, where there're lots of intervals, it pains me to
> think of
>    intent = "open-interval@silent(_open_interval_from,$a,_to,$b)"
> on *every* interval.  This leads me to wonder if some sort of "Intent
> Speech Rules"
> could be feasible.  An author (or publisher) might define a set of rules
> like:
>     open-interval($a,$b) ==> open-interval@silent
> (_open_interval_from,$a,_to,$b)
> to customize speech patterns
>

That's adding a lot of complexity to intent. It seems to me (and it is easy
for me to say this since I'm not writing TeX-to-MathML translators that use
intent), that once you give an author a way to say "speak open intervals
this way", then there is no problem including that everywhere they are
used. My proposal is to make open-interval, etc., be part of core because
it doesn't fit into what we have defined for intent. Hence, if you don't
want LaTeXML to generate
  intent = "open-interval@silent(_open_interval_from,$a,_to,$b)"
and instead want it to just generate
  intent = "open-interval($a,$b)"
then we should leave the interval notation (and anything else that doesn't
fit the intent patterns we have/will define) in core.  I mentioned the
open-interval@silent option because it is a way to minimize the number of
notations that we say AT needs to know about.
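
To make the trade-off concrete, here is a minimal sketch of what the "Intent Speech Rules" expansion could look like if it ran in the generator or in AT. It is only an illustration: the RULES table, its format, and the expand_intent name are hypothetical and not part of any proposal or tool, and it handles only flat argument lists.

```python
import re

# Hypothetical rule table (an assumption, not a proposed format):
# short intent head -> (expanded head, argument template).
# The one rule shown is the open-interval example quoted above.
RULES = {
    "open-interval": (
        "open-interval@silent",
        ["_open_interval_from", "{0}", "_to", "{1}"],
    ),
}


def expand_intent(intent: str) -> str:
    """Expand head(arg, ...) via RULES; return the input unchanged otherwise.

    Only flat, comma-separated arguments are supported (no nested calls).
    """
    match = re.fullmatch(r"\s*([A-Za-z][\w-]*)\((.*)\)\s*", intent)
    if match is None:
        return intent  # not of the form head(args), e.g. already has @hint
    head, arg_text = match.groups()
    rule = RULES.get(head)
    if rule is None:
        return intent  # no rule for this head
    args = [a.strip() for a in arg_text.split(",")]
    new_head, template = rule
    try:
        parts = [piece.format(*args) for piece in template]
    except IndexError:
        return intent  # fewer arguments than the rule expects
    return f"{new_head}({','.join(parts)})"


print(expand_intent("open-interval($a,$b)"))
```

With a table like this, a generator would emit only the short form and the long form would be derived mechanically; putting open-interval in core instead makes the table unnecessary.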

I agree with the concern about "unit" and intent. That is probably better
handled with the (not yet defined) "isa" attribute.

I had one other thought: I think we should add a "uses-intent" (name to be
figured out) attribute to <math>. Right now, if "intent" is used, the author
has specified the speech. But if intent is not given, then (with my
proposal) the defaults should be used. But AT doesn't know if the MathML
was generated by an intent-aware tool or not. By setting "uses-intent"
(default "false"), the author is declaring that AT should use the defaults.
If "uses-intent" isn't present (legacy and non-semantic authoring tools),
then AT would be free to use heuristics to infer intent. A simple example is
x^T. MathCAT will infer that "transpose" is meant. If uses-intent is
true/present, then it should be "power". If the author had meant
"transpose" (or something else), they would have indicated that via intent.
"uses-intent" becomes a way to distinguish between intent-aware generation
and legacy generation.
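
The three cases above can be sketched as a small decision function. This is an illustration under stated assumptions, not MathCAT's actual logic: choose_reading and its parameters are hypothetical names, the only heuristic shown is the x^T example, and falling back to "power" for other legacy superscripts is an assumption.

```python
from typing import Optional


def choose_reading(script: str, intent: Optional[str], uses_intent: bool) -> str:
    """Pick a reading for an msup from its superscript text and attributes."""
    if intent is not None:
        # The author specified the speech; nothing to infer.
        return intent
    if uses_intent:
        # Intent-aware generator and no intent given: use the default.
        return "power"
    # Legacy or non-semantic content: AT is free to use heuristics.
    # Only the x^T example from the text is modeled here.
    if script == "T":
        return "transpose"
    return "power"  # assumed fallback for this sketch


print(choose_reading("T", None, uses_intent=False))  # legacy x^T
print(choose_reading("T", None, uses_intent=True))   # intent-aware x^T
```

The point of the flag is visible in the second branch: with uses-intent set, AT stops guessing and an unmarked x^T is read with the default.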

Lots to talk about at the meeting.

   Neil


On Wed, Nov 9, 2022 at 1:00 PM Bruce R Miller <bruce.miller@nist.gov> wrote:

> Let me second Deyan's thanks for a nice draft for discussion.
>
> If I understood correctly, you're wanting to assume that msup (and other
> superscripts)
> are powers, by default; if they aren't the author must override the
> interpretation using intent.
> That certainly covers a large use-case concisely...
>
> BUT, with my LaTeXML hat on, where I take much abuse for sticking
> InvisibleTimes between things
> that aren't actually multiplied, I very often don't know whether a given
> superscript is a power or what it is.
> So, should I use an intent="superscript" ?
>
> I'd be more inclined to have default speech being more literal,
> meaning-agnostic, so that msup
> without intent would be spoken as "x superscript y" (or whatever the
> preference is).
> Of course, there is still room (and need) for some kind of domain hints.
>
> With my DLMF hat on, where there're lots of intervals, it pains me to
> think of
>    intent = "open-interval@silent(_open_interval_from,$a,_to,$b)"
> on *every* interval.  This leads me to wonder if some sort of "Intent
> Speech Rules"
> could be feasible.  An author (or publisher) might define a set of rules
> like:
>     open-interval($a,$b) ==> open-interval@silent
> (_open_interval_from,$a,_to,$b)
> to customize speech patterns?
>
> And finally, I also worry about stretching the intent syntax too far;
> @hint is already troubling
> enough (though I like it).  It's not quite clear how intent="unit" should
> work.
> Perhaps
> <mrow intent="unit($n,$unit)">
>     <mn arg="n">3</mn>
>     <mi arg="unit">cm</mi>
> </mrow>
> is more workable ?
> Alternatively, I'm liking Deyan's ISA proposal for things like units,
> currency.
>
> Other than those things, I like it :>
> bruce
>
> On 11/9/22 12:54, Neil Soiffer wrote:
> > I wrote a proposal
> > <https://w3c.github.io/mathml-docs/minimal-intent-core>
> > for simplifying what goes into intent core. It ended up being sort of an
> > "AT requirements" document for core. If I extend it a little further to
> > include what AT should do with "intent" (currently just presumed everyone
> > knows), it would be the basis for an actual AT requirements document (or
> > appendix). It also serves to let authors/authoring software know what they
> > can count on as default behavior by AT.
> >
> > The proposal contains some open questions, but I believe it is fleshed
> > out enough that it is understandable and actionable (let's do this/don't do
> > this). It extends what I put in Deyan's intent spreadsheet and also has
> > explanations. It will be the basis for the third agenda item on Thursday.
> >
> >      Neil
> >
>
>
> --
> --
> bruce.miller@nist.gov
> http://math.nist.gov/~BMiller/
>
>
Received on Thursday, 10 November 2022 07:05:15 UTC