Minutes: MathML meeting 30, Nov 2023 from Neil Soiffer on 2023-12-06 (www-math@w3.org from December 2023)

From: Neil Soiffer <soiffer@alum.mit.edu>
Date: Wed, 6 Dec 2023 13:33:46 -0800
To: "www-math@w3.org" <www-math@w3.org>
Message-ID: <CAESRWkBTn0T1vMUq93bA42PjVEU1ftz2rtFWSPc4f4uQLKSu9A@mail.gmail.com>

Attendees:

- Neil Soiffer
- Louis Maher
- Moritz Schubotz
- David Carlisle
- Paul Libbrecht
- Deyan Ginev
- Bruce Miller
- Murray Sargent
- Cary Supalo
- Bert Bos

Regrets

Agenda

1. Announcements/Updates/Progress reports

NS: BK is going to try and push through to a CR for core. We did get an
update on core and the working draft so it's not a year old.

MoS: We rolled out native rendering as an option on the German
Wikipedia, and we see issues with the Chrome rendering even though we have
the same output that MathJax also produces. Is there a list of known bugs
with Chrome which we could link to? People are unclear whether it is a bug
in the browser or in our implementation.

From Deyan Ginev to Everyone: I think the issues are here:
https://bugs.chromium.org/p/chromium/issues/list?sort=status&q=component%3ABlink%3EMathML&can=2

NS asked the group to file any bugs we find in the bug tracker.

DC: A list of math fonts can be found at:
https://fred-wang.github.io/MathFonts/mozilla_mathml_test/

2. Intent PR about arity (max 5 minutes) -- updates

PL: I was skimming through the rest of DG's list of intents, and I saw
several notations where the pronunciation sometimes goes one way and
sometimes the other way for the same intent.

NS: The arguments have a defined order, and the AT needs to understand
this order, but it may not be the order in which the arguments are
spoken.

NS: The speech is determined by the AT; the concept is the important part
to know. I gave the example of a simple fraction, in which you would
typically have the numerator as the first argument and the denominator as
the second argument, and you would say "A over B". But in Asian countries,
they would do it the opposite way: they'd say "B under A". They still agree
that the first argument is the numerator and the second argument is the
denominator.
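
A minimal sketch of the point above, assuming hypothetical names
(`SPEECH_TEMPLATES`, `speak`) and invented speech styles that are not part
of any spec or AT: the intent fixes the argument order, and the AT picks
the spoken order.

```python
# Hypothetical speech templates keyed by (concept, speech style).
# {0} is always the first intent argument (the numerator) and
# {1} is always the second (the denominator).
SPEECH_TEMPLATES = {
    ("fraction", "numerator-first"): "{0} over {1}",
    ("fraction", "denominator-first"): "{1} under {0}",
}


def speak(concept, args, style):
    """Render speech for a concept; argument order is fixed, spoken order is not."""
    template = SPEECH_TEMPLATES[(concept, style)]
    return template.format(*args)


# The same arguments, in the same order, spoken two ways.
print(speak("fraction", ["A", "B"], "numerator-first"))    # A over B
print(speak("fraction", ["A", "B"], "denominator-first"))  # B under A
```

Keeping the reordering in the template rather than in the argument list is
what lets both styles agree on which argument is the numerator.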

NS said DG was willing to close this issue.

NS wanted to put some text at the beginning of the concept table saying
that the pronunciation will follow certain rules, and there would be a flag
indicating that this particular intent was an exception to the
pronunciation rules. Once this text was agreed upon, this issue would be
closed.

BM: You can only have exceptions to rules if you have rules, and we do not
have standards for pronunciation order.

NS: The pronunciation order should always be described. Perhaps we should
also describe other orders of pronunciation.

PL: Perhaps we cannot agree on the text to go into the concept table yet. I
should just put in a comment in the issue.

PL: Do we intend to deliver the concept list with the math model and
examples? Yes.

DC: It basically is delivered. I mean it's at the public URL that the spec
references.

NS: We may make a note on this issue. A note will be more prominent than a
document.

3. Other issues?

No Arity intents? <https://github.com/w3c/mathml/issues/480>

NS discussed his implementation design that separates out speech for
Unicode chars from speech for notations.

NS: What I've found is that my Unicode implementation needs to be more
complete than my notation rules. That's because the notation rules fall
back to reading the underlying syntax or, in bad cases, capture more than
they should. An example of the latter is saying "power" when something is
just a superscript. Even when misreading something as a power, it can be
understood. For example, "x to the star power" will make someone stop and
think "what!?". But they will understand and move on. However, not having a
name for a Unicode character makes the speech extremely hard to understand.
For example, "Unicode 2 5 a b, A B C" is next to useless. On the other
hand, in practice, very few characters are used up through Calculus. In a
paper
<https://dl.acm.org/doi/abs/10.1145/3192714.3192835?casa_token=isHkl4Yi0V4AAAAA:8FDd6uYuRY-sW05w5AFoQkH8csvNQW26PH_llaEKliMzZR2UJKvhGWXiTuq90F5JHzWdyg9tuDDqyQ>
I wrote, I found that 50 non-keyboard characters (non-ASCII is a good
approximation) covered 99.95% of the characters that were used. Still,
encountering one of those 0.05% characters is a very poor experience. In
the absence of more analysis, I would encourage a more complete listing of
characters to include in core rather than a more restrictive listing. Both
creating the list and implementing the speech are much simpler than
documenting and implementing a notation that should be handled. It also
means authoring tools can mostly worry about intents on characters when
they have a special way of being spoken or are highly ambiguous (| comes to
mind). My guess is that not counting alphabets and script variants, 200 -
300 characters would be sufficient to cover >99.999% of all characters
encountered.
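
A rough sketch of the character-level fallback described above, assuming a
hypothetical and deliberately tiny table `CHAR_SPEECH` (a real list would
hold the few hundred characters mentioned); `speak_char` and `coverage` are
illustrative names, not taken from any implementation.

```python
# Hypothetical table of spoken names for non-ASCII characters.
CHAR_SPEECH = {
    "\u00d7": "times",                  # multiplication sign
    "\u2264": "less than or equal to",
    "\u221e": "infinity",
}


def speak_char(ch):
    """Return speech for one character, spelling out the code point if unknown."""
    if ch in CHAR_SPEECH:
        return CHAR_SPEECH[ch]
    if ch.isascii():
        return ch  # keyboard characters are passed through unchanged
    # Fallback of last resort: read the hex digits one by one.
    return "Unicode " + " ".join(format(ord(ch), "x"))


def coverage(text):
    """Fraction of the non-ASCII characters in text that have a spoken name."""
    non_ascii = [c for c in text if not c.isascii()]
    if not non_ascii:
        return 1.0
    return sum(c in CHAR_SPEECH for c in non_ascii) / len(non_ascii)


print(speak_char("\u00d7"))  # times
print(speak_char("\u25ab"))  # Unicode 2 5 a b
```

Running `coverage` over a corpus is one way to check how many characters a
candidate list would leave to the spelled-out fallback.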

MuS: We got a bug where the times sign was not in the list, and it was
skipped. It is good to have a list.

DG: I think they have to be separate because that's not actually intent,
right? These are just Unicode entries, and intent is a different animal
altogether: something like "second", which would be on the small letter s.
That's a Latin s, but it's a different intent.
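
To illustrate the distinction, here is a small sketch under stated
assumptions: `speak_leaf` and `CHAR_SPEECH` are hypothetical names, and the
intent value is treated as a plain word, which ignores the fuller intent
syntax (arguments, properties) in the spec draft. An author-supplied intent
wins; otherwise the character list is consulted.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-character speech list (the "Unicode entries").
CHAR_SPEECH = {"\u00d7": "times"}


def speak_leaf(xml_fragment):
    """Speak a single MathML token element such as <mi> or <mo>."""
    element = ET.fromstring(xml_fragment)
    intent = element.get("intent")
    if intent is not None:
        # The author stated the concept; the character itself is not read.
        return intent
    text = element.text or ""
    # No intent: fall back to speaking each character from the list.
    return " ".join(CHAR_SPEECH.get(ch, ch) for ch in text)


print(speak_leaf('<mi intent="second">s</mi>'))  # second
print(speak_leaf("<mi>s</mi>"))                  # s
print(speak_leaf("<mo>\u00d7</mo>"))             # times
```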

PL suggested having one list, separated into themes: the single list
would be divided into mathematical categories.

NS: David did a clever thing where you can sort in different ways, and maybe
something like that would work for the concept lists.

DC: We should keep them separate. The intents are mathematical concepts
while the Unicode characters can be quite visual.

NS: As an implementer it is good to have a list and to say that there are
look-alike characters that can be pronounced differently.

DG: In MathML 3, you did not have intents, you had Unicode and you needed a
list to interpret it.

3.5 LM: At this time, MuS presented a discussion based on these links:

MuS: Please see
https://github.com/MurrayIII/UnicodeMathML/blob/main/docs/MathML%20Intent%20Attribute.pdf
for more info on my UnicodeMathML implementation with MathML
intent-attribute support. You can run UnicodeMathML by clicking on
https://murrayiii.github.io/UnicodeMathML/playground/.

LM: This discussion was based on software running in real time.

NS: Hopefully you'll pay attention, Murray, to the core concept list as we
develop it, and then you can adjust the intent values as needed or provide
feedback and say, no, this would be a better name.

NS suggested people pay attention to the Chrome bug list and try to raise
the priority of bugs of interest to us.

Received on Wednesday, 6 December 2023 21:34:03 UTC