Re: Math speech strings

Murray,

Thanks for sharing that list. The reason for asking for the list is that we
need to decide whether it is feasible to come up with a complete "core"
list of "intent" values that every generator can produce and that every AT
can then translate. If a value isn't on the list, it is unlikely to be
translated. If we can't come up with a list that covers enough basic math
that non-English-speaking people aren't forced to figure out some English
math speech, then that argues that generators should be the ones that do
the translations: the document in which the math occurs is in the target
language and contains a known set of "intents" that can be translated by
the author or translator.

We previously agreed that intent doesn't need to be given for Unicode chars
by default because they are self-voicing. Of course, some are ambiguous
and need "intent" to resolve their meaning, but most don't. Unicode chars
do need a translation, but because Unicode has defined the characters, they
are known and can be translated during development by AT. Having a list of
characters that occur in K-14 books would be a huge help to AT developers
so they know which of the tens of thousands of characters need to be
translated.

Unfortunately, the linked list is mostly Unicode chars. The non-Unicode
part is the Math Function Speech Table. Not counting the ordinals and trig
functions (which need translations but are probably not given as "intent"
values), it is about 75 entries long. Many of these entries are notational
names such as "fraction", "subscript", etc. Some describe Unicode char
names (e.g., "sans-serif"). Taking those out leaves us with probably fewer
than 60 names, at least ten of which are positional words such as "lower
limit", "base", and "radicand" which I assume are used for navigation
(again, they need translation but aren't given via "intent"). So that
leaves maybe 40 - 50 names that would be used for "intent". These include
"matrix", "overbar", and "absolute value".

This list seems WAY too short to be considered as coming close to a
comprehensive list for translation.

Murray -- this is not a criticism of you or your implementation (which was
not developed with "intent" in mind). It is a statement that I don't
believe this is remotely sufficient to form a basis for a list of "intent"
values if we want to tell AT "this is what needs to be translated" so that
a (say) French translation is as useful as the English version when someone
uses "intent" for K-14 math.

As an example of what is missing, our favorite simple example of ambiguity
that intent can resolve is "(a, b)". This can be "point", "open-interval",
or "gcd" among other things. None of those are in the list. If those aren't
in the known list of intent values, they likely can't be translated
(dynamic translation by calling Google Translate is likely too slow).
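
To make the failure mode concrete, here is a minimal sketch of how AT might
look up an intent value against a fixed core list. Everything in it is an
assumption for illustration: the table name, the function name, the
hyphenated spelling "absolute-value", and the French strings are
hypothetical and not taken from any AT or from Murray's list.

```python
# Hypothetical core list: intent value -> French speech string.
# The entries and their spellings are illustrative assumptions only.
CORE_INTENT_FR = {
    "matrix": "matrice",
    "absolute-value": "valeur absolue",
}


def speak_intent(intent, table=CORE_INTENT_FR):
    """Return (speech, translated) for an intent value.

    Values in the core table come back translated. Anything else falls
    back to the English name with hyphens turned into spaces.
    """
    if intent in table:
        return table[intent], True
    return intent.replace("-", " "), False


if __name__ == "__main__":
    for name in ("matrix", "point", "open-interval", "gcd"):
        speech, translated = speak_intent(name)
        status = "translated" if translated else "left in English"
        print(f"{name}: {speech} ({status})")
```

With a table like this, "matrix" is spoken in French, while "point",
"open-interval", and "gcd" fall through to English, which is exactly the
situation a French reader would be stuck with.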

If we feel that AT needs to do the translation, we need to get a
comprehensive "core" list of intent values. Does anyone in the group think
that is feasible?

If it is not feasible, the only alternative is to have a short list of
"core intent" values and have the spec state that "intent" values not in
that list will likely not be translated.

    Neil




On Sat, Jan 28, 2023 at 12:03 PM Murray Sargent <
murrays@exchange.microsoft.com> wrote:

> At our meeting this past Thursday, Neil asked me to provide the list of
> math speech terms used in OfficeMath
> <https://devblogs.microsoft.com/math-in-office/officemath/> speech. Here
> it is: Math Speech Strings and Localization - Math in Office
> (microsoft.com)
> <https://devblogs.microsoft.com/math-in-office/math-speech-strings-and-localization/>.
> The list isn’t exhaustive, but it includes the most common words in English
> math speech, and the discussion explains how the list is localized into ~18
> languages. At some point, I’d like to add more terms notably those that we
> identify with a core MathML intent attribute.
>
>
>
> Thanks,
>
> Murray
>

Received on Monday, 30 January 2023 05:03:54 UTC