Minutes: MathML Full meeting, 20 July, 2023

 Attendees:

   - Neil Soiffer
   - Louis Maher
   - David Carlisle
   - Moritz Schubotz
   - Johannes Stegmüller
   - Bert Bos
   - Deyan Ginev
   - Steve Noble
   - Bruce Miller
   - Sam Dooley

Regrets
Agenda
1. Announcements/Updates/Progress reports

JoS: Wikidata annotations in Wikipedia for math formulas, used to generate
intent attributes for accessibility (Johannes Stegmüller)

NS: There are elements in MathCAT that might help JoS with his work. NS and
JoS will discuss this after today's meeting.

From Johannes Stegmüller (FIZ-Karlsruhe) to Everyone:
https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Mathematics

Summary of the talk by JoS:

Wikipedia source can annotate a formula with a QID that links it to a
formula page. For example, see
https://en.wikipedia.org/wiki/Schr%C3%B6dinger_equation
In the edit source of that page, a math formula can be found with
"math qid=". Clicking the formula leads to a Wikipedia Special Page:
https://en.wikipedia.org/w/index.php?title=Special:MathWikibase&qid=Q165498

This page is composed from the "in defining formula" properties of the
corresponding Wikidata entry: https://www.wikidata.org/wiki/Q165498

There is currently a discussion with the en.wikipedia authors about this
(Formula Wikidata Association); they are against annotating formulas from
Wikidata:
https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Mathematics

With a prototype Chromium extension, we plan to fetch the "in defining
formula" information from an annotated wiki page and create intents using
a mapping in the extension's source code.
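
A minimal sketch of what such a mapping step might look like, written in
Python for brevity rather than in the JavaScript an actual Chromium
extension would use. Everything here is an assumption made for
illustration: the function name, the placeholder QIDs, the intent names,
and the sample MathML are hypothetical and are not taken from the
extension or from Wikidata.

```python
import xml.etree.ElementTree as ET

# Hypothetical table from Wikidata QIDs to intent concept names.
# The keys are placeholders, not real QIDs; a real table would be built
# from the "in defining formula" statements of the formula's Wikidata item.
QID_TO_INTENT = {
    "QID_ENERGY": "energy",
    "QID_MASS": "mass",
}


def annotate_with_intents(mathml, symbol_to_qid, qid_to_intent=QID_TO_INTENT):
    """Set an intent attribute on each <mi> whose symbol maps to a known QID."""
    root = ET.fromstring(mathml)
    for elem in root.iter():
        # Drop a "{namespace}" prefix if the input happens to be namespaced.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag != "mi":
            continue
        symbol = (elem.text or "").strip()
        intent = qid_to_intent.get(symbol_to_qid.get(symbol))
        if intent is not None:
            elem.set("intent", intent)
    return ET.tostring(root, encoding="unicode")


# Illustrative input: symbols as they might be listed for one formula.
sample = (
    "<math><mi>E</mi><mo>=</mo><mi>m</mi>"
    "<msup><mi>c</mi><mn>2</mn></msup></math>"
)
annotated = annotate_with_intents(
    sample, {"E": "QID_ENERGY", "m": "QID_MASS"}
)
```

Symbols without a mapped QID (here, c) are left untouched, so a partial
table degrades to ordinary speech rather than to a wrong reading.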

We created an initial dataset of Wikipedia pages with formulas and their
corresponding Wikidata entries, based on the Intent List. This could serve
as a first evaluation dataset:
https://docs.google.com/spreadsheets/d/1ugMjTiPYCe1NGOVr3Ug597aYegLcp3cSREXZuzKygd0/edit#gid=1358098730

To evaluate the quality of the intents annotated from Wikidata, we plan to
use the MathCAT browser extension to read the annotated formulas with
intents and compare the result to speech synthesized without intent by
Speech Rule Engine.

Central questions currently:

   - Does it make sense to annotate with Wikidata, or to use direct
   annotation within Wikipedia?
   - How should the mapping from Wikidata properties and symbolic
   references to intent attributes be generated within the Chromium
   extension?

We would be happy to have some feedback and to bring this up on the W3C
mailing list.

MoS: https://w.wiki/75nt shows a list of content MathML entries linked in
Wikidata, with translations into different languages.

MuS: This is one way to solve a major problem that I have seen with the
entire intent approach and that is we need some way for authors to get this
information in and this seems like a genuinely nice conduit for that.

MuS: I also really want to get it into a LaTeX package of some kind that we
can standardize on and make it easy for authors to input the information
that way as well.

NS: DG was big on Wikidata as a source. So, have you spent time looking at
its stability and fragility?

DG: That's not my focus. I like it as a source.

DG: It is for their community to decide, and I am not part of the community
now. I am just listening curiously.

MuS: Wikipedia has a lot of vetting already and I would think that the
vetting could include checking out the QIDs.

NS: Well, I am guessing that what the Wikipedia people are saying is yes,
we have a board that ultimately controls whether something stays or goes in
Wikipedia, but are they saying there is nothing like that for Wikidata?

NS: It is a wild wild west and so it is not safe for us.

(LM: At this time, I lost the transcript from 12:20 PM through 12:28 PM.)

DG: I am outside of the whole thing. I do not have any answers for you, I
just have questions.

NS: So, I just went to the Wikidata page for Schrödinger's equation, and I
am confused because you said there were seventy-two translations.

NS: Oh yes, they are all there. There are a bunch more, but they have no
description defined.

NS: There are not that many actual translations there. There is an
impressive amount, but it is not seventy-two because most of the entries
have no description.

MoS: There are links to 72 pages, and there are even more if you click
"show all languages". All these languages have the label, so the name of
the Schrödinger equation is translated.

MoS: I mean, you see four in the beginning, and then you must click "show
all languages" to see them all.

NS: So, it is the label, as opposed to the description, which is the
important thing.

NS: So, DC, you were at a tech conference. Was there any discussion on
intent there or not?

DC: Yes, in some dinners.

DC: Also, there was a lot of discussion about PDF tagging.

DC: For the outside world who are not on the Zoom calls, intent is not
really a thing.

BM: The whole universe of issues relevant to readers and authors is
effectively paralyzing for someone like me trying to design such a set of
semantic macros.

BM: When I think of what you are trying to do in Wikipedia, I would almost
think that the more interesting approach would be that all the macros must
convey is the QID for a concept in Wikidata, and then leave the rest to
tools like the ones you are developing.

BM: So, the macro mechanism could be simple. You just associate this QID
and leave it to JoS's tools.

NS: Somebody was complaining that MathCAT was reading a simple matrix
definition wrong. So, it was reading A sub eleven and A sub twelve instead
of A sub (1,2) and A sub (2,2). Well, it was meant as A sub 1,2 and A sub
2,2, but the author did not write it that way.

NS: You cannot stop authors from writing something that looks OK but is not.
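
The kind of markup difference NS describes can be sketched as follows. The
two MathML fragments and the checker are illustrative assumptions, not the
markup from the reported case and not MathCAT's logic; the function name is
hypothetical. The heuristic simply flags a subscript that is one
multi-digit <mn>, which a reader has no choice but to speak as a single
number.

```python
import xml.etree.ElementTree as ET

# A single <mn>11</mn> subscript: visually fine, but it can only be read
# as the number eleven.
AMBIGUOUS = "<msub><mi>A</mi><mn>11</mn></msub>"

# Two separate indices joined by U+2063 INVISIBLE SEPARATOR: renders the
# same, but the two indices are distinguishable in the markup.
EXPLICIT = (
    "<msub><mi>A</mi>"
    "<mrow><mn>1</mn><mo>&#x2063;</mo><mn>1</mn></mrow>"
    "</msub>"
)


def multi_digit_subscripts(mathml):
    """Return the text of each <msub> subscript that is one multi-digit <mn>."""
    root = ET.fromstring(mathml)
    flagged = []
    # Element.iter() also visits the root element itself.
    for msub in root.iter("msub"):
        children = list(msub)
        if len(children) != 2:
            continue  # malformed <msub>; nothing to check
        subscript = children[1]
        text = (subscript.text or "").strip()
        if subscript.tag == "mn" and text.isdigit() and len(text) > 1:
            flagged.append(text)
    return flagged


ambiguous_hits = multi_digit_subscripts(AMBIGUOUS)
explicit_hits = multi_digit_subscripts(EXPLICIT)
```

A check like this can only warn the author; it cannot decide whether "11"
was meant as eleven or as the pair 1,1.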

NS was looking for conclusions to tell JoS.

DC: I mean, other than offering broad support, it would be hard to give any
specific advice.

NS: Moving to another topic: BB, is there any progress with the extension?

BB: I do not have news. I was away for a few days. I do not think anything
happened.

NS: There was some discussion, and I did not reply to it.

NS: I just checked the stuff in and did not reply to their questions.

BB: Yeah, I saw your changes. I have a question for you: Should we change
the success criteria to the generic test text from the template, or should
we keep the text that is currently there?

NS: There were some things on the success criteria that caused questions.

NS: I will put a link in the minutes about it.

NS: Some people said there were some technical things wrong with the
charter. For example, the boilerplate language for patents in the charter
may not be current.

DC: Is this a problem for the new charter, or just for our extension?

NS: For the new charter. We are officially unchartered, and I do not know
if we got an extension for the old charter.

*ACTION* NS: I better check to see that their questions are answered.

*ACTION* BB: I should also check the status.
2. Creating work assignments to move things forward.

NS: Not a lot of time is being spent on moving things forward. We are in
the summer doldrums.

NS: We need more entries in the properties and intent tables. No progress
has been made in July.

NS asked for ideas on how the process of filling out the tables could be
restarted.

DG: For the open lists, I have an interesting new pastime, which is talking
to GPT-4, and there is a chance that it could double the number of things
we have there.

NS: I do not want to be negative about GPT-4, but it can be wrong, and you
really do have to check. It can sound authoritative and be wrong.

We discussed the free and paid GPT programs. The paid GPT-4 program gives
you about fifty requests every three hours.

LM: GPT-3.5 was last updated in the summer of 2021. Microsoft will give you
GPT-4 for free.

NS: We need a plan to fill out the tables and complete the specification.

NS: It would help if people had specific assignments.

DC: How much effort would it be to use MathCAT as a rough draft for the
tables?

NS: MathCAT is not fully implemented and may not be useful in making these
lists.

DC: There is enough information in MathCAT.

NS: MathCAT has a limited number of examples and could be inspected by
hand. You do not have to write code to review MathCAT.

NS: You could read the MathCAT code and pull out 20-30 examples for the
tables.

DC: We have not published a working draft of the spec since September of
2022.

NS: We should publish one.

NS: Although we are not chartered, can we update the working draft?

NS: There is no formal process for updating the charter. We just need to
agree on things then update it.

BB: As long as the update process works you can update the spec.

BB: Because we are not chartered, eventually our name will be removed from
the database, but that is not yet the case, I think. So, it should still
work here.

DC: We should look at low-hanging issues before we publish a draft.

SN: Can we not update the charter as part of the MathML Refresh Community
Group and not as a chartered group?

NS: This will happen if we are not rechartered.

DC: I do not want to move the repos out of W3C space and then back again.

DC: Because the charter is on GitHub and is public, there is a tendency not
to publish.

NS: We will continue this discussion next week.

NS will work on the charter, and someone will work on the tables.

NS asked DG to help on these issues.

Received on Monday, 24 July 2023 00:58:07 UTC