Minutes: MathML general/semantics meeting Nov 19, 2019 from Neil Soiffer on 2019-11-19 (public-mathml4@w3.org from November 2019)

From: Neil Soiffer <soiffer@alum.mit.edu>
Date: Tue, 19 Nov 2019 15:10:31 -0800
To: public-mathml4@w3.org
Message-ID: <CAESRWkCieqq1PEq0GixQiUcBV+yNsdFGVx6AVHEo_XYm2fWiVw@mail.gmail.com>

Meeting was recorded:
https://benetech.zoom.us/recording/share/1hqif_WG9cr3Ll-fedy9HYlm8pWBvHE9sijK6h9d5guwIumekTziMw
Attendees:

Neil Soiffer

David Farmer

Bruce Miller

Sam Dooley

David Carlisle

Steve Noble

Informative vs Normative: mathrole

DC: I created a version of SD’s table and eliminated the parts that (in my
opinion) were redundant, keeping only the entries that are not obvious. It
went from ~1,100 entries to 29.

DC: It was a little overkill because I eliminated all but one of the units
(kg).

DC: It was mostly automated, so maybe that was overkill. But still, that’s
a big difference.

DC: It’s in an issue on GitHub:
<https://github.com/mathml-refresh/mathml/issues/141#issuecomment-555438214>

DF: what about unary vs binary?

DC: that is easy to recognize from its position.
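
A minimal sketch of DC’s point, assuming the simplest possible rule: an mo
with no operand to its left (it is first in the row, or it follows another
mo) is treated as unary, anything else as binary. The function name
operator_forms and the labels "unary"/"binary" are made up for illustration,
and the rule ignores cases such as an operator following a closing paren.

```python
import xml.etree.ElementTree as ET


def operator_forms(mrow_xml):
    """Return (symbol, form) for each <mo> that is a direct child of the row."""
    row = ET.fromstring(mrow_xml)
    children = list(row)
    forms = []
    for index, child in enumerate(children):
        if child.tag != "mo":
            continue
        # No left operand: first child, or the previous sibling is an operator.
        no_left_operand = index == 0 or children[index - 1].tag == "mo"
        symbol = (child.text or "").strip()
        forms.append((symbol, "unary" if no_left_operand else "binary"))
    return forms
```

For example, in a row marked up as minus, a, minus, b, the first minus would
be reported as unary and the second as binary.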

DC: Sometimes I put the negative in the mn (e.g., -2).

DF: can I add it anyway?

DC: yes, but it is an anti-pattern. You then need to check that it is
right. That’s more work.

NS: Putting -2 into mn is something I ran into while I was looking at
writing up something on subject areas. I had to double the code in many
cases to cope with different markup: <mn>-2</mn> vs. the minus sign kept
outside the mn.

Or people put invisible times between the letters of ABC (even for angles).

Other examples: e.g., parens around a table. You might think it is obviously
a matrix, but the 1x2 case might be \choose.

Much MathML is generated “wrong”, which makes it hard to infer meaning.

NS: Do we define a canonical MathML, so that general MathML can be
rewritten in a standardised form?
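
As a rough illustration of what one such canonicalization step could look
like (a sketch only, not something the group agreed on): the code below
rewrites a negative number stored inside a single mn into an mrow holding a
separate mo and mn, so later code sees only one form. The names
split_negative_numbers and canonicalize are hypothetical, and wrapping in
mrow is an assumption made here to keep the parent’s child count unchanged.

```python
import xml.etree.ElementTree as ET

MINUS_SIGNS = ("-", "\u2212")  # ASCII hyphen-minus and U+2212 MINUS SIGN


def split_negative_numbers(parent):
    """Rewrite each <mn>-N</mn> below `parent` as <mrow><mo>-</mo><mn>N</mn></mrow>."""
    for index, child in enumerate(list(parent)):
        text = (child.text or "").strip()
        if child.tag == "mn" and len(text) > 1 and text[0] in MINUS_SIGNS:
            # One child is replaced by one mrow, so parents such as msup
            # keep the same number of arguments.
            row = ET.Element("mrow")
            sign = ET.SubElement(row, "mo")
            sign.text = text[0]
            number = ET.SubElement(row, "mn")
            number.text = text[1:]
            row.tail = child.tail
            parent[index] = row
        else:
            split_negative_numbers(child)
    return parent


def canonicalize(mathml):
    """Parse a MathML string, apply the rewrite, and serialize it again."""
    root = ET.fromstring(mathml)
    split_negative_numbers(root)
    return ET.tostring(root, encoding="unicode")
```

Markup that already keeps the sign in its own mo passes through unchanged,
so the function can be applied to input of either style.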

DF: I’m in the pretext world. In my world, the authors are willing to make
small changes to let screen readers get it right.

DF: Aren’t you heading in the same direction?

NS: If the math is read incorrectly, people blame the reading application,
not the author. So these applications need to cope with many cases of poor
markup and cannot assume that everything is well marked up initially to
support good readings.

NS: Similar to the way browsers fix bad HTML. In olden times, each browser
did the fix-up differently. So someone would have bad markup, look at it in
their favorite browser (IE, back then), and someone else would look at it in
Firefox and the page would look terrible. They would blame Firefox, not the
author. Today, HTML specifies the fixes so all browsers deal with “incorrect”
markup the same way.

NS: It would be useful to have a notion of canonical MathML so we had the
same thing and that all semantics would be interpreted the same.

SN: I agree it would be useful to have everyone be on the same page.

BM: I can sympathize with that. But you need to be careful.

BM: I also sympathize with DC’s point of view of not stating the obvious.

BM: Yes, “plus” means some sort of addition. Perhaps there should be a line
between that and “this is a plus and this is what it means”.

BM: We have to deal with two things in LaTeXML -- we require disambiguation
on one side and on the other side (e.g., arXiv), we don’t have a clue
sometimes. So we have to guess.

BM: Maybe we need a way to indicate whether I’ve specified what I mean.

NS: Are you suggesting a flag?

BM: Yes, maybe something that says “I’ve provided the semantics, don’t
guess”.

DF: What if that is a subject area attr?

BM: I like that, but that’s maybe a research project.

NS: DF and I have been working on that. It is open ended, but like
mathrole, we can probably do 99% of what people encounter.

DC: If you read the syntax, you can understand it. A trained mathematician
can make a reasonable guess how to read something that they don’t
understand.

BM: What if you heard x^2 when it wasn’t really a square? Is that a problem?

NS: There is a verbose reading style called MathSpeak that is based on the
braille Nemeth code. It says “x superscript 2 baseline”. But most people
are not used to hearing that. So they developed alternative pronunciations
that include some semantics.

NS: My belief is that people get used to hearing the wrong thing and then
translating it in their heads.

NS: I’ve never seen a study of a syntactic vs semantic reading style, but
all of the work I’ve seen has gone into doing a better job with semantics
reading because that is what people hear in classrooms, etc.

DC: But there might not be a “right way”. You might be reading it and
never have heard it pronounced.

NS: If there is a right way, as in logic where a fraction is not really a
fraction, we should pronounce it correctly.

NS: If the superscript is not a number, maybe you say “f star” and not “f
to the star”.

NS: So the annotation goes on the sup and not on the operator. Therefore
we need to allow annotation on anything.

NS: Non-English countries might use other notations, such as ]a,b[ for an
open interval in French as opposed to (a, b) in English. The MathML spec
lists 10 different long division styles. We need to bring in some
non-English speakers to help us out with ambiguous notations in other
languages.

SD: let me try to summarize…

SD: we want a mathrole attr, we want it to be optional, we want to be able
to put it on any MathML element

SD: we’ve talked about an open ended list of values, but probably
informative. So maybe not put in the spec. Maybe into a note.

BM: in common areas, we want people to use the same name/value

SD: we want to have a collection of obvious defaults. They should be
obvious enough that in a large % of cases you don’t need to specify the
mathrole to get the intended semantics. E.g., mathrole="plus" for
<mo>+</mo> is not needed.

SD: Maybe a way of saying mathrole="unknown".

SD: What I heard from DF is that there is a lot of utility in providing a
subject area attr. That will allow you to override the defaults with other
defaults.

DF: There are things where there is no obvious default for a subject area.
But it mostly works. Where do I put the subject attr?

NS: The subject attribute needs to be on the math. We don’t control HTML
in general and often we can’t get the context.
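
One way SD’s summary and the subject attribute could fit together, sketched
under several assumptions: an explicit mathrole wins, otherwise a
subject-specific default is used, otherwise a general default, otherwise
"unknown". The attribute name subject, the subject "calculus", the role
names, and the function resolve_roles are all hypothetical; none of them is
defined in any spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical default tables; the real lists were still under discussion.
GENERAL_DEFAULTS = {"+": "plus", "\u2192": "arrow"}
SUBJECT_DEFAULTS = {"calculus": {"\u2192": "goes-to"}}


def resolve_roles(mathml):
    """Return (symbol, role) for every <mo>, applying the fallback chain."""
    math = ET.fromstring(mathml)
    # The subject is read from the <math> element itself, not from the page.
    overrides = SUBJECT_DEFAULTS.get(math.get("subject"), {})
    roles = []
    for mo in math.iter("mo"):
        symbol = (mo.text or "").strip()
        role = (
            mo.get("mathrole")                        # explicit author markup
            or overrides.get(symbol)                  # subject-area default
            or GENERAL_DEFAULTS.get(symbol, "unknown")  # general default
        )
        roles.append((symbol, role))
    return roles
```

With tables like these, a plain <mo>+</mo> needs no mathrole at all, while
an author can still override any default on a single element.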

DF: For an example with an arrow in a limit, I want to say “goes to”, but
in other cases, I want to say something else.

DC: It depends on the context.

NS: Sometimes it is obvious what the meaning is, but it takes work to
recognize. The arrow in a limit is an example of that. Do we need to mark
that up or you just need to write a more complicated pattern?

NS: A simpler example is parens around an mtable -- probably a matrix, but
you need to match the mrow with the parens and/or brackets and mtable.
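
A sketch of the kind of pattern match NS describes, assuming the fences are
plain mo elements that are siblings of the mtable inside one mrow. The
function name match_fenced_table and the FENCE_PAIRS table are hypothetical.
The function only reports what it found (fences and table size) and leaves
the decision about the meaning to the caller.

```python
import xml.etree.ElementTree as ET

FENCE_PAIRS = {"(": ")", "[": "]"}


def match_fenced_table(mrow_xml):
    """Return (open, close, rows, columns) for <mrow> fence, mtable, fence; else None."""
    row = ET.fromstring(mrow_xml)
    children = list(row)
    if row.tag != "mrow" or len(children) != 3:
        return None
    opener, table, closer = children
    if opener.tag != "mo" or closer.tag != "mo" or table.tag != "mtable":
        return None
    open_text = (opener.text or "").strip()
    close_text = (closer.text or "").strip()
    # The closing fence must be the partner of the opening fence.
    if FENCE_PAIRS.get(open_text) != close_text:
        return None
    table_rows = table.findall("mtr")
    columns = max((len(r.findall("mtd")) for r in table_rows), default=0)
    return open_text, close_text, len(table_rows), columns
```

Returning the dimensions lets the caller treat small tables with more
caution than larger ones before calling the result a matrix.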

NS: that’s a good summary

NS: what’s the next step?

DC: we need to write some material for the spec. If the TAG looks at full
while looking at core, they need to know how full integrates with core.

NS: there are lots of things in full that have nothing to do with semantics.

NS: Should we pivot away from semantics for a little bit to do work on full?

DC: we need to get something in there

DC & NS: discussion of what needs to be done and how to do it in full… two
versions of full at the moment. One to be the spec and the other to be
notes.

NS: DC and I probably should take the lead on paring down chap 3 and
pushing details of layout to the core doc.

NS: what about content and mixing markup?

DC: I’d like to get rid of mixing markup and use mathrole. I don’t think
anyone ever produced mixed markup.

BM: I did, but I don’t know if I understood it :-) Not sure how many people
choose that option for LaTeXML.

DC: It was never normative, so I’d like it to be moved to the notes version.

Next meeting in two weeks.

Received on Tuesday, 19 November 2019 23:10:43 UTC