Re: A proposal for a computer readable mathematical object notation from Neil Soiffer on 2018-03-29 (public-mathonwebpages@w3.org from March 2018)

From: Neil Soiffer <soiffer@alum.mit.edu>
Date: Wed, 28 Mar 2018 21:26:09 -0700
To: Arno Gourdol <arno.gourdol@gmail.com>
Cc: Han <laughinghan@gmail.com>, mathonweb <public-mathonwebpages@w3.org>
Message-ID: <CAESRWkBx+F3YNq-fJv+kPyXVz_ic8SoEEu25nMJpuEvN=_aovA@mail.gmail.com>
Sorry to be so slow to answer, but between travel and a forgotten deadline
to get a paper done, I haven't had time to respond until now... and I only
have time to answer the higher level issues...

To begin with, JSON and XML *should* be equivalent in what they can
represent. I find it surprising that the standard json representation
messes up ordering. Many people, including Volker, feel this is a mistake (or
at least not useful) and it is easy to find libraries that preserve order
by using arrays instead of objects. Given one of those, it is trivial to
convert MathML to a json format that makes evaluation basically as trivial
as the example Arno showed. The difference is that array indexing is used
instead of keys. If we get past the (IMHO) irrelevant issue of JSON vs XML,
you can move to what the tree should have to represent the expression.
That's where you end up with differences in all the programs that parse
AsciiMath, TeX, MathML, etc -- everyone has a different idea of what info
they want to have in that internal tree. E.g., an editor probably wants to
keep around some positional information whereas something that generates
speech would not.
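
As a rough sketch of the array-based form described above: the
[tag, ...children] layout and the name evaluateOrdered are illustrative
assumptions, not the format of any existing library. It only handles
properly structured trees of numbers and the four binary operators.

```javascript
// Hypothetical order-preserving encoding: each element is [tagName, ...children],
// and a leaf's single child is its text. Arrays keep child order, so repeated
// tags such as two <mi> siblings are not a problem.
const xEqualsY = ['mrow', ['mi', 'x'], ['mo', '='], ['mi', 'y']];

// Evaluate a properly structured tree, using array indexes
// where a keyed format would use lhs / op / rhs.
function evaluateOrdered(node) {
  const [tag, ...children] = node;
  if (tag === 'mn') return parseFloat(children[0]);
  if (tag === 'mi') return children[0] === 'π' ? Math.PI : NaN;
  if (tag === 'mrow' && children.length === 1) return evaluateOrdered(children[0]);
  if (tag === 'mrow' && children.length === 3 && children[1][0] === 'mo') {
    const lhs = evaluateOrdered(children[0]);
    const rhs = evaluateOrdered(children[2]);
    switch (children[1][1]) {
      case '+': return lhs + rhs;
      case '-': return lhs - rhs;
      case '*': return lhs * rhs;
      case '/': return lhs / rhs;
    }
  }
  return NaN; // anything else (relations, unknown identifiers) is not evaluated
}

// (1 + 2) * 3
const expr = ['mrow',
  ['mrow', ['mn', '1'], ['mo', '+'], ['mn', '2']],
  ['mo', '*'],
  ['mn', '3']];
console.log(evaluateOrdered(expr)); // 9
```

What each node should carry beyond tag and children (positions, ids,
semantic roles) is exactly the open question; the sketch only shows that
ordering itself is not the obstacle.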

I haven't looked in detail about what info SRE adds, but looking at a
simple example from Volker's site I can see there is a mixture of semantic
and presentation information, along with ids that perhaps allow for
something richer than a tree.

To respond to Han: the problem with most math that is encountered in
documents (including web pages) is that it was entered by the author
without adding semantic information. No one wants to type in an editor and
say "this 'x' is a real valued number" just so that some evaluation down
the line knows how it interacts with some other variable. Indeed, people
don't even want to say whether h(x+y) is function application or
multiplication. Computational systems solve that problem because their
syntax defines what is meant. Neither TeX nor most WYSIWYG editors provide
that information in general. Somehow, humans mostly figure meaning out when
they read an expression. Using various patterns and context, MathPlayer's
speech rules and Volker's work take a few steps in that direction. It's not
easy though and having a library call that everyone could use would be
really useful. But that leads one back to my basic point -- what should be
in the tree that is built?
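
To make the h(x+y) ambiguity concrete, here is a toy heuristic. It is
only an illustration: the name list and the function
guessInvisibleOperator are assumptions made up for this sketch, and it
is not how MathPlayer or Volker's Speech Rule Engine decide.

```javascript
// Toy heuristic only. The set of "likely function" names is an assumption.
const LIKELY_FUNCTION_NAMES = new Set(['f', 'g', 'h', 'sin', 'cos', 'tan', 'log', 'ln', 'exp']);

// Guess the invisible operator between an identifier and a following
// parenthesized group: U+2061 FUNCTION APPLICATION or U+2062 INVISIBLE TIMES.
// declaredFunctions stands in for context gathered elsewhere in the document.
function guessInvisibleOperator(identifier, declaredFunctions = new Set()) {
  const isFunction =
    declaredFunctions.has(identifier) || LIKELY_FUNCTION_NAMES.has(identifier);
  return isFunction
    ? { role: 'function-application', mo: '\u2061' }
    : { role: 'multiplication', mo: '\u2062' };
}

console.log(guessInvisibleOperator('h').role);                 // function-application
console.log(guessInvisibleOperator('a').role);                 // multiplication
console.log(guessInvisibleOperator('a', new Set(['a'])).role); // function-application
```

Whatever rule is used, the guess has to be recorded somewhere in the
tree, which is one more piece of information the tree needs to hold.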

    Neil



On Fri, Mar 23, 2018 at 9:40 PM, Arno Gourdol <arno.gourdol@gmail.com>
wrote:

> Great feedback! Comments below inline.
>
>
> On Mar 22, 2018, at 11:26 AM, Neil Soiffer <soiffer@alum.mit.edu> wrote:
>
> I think the situation you describe about interchange is not anywhere near
>> as bad as you think. Most math programs will accept TeX or MathML as input
>> and can do computation with them (at least Mathematica and Maple do so with
>> MathML, and I think also TeX). Part of the reason those programs can do so
>> is because they already have a linear syntax and so inferring a bit of
>> additional information is not that bad.
>>
>
> Indeed. I’m not suggesting that for Mathematica or Maple these are
> problems. Those guys have probably already solved everything! :)
>
> But think of the little guys! Think of the mathematician, the physicist or
> even the high school student who wants to write their own math software on
> the web, using the tools of the web. Let’s say they want to do some
> symbolic manipulation. Perhaps comparing two formulas. Or indexing them and
> searching them. Or implementing some numeric algorithm.
>
> The first step they would have to do would be to transform the input, be
> it MathML, TeX or ASCIIMath, into something that can be computed upon. That
> involves transforming a linear stream of tokens into a structured
> mathematical representation. And indeed today, every single math related
> software library does that. I even do it three times in MathLive (once to
> convert to speech, once to convert to MathML and once to convert to MASTON.
> If I had been smarter, I would have started by converting to MASTON first
> and having a single set of consistent rules on how to interpret the input
> stream).
>
> Not only is there tremendous wasted effort, but the results are
> inconsistent, as different rules are implemented to do the conversion from
> the ambiguous token stream. I believe the ambiguity should be solved once,
> when the formula is created, which seems to be the best time because:
> 1/ the user is available then to clearly communicate their intent
> 2/ the formula will be created once, but consumed many times
>
>
> As an example, there’s a large portion of MathJS which is spent parsing a
> pseudo-ASCIIMath syntax into something they can actually compute on.
> Wouldn’t it be nice if they could focus on the computing aspect, rather
> than having to implement yet another parser that may or may not have the
> implicit rules you expect (is “2 1/2” 2.5 or 1?).
>
> As for your proposal, I don't see what it adds that "properly" structured
>> MathML doesn't already provide. By properly structured MathML, I mean
>> MathML where the mrows reflect the abstract syntax of the mathematical
>> expression. E.g., for a binary infix operator, the first element in the mrow
>> will be an operand, the second the infix operator, and the third an
>> operand. If the second element is a relational operator such as '=', then
>> the lhs is the first element and the rhs is the third element. The elements
>> might be leaf nodes such as mi's or complicated expressions inside mrows.
>> MathPlayer does this to infer semantics as does (I think) Volker's Speech
>> Rule Engine.
>>
>
> It seems to me that there’s quite a bit that needs to be clarified about
> what is a “properly” structured MathML. Perhaps there is a specification
> for that somewhere? I assume it would restrict things like
> *<mn>twenty-one</mn>* and *<mn>XXI</mn>*? If such a thing exists, modulo
> the label on the keys of the object, I believe you would indeed probably
> end up with something like MASTON. In that case, just consider it an
> attempt at actually documenting this :)
>
>
> {
>   "mrow": {
>     "msup": {
>       "mi": "ⅇ",
>       "mrow": {"mi": ["ⅈ", "π"], "mo": "⁢"}
>     },
>     "mo": "+",
>     "mn": "1"
>   },
>   "mo": "=",
>   "mn": "0"
> }
>>
>>
>
> I believe that having a JSON format is really important. XML applications
> may be wonderful, but they are still a major pain to parse. Here’s the
> complete implementation of a four-operation calculator on a MASTON
> structure:
>
> function evaluate(ast) {
>   // Check numbers first so that a literal 0 is not rejected as falsy.
>   if (typeof ast === 'number') return ast;
>   if (!ast) return NaN;
>   // A node may wrap its value in a "num" property.
>   if (ast.num !== undefined) return evaluate(ast.num);
>   if (typeof ast === 'string') {
>     if (ast === 'π') return Math.PI;
>     return parseFloat(ast);
>   }
>   // Binary operators: evaluate both operands recursively.
>   if (ast.op === '*') {
>     return evaluate(ast.lhs) * evaluate(ast.rhs);
>   } else if (ast.op === '/') {
>     return evaluate(ast.lhs) / evaluate(ast.rhs);
>   } else if (ast.op === '-') {
>     return evaluate(ast.lhs) - evaluate(ast.rhs);
>   } else if (ast.op === '+') {
>     return evaluate(ast.lhs) + evaluate(ast.rhs);
>   }
>   // Unknown node type or operator.
>   return NaN;
> }
>
>
> I shudder to think of writing the same thing on a MathML tree, even a
> ‘canonical’ one.
>
> That said, the mapping of MathML to JSON is not completely
> straightforward. MathML has made some choices that work well in an XML
> environment, but are just really difficult to translate into a JSON
> structure. For example, what would be the JSON version of
> “<mi>x</mi><mo>=</mo><mi>y</mi>”? You can’t have two identical keys in
> the same object, so it can’t be:
>
> {
>     "mi": "x",
>     "mo": "=",
>     "mi": "y"       // second “mi” key! Can’t have that.
> }
>
>
>
> You can’t rely on the order of the keys to figure out which is the lhs and
> which is the rhs of an operator. So, you need something a bit more
> structured. I think if you work this through, you would indeed land on
> something close to MASTON. I’d be happy to call it “MathML for JSON” if
> that helps :)
>
>
> Note: although many operators such as '=' are formally binary infix
>> operators, in practice, they are better treated as n-ary operators because
>> they often are used as "a = b = c" or even "a = b <= c = d" and both layout
>> and computation are often easier when all are at the same level.
>>
>
> That is a great point. I’ve been debating this, and had erred on the side
> of keeping things as simple as possible, but I can see that point.
>
> BTW, I can see this working well for associative operations (addition,
> multiplication, maybe equality), but how would this work in the example
> where you have different relational operators?
>
>
> I believe several people have tried at various times to formulate a
>> "canonical MathML" format that includes properly structured mrows, one of
>> the canonical Unicode forms, and fixing up pseudo scripts
>> <https://www.w3.org/TR/MathML3/chapter7.html#chars.pseudo-scripts> among
>> some other things. Note that proper mrow nesting depends upon guessing
>> between function call and multiplication if the author didn't indicate
>> which to use (e.g., in MathML, one should use the invisible operators), so
>> the parsing can be complicated. Of course, parsing alone doesn't provide
>> semantics. I believe that Volker adds some attribute that is equivalent to
>> adding something like a "mathrole" to various elements to specify semantics
>> -- some "obvious", some derived from other context, etc.
>>
>> I think it would be helpful to develop a tool that anyone can incorporate
>> to produce canonical MathML.
>>
>
> Well, maybe this is yet another attempt at defining this canonical MathML,
> in an easy to parse JSON format. I don’t want to reinvent the wheel, so if
> there are things that exist out there, I’d be more than happy to at least
> start from them (maybe I just need to JSON-ify one of those forms).
>
>
> On Mar 23, 2018, at 11:22 AM, Han <laughinghan@gmail.com> wrote:
>
> Do those "Most math programs" actually output this restricted, "properly"
> structured MathML? To me, the value in something like MASTON is not what it
> *adds* over MathML, it's what it *removes*. Volker's Speech Rule Engine
> is a whole project unto itself at least in part because inferring structure
> from MathML that ChromeVox encounters "in the wild" is so hard, no? That's
> also why Content MathML and OpenMath exist, because people have seen the
> need for this information to be part of the data structure and not have to
> be inferred, before. (I would be curious to hear if Arno considered those?)
>
>
> Yes, I did consider Content MathML. But as per above, I really wanted
> something that would be much easier to consume, and for me that meant
> having a JSON-based format. OpenMath seems interesting, but far from
> lightweight, and it’s also an XML application.
>
>
> (Disclaimer: I specced out and built a toy parser for a JSON format for
> display math, analogous to TeX and Presentation MathML but in contrast to
> MASTON and Content MathML and OpenMath, that I called MathSON
> <https://gist.github.com/laughinghan/4350e4438e6cfc951826>. I haven't
> decided if it's any good.)
>
>
> I can see some similarities (convergent evolution!) but also some
> differences. It seems that MathSON would be suitable as an internal data
> structure to represent and manipulate a formula while being edited. For
> example, you still have some string of tokens, while I believe that the
> relationship between tokens (be it multiplication, function application or
> what have you) should be resolved by the producer of the data structure, so
> that the consumers don’t have to ‘guess’. Frankly, I don’t know if the same
> data structure can be used for editing and for semantic representation. In
> particular you would have to deal with the tricky problem of representing
> selection ranges that cross semantic structures. Dunno, maybe there’s a way
> to bridge that gap… Still would be nice if MathQuill could at least output
> to MASTON :)
>
> Best,
> Arno.
>
>
>
>
Received on Thursday, 29 March 2018 04:50:46 UTC