Re: A proposal for a computer readable mathematical object notation

Great feedback! Comments below inline.


> On Mar 22, 2018, at 11:26 AM, Neil Soiffer <soiffer@alum.mit.edu> wrote:

> I think the situation you describe about interchange is not anywhere near as bad as you think. Most math programs will accept TeX or MathML as input and can do computation with them (at least Mathematica and Maple do so with MathML, and I think also TeX). Part of the reason those programs can do so is because they already have a linear syntax and so inferring a bit of additional information is not that bad.

Indeed. I’m not suggesting that these are problems for Mathematica or Maple. Those guys have probably already solved everything! :)

But think of the little guys! Think of the mathematician, the physicist or even the high school student who wants to write their own math software on the web, using the tools of the web. Let’s say they want to do some symbolic manipulation. Perhaps comparing two formulas. Or indexing them and searching them. Or implementing some numeric algorithm. 

The first thing they would have to do is transform the input, be it MathML, TeX or ASCIIMath, into something that can be computed upon. That involves transforming a linear stream of tokens into a structured mathematical representation. And indeed today, every single math-related software library does that. I even do it three times in MathLive: once to convert to speech, once to convert to MathML and once to convert to MASTON. If I had been smarter, I would have started by converting to MASTON, and would then have had a single, consistent set of rules for interpreting the input stream.
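To make that step concrete, here is a minimal sketch of the kind of parser everybody ends up writing. It only handles numbers, the four operations and parentheses, it does no error handling, and the function name `parse` is made up for this example. The keys (op, lhs, rhs, num) are the ones the calculator further down in this message reads.

```javascript
// Hypothetical sketch: a linear token stream goes in, nested objects come out.
function parse(input) {
    // Numbers, the four operators and parentheses; whitespace is skipped.
    const tokens = input.match(/\d+(\.\d+)?|[-+*\/()]/g) || [];
    let pos = 0;

    // primary := number | '(' sum ')'
    function primary() {
        const token = tokens[pos++];
        if (token === '(') {
            const inner = sum();
            pos++; // skip the closing ')'
            return inner;
        }
        return { num: token };
    }

    // product := primary (('*' | '/') primary)*
    function product() {
        let lhs = primary();
        while (tokens[pos] === '*' || tokens[pos] === '/') {
            const op = tokens[pos++];
            lhs = { op: op, lhs: lhs, rhs: primary() };
        }
        return lhs;
    }

    // sum := product (('+' | '-') product)*
    function sum() {
        let lhs = product();
        while (tokens[pos] === '+' || tokens[pos] === '-') {
            const op = tokens[pos++];
            lhs = { op: op, lhs: lhs, rhs: product() };
        }
        return lhs;
    }

    return sum();
}

console.log(JSON.stringify(parse('1 + 2 * 3')));
// {"op":"+","lhs":{"num":"1"},"rhs":{"op":"*","lhs":{"num":"2"},"rhs":{"num":"3"}}}
```

Even this toy has to bake in a rule (multiplication binds tighter than addition), and every library bakes in its own.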

Not only is there tremendous wasted effort, but the results are inconsistent, because each library implements different rules to convert the ambiguous token stream. I believe the ambiguity should be resolved once, when the formula is created, which seems to be the best time because:
1/ the user is available then to clearly communicate their intent 
2/ the formula will be created once, but consumed many times


As an example, a large portion of MathJS is spent parsing a pseudo-ASCIIMath syntax into something it can actually compute on. Wouldn’t it be nice if it could focus on the computing aspect, rather than having to implement yet another parser that may or may not have the implicit rules you expect? (Is “2 1/2” 2.5 or 1?)
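To spell out the “2 1/2” case, here are the two readings written as explicit structures. This is only an illustration: the variable names are made up, and the keys are the ones used by the calculator further down in this message.

```javascript
// Two possible readings of "2 1/2", written with the op/lhs/rhs/num keys.
const half = { op: '/', lhs: { num: '1' }, rhs: { num: '2' } };

// Reading 1: a mixed fraction, 2 + 1/2, which is 2.5
const mixedFraction = { op: '+', lhs: { num: '2' }, rhs: half };

// Reading 2: an implicit multiplication, 2 * (1/2), which is 1
const implicitProduct = { op: '*', lhs: { num: '2' }, rhs: half };

// Once the producer has picked one, the two can no longer be confused.
console.log(JSON.stringify(mixedFraction) === JSON.stringify(implicitProduct)); // false
```

Whoever creates the formula knows which one they meant; a consumer looking at the token stream can only guess.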

> As for your proposal, I don't see what it adds that "properly" structured MathML doesn't already provide. By properly structured MathML, I mean MathML where the mrows reflect the abstract syntax of the mathematical expression. E.g., for a binary infix operator, the first element in the mrow will be an operand, the second the infix operator, and the third an operand. If the second operator is a relational operator such as '=', then the lhs is the first element and the rhs is the third element. The elements might be leaf nodes such as mi's or complicated expressions inside mrows. MathPlayer does this to infer semantics as does (I think) Volker's Speech Rule Engine.

It seems to me that quite a bit needs to be clarified about what “properly” structured MathML is. Perhaps there is a specification for that somewhere? I assume it would restrict things like <mn>twenty-one</mn> and <mn>XXI</mn>? If such a thing exists, then modulo the labels on the keys of the object, I believe you would indeed probably end up with something like MASTON. In that case, just consider it an attempt at actually documenting this :)


> {
>   "mrow": {
>     "msup": {
>       "mi": "ⅇ",
>       "mrow": {"mi": ["ⅈ","π"],"mo": "⁢"}
>     },
>     "mo": "+",
>     "mn": "1"
>   },
>   "mo": "=",
>   "mn": "0"
> }


I believe that having a JSON format is really important. XML applications may be wonderful, but they are still a major pain to parse. Here’s the complete implementation of a four-operation calculator on a MASTON structure:

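function evaluate(ast) {
    // Numbers evaluate to themselves (checked first so that 0 is not rejected below)
    if (typeof ast === 'number') return ast;
    // Strings are either a known constant or a numeric literal
    if (typeof ast === 'string') {
        if (ast === 'π') return Math.PI;
        return parseFloat(ast);
    }
    // Anything else must be an object; null and undefined have no value
    if (!ast) return NaN;
    // A number node wraps its value in the "num" key
    if (ast.num !== undefined) return evaluate(ast.num);
    // Binary operators: evaluate both operands recursively
    if (ast.op === '*') {
        return evaluate(ast.lhs) * evaluate(ast.rhs);
    } else if (ast.op === '/') {
        return evaluate(ast.lhs) / evaluate(ast.rhs);
    } else if (ast.op === '-') {
        return evaluate(ast.lhs) - evaluate(ast.rhs);
    } else if (ast.op === '+') {
        return evaluate(ast.lhs) + evaluate(ast.rhs);
    }
    // Unknown node
    return NaN;
}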


I shudder to think of writing the same thing on a MathML tree, even a ‘canonical’ one.

That said, the mapping of MathML to JSON is not completely straightforward. MathML has made some choices that work well in an XML environment, but are just really difficult to translate into a JSON structure. For example, what would be the JSON version of “<mi>x</mi><mo>=</mo><mi>y</mi>”? You can’t have two identical keys in the same object, so it can’t be:

{
    "mi": "x",
    "mo": "=",
    "mi": "y"       // second “mi” key! Can’t have that.
}



You can’t rely on the order of the keys to figure out which is the lhs and which is the rhs of an operator. So you need something a bit more structured. I think if you work this through, you would indeed land on something close to MASTON. I’d be happy to call it “MathML for JSON” if that helps :)
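Here is a quick way to see the problem, followed by what the more structured version could look like. The variable names are made up for this sketch; the op/lhs/rhs keys are the ones the calculator above reads.

```javascript
// JSON.parse quietly keeps only the last value of a duplicated key,
// so the "x" is lost before any consumer gets to see it.
const lossy = JSON.parse('{"mi": "x", "mo": "=", "mi": "y"}');
console.log(JSON.stringify(lossy)); // {"mi":"y","mo":"="}

// Naming the role of each operand avoids both problems: no duplicate keys,
// and no dependence on the order in which the keys happen to be stored.
const equation = { op: '=', lhs: 'x', rhs: 'y' };
const roundTrip = JSON.parse(JSON.stringify(equation));
console.log(roundTrip.lhs, roundTrip.op, roundTrip.rhs); // x = y
```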


> Note: although many operators such as '=' are formally binary infix operators, in practice, they are better treated as n-ary operators because they often are used as "a = b = c" or even "a = b <= c = d" and both layout and computation are often easier when all are at the same level.

That is a great point. I’ve been debating this, and so far have erred on the side of keeping things as simple as possible, but I can see your point.

BTW, I can see this working well for associative operations (addition, multiplication, maybe equality), but how would this work in the example where you have different relational operators?
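To make the question concrete, here is one shape I could imagine. It is purely a guess: the `args` and `ops` keys, the 'chain' label and the `pairs` helper are all made up for this sketch and are not part of the proposal.

```javascript
// Associative case: a + b + c as a single n-ary node.
const sum = { op: '+', args: ['a', 'b', 'c'] };

// Mixed case: a = b <= c = d. The operands and the relations between them
// are kept in parallel arrays, all at the same level.
const chain = { op: 'chain', args: ['a', 'b', 'c', 'd'], ops: ['=', '<=', '='] };

// A consumer that prefers binary relations can expand the chain pairwise.
function pairs(node) {
    return node.ops.map((op, i) => ({ op: op, lhs: node.args[i], rhs: node.args[i + 1] }));
}

console.log(JSON.stringify(pairs(chain)));
// [{"op":"=","lhs":"a","rhs":"b"},{"op":"<=","lhs":"b","rhs":"c"},{"op":"=","lhs":"c","rhs":"d"}]
```

The cost is that the mixed case needs a second array that the associative case does not, which is the kind of complexity I was trying to avoid.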


> I believe several people have tried at various times to formulate a "canonical MathML" format that includes properly structured mrows, one of the canonical Unicode forms, and fixing up pseudo scripts <https://www.w3.org/TR/MathML3/chapter7.html#chars.pseudo-scripts> among some other things. Note that proper mrow nesting depends upon guessing between function call and multiplication if the author didn't indicate which to use (e.g, in MathML, one should use the invisible operators), so the parsing can be complicated. Of course, parsing alone doesn't provide semantics. I believe that Volker adds some attribute that is equivalent to adding something like a "mathrole" to various elements to specify semantics -- some "obvious", some derived from other context, etc.
> 
> I think it would be helpful to develop a tool that anyone can incorporate to produce canonical MathML.

Well, maybe this is yet another attempt at defining this canonical MathML, in an easy to parse JSON format. I don’t want to reinvent the wheel, so if there are things that exist out there, I’d be more than happy to at least start from them (maybe I just need to JSON-ify one of those forms).


> On Mar 23, 2018, at 11:22 AM, Han <laughinghan@gmail.com> wrote:
> 
> Do those "Most math programs" actually output this restricted, "properly" structured MathML? To me, the value in something like MASTON is not what it adds over MathML, it's what it removes. Volker's Speech Rule Engine is a whole project unto itself at least in part because inferring structure from MathML that ChromeVox encounters "in the wild" is so hard, no? That's also why Content MathML and OpenMath exist, because people have seen the need for this information to be part of the data structure and not have to be inferred, before. (I would be curious to hear if Arno considered those?)


Yes, I did consider Content MathML. But as per above, I really wanted something that would be much easier to consume, and for me that meant having a JSON-based format. OpenMath seems interesting, but far from lightweight, and it’s also an XML application.

> 
> (Disclaimer: I specced out and built a toy parser for a JSON format for display math, analogous to TeX and Presentation MathML but in contrast to MASTON and Content MathML and OpenMath, that I called MathSON <https://gist.github.com/laughinghan/4350e4438e6cfc951826>. I haven't decided if it's any good.)


I can see some similarities (convergent evolution!) but also some differences. It seems that MathSON would be suitable as an internal data structure to represent and manipulate a formula while it is being edited. For example, you still have a string of tokens, while I believe that the relationship between tokens (be it multiplication, function application or what have you) should be resolved by the producer of the data structure, so that the consumers don’t have to ‘guess’. Frankly, I don’t know if the same data structure can be used for editing and for semantic representation. In particular, you would have to deal with the tricky problem of representing selection ranges that cross semantic structures. Dunno, maybe there’s a way to bridge that gap… Still, it would be nice if MathQuill could at least output to MASTON :)

Best,
Arno.

Received on Saturday, 24 March 2018 04:41:14 UTC