Content MathML editing language: Binary => n-ary syntax

Hi all,

I have been developing a plain-text input language for the content 
MathML in CellML documents, so that users can edit it efficiently. 
Although CellML is a declarative language rather than a procedural one, 
much of the expression syntax of MATLAB and C-like languages can be 
re-used, and is likely to be familiar to many users. I have therefore 
tried to make my input language similar to these languages where 
possible: for example, I use in-order (infix) operators +, -, *, and / 
for plus, minus, times, and divide, and a pre-order (prefix) syntax, 
e.g. sin(x), for other operators. Please ask off-list for my bison 
grammar (a work in progress) if you would like to see it; I haven't 
checked it in to the public Subversion repository yet. Any opinions or 
suggestions on the overall structure of the language would be welcome.

I am also seeking opinions on the most intuitive way to deal with the 
conversion between binary in-order operators like + and the n-ary 
operators of content MathML. For example, if, within the input language, 
you have
x = a + b + c + d + e + f
a naive parser might create MathML like...

<apply><eq/>
  <ci>x</ci>
  <apply><plus/>
    <apply><plus/>
      <apply><plus/>
        <apply><plus/>
          <apply><plus/>
            <ci>a</ci>
            <ci>b</ci>
          </apply>
          <ci>c</ci>
        </apply>
        <ci>d</ci>
      </apply>
      <ci>e</ci>
    </apply>
    <ci>f</ci>
  </apply>
</apply>

A slightly more complex parser might instead produce:
<apply><eq/>
  <ci>x</ci>
  <apply><plus/>
    <ci>a</ci>
    <ci>b</ci>
    <ci>c</ci>
    <ci>d</ci>
    <ci>e</ci>
    <ci>f</ci>
  </apply>
</apply>
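
The difference between the two parsers can be sketched in a few lines. 
This is a minimal illustration in Python, not an excerpt from the bison 
grammar: the tuple representation and the names N_ARY, flatten and 
to_mathml are hypothetical, and it assumes that only plus and times are 
safe to merge.

```python
# Hypothetical representation: a variable is a string, an application is
# a tuple (operator_name, operand, operand, ...).
N_ARY = {"plus", "times"}  # assumed set of operators that may be merged


def flatten(node):
    """Merge nested applications of the same mergeable operator."""
    if isinstance(node, str):
        return node
    op, *args = node
    merged = []
    for arg in map(flatten, args):
        if isinstance(arg, tuple) and arg[0] == op and op in N_ARY:
            merged.extend(arg[1:])  # lift the child's operands into this node
        else:
            merged.append(arg)
    return (op, *merged)


def to_mathml(node, indent=0):
    """Serialise a tree as content MathML, two spaces per nesting level."""
    pad = "  " * indent
    if isinstance(node, str):
        return f"{pad}<ci>{node}</ci>"
    op, *args = node
    lines = [f"{pad}<apply><{op}/>"]
    lines += [to_mathml(arg, indent + 1) for arg in args]
    lines.append(f"{pad}</apply>")
    return "\n".join(lines)


# What a naive left-associative parse of x = a + b + c + d + e + f gives.
naive = ("eq", "x",
         ("plus", ("plus", ("plus", ("plus", ("plus", "a", "b"), "c"),
                            "d"), "e"), "f"))
flat = flatten(naive)
print(to_mathml(flat))
```

Printing to_mathml(naive) instead gives the nested form. Operators such 
as minus and divide are left alone in this sketch, because regrouping 
them would change the meaning.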

I would be interested in opinions on whether you feel this automatic 
translation from multiple binary in-order operators to a single 
pre-order operation makes sense. (Note: all CellML tools available now 
work with real numbers only, over which + is associative, so the two 
encodings mean the same thing; but future work could allow this to be 
extended to support other mathematical constructs. Using a definitionURL 
on an operator is technically valid CellML, but no tools can do anything 
with this either.)

The issue is complicated by what to do with bracketed expressions (which 
I currently allow, to override the default precedence and 
associativity). For example, a user could enter...
x = (((((a + b) + c) + d) + e) + f)
I would be interested to know whether you believe the first content 
MathML encoding or the second is more appropriate for this input.
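
To make the question concrete, here is one possible policy, sketched in 
Python purely as an illustration: the parser records whether the user 
wrote explicit brackets, and only unbracketed sums are merged. The Apply 
class, its grouped flag and flatten_ungrouped are hypothetical names, 
not part of the current grammar, and this is only one of the options 
being asked about.

```python
from dataclasses import dataclass


@dataclass
class Apply:
    op: str                # operator name, e.g. "plus"
    args: list             # operands: variable names (str) or Apply nodes
    grouped: bool = False  # True if the user bracketed this sub-expression


def flatten_ungrouped(node):
    """Merge nested plus nodes unless the user bracketed them explicitly."""
    if isinstance(node, str):
        return node
    merged = []
    for arg in map(flatten_ungrouped, node.args):
        if (isinstance(arg, Apply) and arg.op == node.op == "plus"
                and not arg.grouped):
            merged.extend(arg.args)  # unbracketed: lift operands up
        else:
            merged.append(arg)       # bracketed or different operator: keep
    return Apply(node.op, merged, node.grouped)


# a + b + c, as a left-associative parser would build it
unbracketed = Apply("plus", [Apply("plus", ["a", "b"]), "c"])
# (a + b) + c, with the inner brackets recorded by the parser
bracketed = Apply("plus", [Apply("plus", ["a", "b"], grouped=True), "c"])

print(len(flatten_ungrouped(unbracketed).args))  # 3
print(len(flatten_ungrouped(bracketed).args))    # 2
```

Under this policy the fully bracketed input above would keep the nested 
encoding, while the unbracketed one would become a single n-ary plus. 
The alternative is to ignore the flag and always merge.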

Best regards,
Andrew Miller

Received on Sunday, 19 November 2006 23:04:08 UTC