Re: [cellml-discussion] Content MathML editing language: Binary =>n-ary syntax from Andrew Miller on 2006-11-20 (www-math@w3.org from November 2006)

From: Andrew Miller <ak.miller@auckland.ac.nz>
Date: Mon, 20 Nov 2006 17:31:12 +1300
To: "For those interested in contributing to the development of CellML." <cellml-discussion@cellml.org>
Cc: www-math@w3.org
Message-ID: <45612F90.7000905@auckland.ac.nz>
David Nickerson wrote:
> Andrew Miller wrote:
>   
>> Hi all,
>>
>> I have been developing a plain-text input language for the content 
>> MathML in CellML documents (so that it can be efficiently edited by 
>> users). Although CellML is a declarative language rather than a 
>> procedural one, much of the expression syntax from languages like MATLAB 
>> and C-like languages can be re-used, and is likely to be familiar to 
>> many users. I have therefore tried to make my input language similar to 
>> these languages, where possible (for example, I have in-order operators 
>> like +, -, *, and / for plus, minus, times, divide, and a pre-order 
>> syntax, e.g. sin(x),  for other operators). Please ask off-list for my 
>> bison grammar (a work in progress) if you would like to see it (I 
>> haven't checked it in to the public Subversion yet). Any opinions or 
>> suggestions on the overall structure of the language would be welcome.
>>
>> I am also seeking opinions on the most intuitive way to deal with the 
>> conversion from binary in-order operators like + to MathML's n-ary 
>> operators. For example, if, within the input language, you have
>> x = a + b + c + d + e + f
>> a naive parser might create MathML like...
>>
>> <apply><eq/>
>>   <ci>x</ci>
>>   <apply><plus/>
>>     <apply><plus/>
>>       <apply><plus/>
>>         <apply><plus/>
>>           <apply><plus/>
>>             <ci>a</ci>
>>             <ci>b</ci>
>>           </apply>
>>           <ci>c</ci>
>>         </apply>
>>         <ci>d</ci>
>>       </apply>
>>       <ci>e</ci>
>>     </apply>
>>     <ci>f</ci>
>>   </apply>
>> </apply>
>>
>> A slightly more complex parser might instead produce:
>> <apply><eq/>
>>   <ci>x</ci>
>>   <apply><plus/>
>>     <ci>a</ci>
>>     <ci>b</ci>
>>     <ci>c</ci>
>>     <ci>d</ci>
>>     <ci>e</ci>
>>     <ci>f</ci>
>>   </apply>
>> </apply>
>>
>> I would be interested in opinions on whether you feel this automatic 
>> translation from multiple binary in-order operators to a single 
>> pre-order operation makes sense (note: all CellML tools available now 
>> work with real numbers only, but future work could allow it to be 
>> extended to support other mathematical constructs. Using a definitionURL 
>> on an operator is technically valid CellML, but no tools can do anything 
>> with this either).
>>
>> The issue is complicated by what to do with bracketed expressions (which 
>> I currently allow, to override ambiguity). For example, a user could 
>> enter...
>> x = (((((a + b) + c) + d) + e) + f)
>> I would be interested to know if you believe that the first content 
>> MathML encoding or the second is more appropriate.
>>     
>
> I think if the user has specifically entered the brackets then they 
> should be reflected in the generated MathML.
Could you please elaborate on why you think this? I can think of three 
possible rationales:
  1) Presentational reasons, i.e. to improve the way the math is 
     displayed to the user.
     => Note that content MathML doesn't really guarantee this anyway, 
     and there are other semantically 'redundant' brackets, such as 
     those in a+(b*c), that couldn't be stored in content MathML at all.
  2) You are concerned that it will change the semantics, either due to 
     definitionURL or due to generalising the CellML model to something 
     where the variables don't form a group under 
     addition/multiplication/division/subtraction?
  3) You feel that users should have that level of control over the 
     MathML, even when the semantics don't change?

I personally don't think a content MathML input language should even try 
to address 1. To me, 2 is important, although at the moment CellML only 
works on real numbers, and it is unlikely that it will ever work on 
anything other than reals, integers, complex numbers, vectors and 
matrices. Some of these don't define all the operators, but wherever an 
operator is defined it is still a binary operator, so the relationship 
between the MathML n-ary operators and the binary operators remains 
unchanged.

I am somewhat neutral on rationale 3. I don't like the idea of going 
into edit mode on an equation, not changing anything, and still having 
the form of the MathML change. At the same time, if the semantics are 
guaranteed not to change, I am not convinced that it matters.
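
To make the trade-off concrete, here is a rough sketch of one possible 
policy: unbracketed chains of + or * collapse into a single n-ary apply, 
explicit brackets are kept as nesting, and precedence-only brackets such 
as those in a+(b*c) leave no trace. It is written in Python purely for 
illustration and is not the bison grammar mentioned above; the names 
(Parser, chain, parse, to_mathml) are made up for this example, and it 
only handles identifiers, + - * / and brackets on the right-hand side.

```python
import re

def tokenize(text):
    # Identifiers, the four operators and brackets; anything else is ignored.
    return re.findall(r"[A-Za-z_]\w*|[-+*/()]", text)

class Parser:
    """Hypothetical recursive-descent parser for a tiny infix language."""

    def __init__(self, text):
        self.toks = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def chain(self, operand, ops):
        # ops maps an operator token to (MathML element name, is_nary).
        node = operand()
        open_nary = None  # n-ary apply built by *this* chain, if any
        while self.peek() in ops:
            name, nary = ops[self.next()]
            rhs = operand()
            if nary and open_nary is not None and open_nary[0] == name:
                open_nary[1].append(rhs)    # a + b + c: extend the same apply
            else:
                node = (name, [node, rhs])  # start a new (left-nested) apply
                open_nary = node if nary else None
        return node

    def expr(self):
        return self.chain(self.term, {"+": ("plus", True), "-": ("minus", False)})

    def term(self):
        return self.chain(self.atom, {"*": ("times", True), "/": ("divide", False)})

    def atom(self):
        tok = self.next()
        if tok == "(":
            node = self.expr()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            # A bracketed node is never extended by the enclosing chain,
            # so explicit brackets survive as nesting.
            return node
        if tok is None or not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

def to_mathml(node):
    if isinstance(node, str):
        return f"<ci>{node}</ci>"
    name, args = node
    return f"<apply><{name}/>" + "".join(to_mathml(a) for a in args) + "</apply>"

def parse(text):
    parser = Parser(text)
    node = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return to_mathml(node)
```

With this policy, parse("a + b + c") yields one <apply><plus/> with three 
operands, parse("(a + b) + c") keeps the inner apply, and 
parse("a + (b * c)") is identical to parse("a + b * c").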
>  If you think the second 
> chunk of MathML is more efficient (in some manner),
It is obviously more space-efficient. Most tools would probably find it 
more time-efficient to process as well. However, because the CCGS would 
generate identical C code from both forms (and most other CellML 
processing tools would eliminate such differences in some fashion or 
another before running the model), it wouldn't make the models run any 
faster after the initial processing hit.
>  then it might be 
> nice to prompt the user suggesting the removal of the brackets - i.e., 
> present them with a display of the more efficient syntax and prompt for 
> a simple yes or no to choose whether the more efficient version gets 
> used. (and probably have an option to turn this feature on and off.)
>   
I think it would almost have to be off by default, because it would 
probably get in people's way too much. However, if we did decide to 
allow the user to create semantically identical, but more verbose 
versions of the same thing, we could provide an editor command which 
goes through and makes transformations such as this one.
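
As a rough sketch of what such an editor command might do (Python and 
ElementTree are used only for illustration; the flatten name, the choice 
of plus and times as the operators to merge, and the decision to leave 
any operator carrying attributes such as definitionURL untouched are all 
assumptions, and MathML namespaces are ignored for brevity):

```python
import xml.etree.ElementTree as ET

# Operators assumed here to be associative and n-ary in content MathML.
ASSOCIATIVE = {"plus", "times"}

def is_plain_apply_of(node, op):
    """True if node is <apply><op/>...</apply> and the operator has no attributes."""
    return (node.tag == "apply" and len(node) > 0
            and node[0].tag == op and not node[0].attrib)

def flatten(node):
    """Return a copy of node with nested applies of the same operator merged."""
    children = [flatten(child) for child in node]  # work bottom-up
    out = ET.Element(node.tag, node.attrib)
    out.text = node.text.strip() if node.text else None
    if (node.tag == "apply" and children
            and children[0].tag in ASSOCIATIVE
            and not children[0].attrib):  # skip operators with definitionURL etc.
        op = children[0].tag
        out.append(children[0])
        for child in children[1:]:
            if is_plain_apply_of(child, op):
                out.extend(list(child)[1:])  # splice the inner operands in place
            else:
                out.append(child)
    else:
        out.extend(children)
    return out

nested = ET.fromstring(
    "<apply><eq/><ci>x</ci>"
    "<apply><plus/>"
    "<apply><plus/><apply><plus/><ci>a</ci><ci>b</ci></apply><ci>c</ci></apply>"
    "<ci>d</ci></apply></apply>"
)
flat = flatten(nested)
```

Here flat holds the equation with a single <apply><plus/> whose operands 
are a, b, c and d, in the original order.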

Best regards,
Andrew
Received on Monday, 20 November 2006 04:31:58 UTC