on variables in MathML from Andreas Strotmann on 1997-09-25 (www-math@w3.org from September 1997)

From: Andreas Strotmann <Strotmann@rrz.Uni-Koeln.DE>
Date: Thu, 25 Sep 1997 20:49:18 +0200
To: www-math@w3.org
Cc: strotmann@rs3.rrz.Uni-Koeln.DE
Message-Id: <9709252049.ZM26945@rs3.rrz.Uni-Koeln.DE>
Hi,

 I like the MathML proposal very much.  I'm particularly glad to note the
clean distinction between presentation and content markup observed in the
proposal.  The following proposals are concerned only with content markup
(excepting the postscriptum, which gives a reference to a Unicode font glyph
source).

 I would like to make a point regarding the BVAR element that was
discussed at considerable length during the Dublin OpenMath workshop late
last year.  It was while discussing the representation of an integral that
this point came up, so it is in this context that I would like to
reiterate the point made there.  (Please forgive me if it has been
discussed and perhaps dismissed in the WWW-Math group before.  I did
mention planning to write about this to an active member of the mailing
list, and he didn't warn me off.)

 The MathML draft proposes the following representation for a definite
integral:

  <INT>
    <LOWLIMIT> <MI> 0 </MI> </LOWLIMIT>
    <UPLIMIT>  <MI> y </MI>  </UPLIMIT>
    <EXPR>     <MI> x </MI>     </EXPR>
    <BVAR>     <MI> x </MI>     </BVAR>
  </INT>

 Now, consider the (semantic) scope of x and y here.  Clearly, the scope
of the BVAR x is the <EXPR>...</EXPR> sub-expression, but the LOW/UP-LIMIT
sub-expressions are _outside_ its scope. (Try replacing x by y and vice
versa, and then imagine an unchanged \forall y surrounding it way up the
hierarchy).
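
 In conventional notation, one way to read this experiment is sketched
below.  The renamed variable t and the outer quantifier are illustrative
assumptions, not part of the draft.

```latex
% Renaming the bound variable leaves the value unchanged;
% the upper limit y is free in the integral in both forms.
\int_0^y x \, dx \;=\; \int_0^y t \, dt \;=\; \frac{y^2}{2}

% Renaming x to y: the y in the integrand is bound by dy,
% while the y in the upper limit is bound by the outer quantifier.
\forall y : \quad \int_0^y y \, dy \;=\; \frac{y^2}{2}
```

 The two occurrences of y in the second line are different variables, which
is only consistent if the limits lie outside the scope of the BVAR.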

 According to the draft introduction, a guiding principle of the
representation chosen for the content markup is that "logical braces"
should correspond to sub-expression formation.  This principle appears to
correspond to the "compositionality principle" of formal semantics in
linguistics.

 I would like to argue, as was successfully argued at the OpenMath meeting
mentioned before, that the representation of the integral as proposed in
the draft violates the "logical bracing" criterion -- a criterion that I
believe to be of fundamental importance for any content markup.

 More generally, I would like to argue that _scoping_ is a fundamentally
important criterion for determining "proper" "logical braces".  The MathML
draft recognizes this explicitly for operator scoping; I would like to
propose that this be recognized as well for variable scoping.

 In the case of the INT example, the following two possible solutions
to this problem come to mind:

a)
     <INT>
       <EXPR> <MI> x </MI> <BVAR><MI> x </MI></BVAR> </EXPR>
       <LOWLIMIT> 0 </LOWLIMIT>
       <UPLIMIT> <MI> x </MI> </UPLIMIT>
     </INT>

b)
    <EXPR>
       <EXPR>
         <INT/> <MI>x</MI> <BVAR><MI>x</MI></BVAR>
       </EXPR>
       <LOWLIMIT/> 0
       <UPLIMIT/> <MI> x </MI>
    </EXPR>

(Similar comments apply, of course, to the SUM and PRODUCT elements, and
in "advanced" MathML to a host of other operators.)

 Solution a) is equivalent to the OpenMath 1.0 solution adopted at the
aforementioned meeting, except that BVAR is called LAMBDA there for rather
obvious reasons.

 In solution b), INT becomes a normal operator, as do LOWLIMIT and
UPLIMIT, both of which thus become available for arbitrary meaningful
combinations.

 Both solutions share a common fundamental property:

   A BVAR's logical scope is its immediately surrounding expression;
   i.e. the scope of the variable(s) listed in the BVAR sub-expression is
   the sub-expression immediately enclosing the BVAR sub-expression.

 This -- or a similar -- property of the BVAR element I would like to
propose to add to the MathML draft for content markup.  It is my feeling
that it is very important to do so if we're truly talking about "content
markup".  Historical precedent such as lambda calculus or the constructive
proofs of Goedel's theorems show that semantic representations of maths
profit from a syntactic structure that mirrors the semantic structure, and
from simple, universal, unequivocal scoping rules for both operators and
variables.
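
 The proposed scoping rule is simple enough to state as a short program.
The following is a minimal sketch in Python, not anything taken from the
draft: the Node class, the helper names (mi, bound_here, free_vars) and the
use of an MN tag for the number are all assumptions made for illustration.
It computes the free identifiers of a markup tree under the rule that a
BVAR binds within its immediately enclosing element, and compares solution
a) with the draft's flat layout when the upper limit happens to be x.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node standing in for one markup element."""
    tag: str                                      # e.g. "INT", "EXPR", "BVAR", "MI"
    children: list = field(default_factory=list)
    text: str = ""                                # identifier name for MI leaves


def mi(name):
    """Build an identifier leaf."""
    return Node("MI", text=name)


def bound_here(node):
    """Names listed in BVAR children sitting directly under this node."""
    names = set()
    for child in node.children:
        if child.tag == "BVAR":
            names |= {leaf.text for leaf in child.children if leaf.tag == "MI"}
    return names


def free_vars(node):
    """Free identifiers, assuming a BVAR binds within its enclosing element."""
    if node.tag == "MI":
        return {node.text}
    free = set()
    for child in node.children:
        if child.tag != "BVAR":
            free |= free_vars(child)
    return free - bound_here(node)


# Solution a): the BVAR sits inside EXPR, so the limits are outside its scope.
solution_a = Node("INT", [
    Node("EXPR", [mi("x"), Node("BVAR", [mi("x")])]),
    Node("LOWLIMIT", [Node("MN", text="0")]),
    Node("UPLIMIT", [mi("x")]),
])

# Flat layout as in the draft: the BVAR is a sibling of the limits.
flat_form = Node("INT", [
    Node("LOWLIMIT", [Node("MN", text="0")]),
    Node("UPLIMIT", [mi("x")]),
    Node("EXPR", [mi("x")]),
    Node("BVAR", [mi("x")]),
])

print(sorted(free_vars(solution_a)))  # ['x']
print(sorted(free_vars(flat_form)))   # []
```

 With the nested layout the x in the upper limit stays free, as intended.
With the flat layout the same uniform rule would capture it, which is why
the flat layout needs a special-case scoping rule for each such operator.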


"Corollaries"

 A few more suggestions come to mind in this context:

* add <VAR> and <CONST> containers, as the distinction between variables,
  constants, and operators (<FN>) is of fundamental importance.

* allow <BVAR> x <SEP/> y </BVAR>

* the following is often found in German high school text books, I
  believe:

    \forall x\in \R : \sin^2 x + \cos^2 x = 1

  Proposal: <EXPR><FORALL/>
                  <BVAR><VAR>x</VAR> <IN/> <CONST><REALS/></CONST></BVAR>
                  <EXPR> ... </EXPR>
            </EXPR>

* allow <BVAR> <MSUB><MI>x</MI> 1</MSUB></BVAR> (subscripted variable)

* use BVAR in the markup for <SET> .. <ST/> .. </SET>, <LIMIT>,
  i.e. everywhere a variable is bound by an operator.
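
 The domain-qualified quantifier in the list above has a direct analogue
in a bounded quantifier function.  The following Python sketch is purely
illustrative: the function name forall and the finite list of sample points
standing in for the reals are assumptions, and a numerical check on samples
is of course not a proof.

```python
import math


def forall(domain, predicate):
    """Bounded quantifier: the bound variable exists only inside `predicate`."""
    return all(predicate(x) for x in domain)


# Hypothetical finite sample standing in for the reals.
samples = [k / 10.0 for k in range(-50, 51)]

identity_holds = forall(
    samples,
    lambda x: math.isclose(math.sin(x) ** 2 + math.cos(x) ** 2, 1.0, abs_tol=1e-12),
)
print(identity_holds)
```

 The lambda plays the role of the BVAR: the name x and its domain are given
together, and the body is the only place where x is visible.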


Miscellaneous

 * I'm missing <elements> for named constants: e, pi, i, true, false,
   and for domains: reals, natural and whole numbers, etc.

 * symbols commonly used at the high school level in Germany that I missed
   are those for:
   function composition, Cartesian product, choose operator, quantifiers

 * the prime operator (f') might be represented like this:
   <DIFF> <FN>f</FN></DIFF>

 * the following notation is common, and the concept it represents should
   be present in MathML:

     f: R -> R
        x |-->  f(x) := x^2

[In all of these I argue that these are common at the high school level,
and certainly in first-semester college maths, in Germany.  I thus also
propose to eliminate the reference to "American" high schools and replace
it with "high schools in most countries".]

 * And finally, may I suggest that the following be valid MathML:

   <SEMANTICS> <B>Four colors suffice!</B>
              <EXPR> <FORALL/> ... </EXPR>  </SEMANTICS>


Sorry for the long message...

Regards,     Andreas

PS: For chapter 6 of the MathML draft it may be of interest to note that
the Unicode server (www.unicode.org) now has a fairly complete repertoire
of Unicode glyphs online.  Thus, in addition to the Unicode code point it
is now possible to include a link to the corresponding glyph on the
unicode server.  For example, &ltdot; (U+22D6) is located at
http://www.unicode.org/Unicode.charts/Small.Glyphs/22/U+22D6.gif

-- 
Andreas Strotmann       / ~~~~~~ \________________A.Strotmann@Uni-Koeln.DE
Universitaet zu Koeln  /| University of Cologne   \
Regionales Rechenzentrum| Regional Computer Center \
Robert-Koch-Str. 10    /|    Tel: +49-221-478-5524 |\   Home: -221-4200663
D-50931  Koeln        __|__  FAX: +49-221-478-5590 |__________~~~~~~~~~~~~
Received on Thursday, 25 September 1997 14:49:37 UTC