bvar, codition, and domainofapplication from Andreas Strotmann on 2003-05-13 (www-math@w3.org from May 2003)

From: Andreas Strotmann <Strotmann@rrz.uni-koeln.de>
Date: Tue, 13 May 2003 19:11:36 +0200
To: Andreas Strotmann <Strotmann@rrz.uni-koeln.de>, www-math@w3.org
Message-ID: <3EC12748.4010308@rrz.uni-koeln.de>
Hello,

even if these comments may be a bit too late, I feel it would be useful 
to supplement the recommendations I made during the last few days with 
this much more fundamental observation on the use of the main 
MathML-Content qualifiers, as an explanation of the rationale behind 
many of my recommendations.

I'm sorry that this turned out to be such a long comment, but the matter 
feels important to me -- important to get it right in the second 
edition, that is.  I hope this helps!

This of course represents my personal analysis, and though I'm pretty 
confident, I'm not 100% sure I've got it right.  Comments are therefore 
more than welcome.

  -- Andreas

============= The comments =======================

  There appears to be some confusion still about how to use the new 
domainofapplication element.  In some places it is clearly marked as a 
qualifier that generalizes the interval qualifier to general sets, but 
in some places (such as in this appendix) it appears to be treated as a 
regular constructor, thus generalizing the interval *constructor*, which 
then brings up what its definition might be.

  It is evident to me that in the minds of those who came up with this 
important addition to MathML, this element was meant to be used 
exclusively as a qualifier (and emphatically not as a constructor) to 
generalize the interval *qualifier* (and in my stated opinion in fact to 
supercede this dangerous dual use of the interval element), and to 
factor out the use of the condition qualifier that allows it to contain 
a set, not just a predicate.

  In the comments that I have been submitting during the past week or 
so, I have therefore assumed that the intention implicit in the choice 
of qualifiers available for general use is as follows:

     bvar - there may be zero(!) or more of these anywhere; they denote
            bound variables in a particular order.

     uplimit, lowlimit, interval, and domainofapplication - these all
            represent sets that restrict the domain of a variable. Thus,
            only the limits (one or both) *or* interval *or*
            domainofapplication may appear in a sensible apply.

     condition - this one is meant to represent a predicate over the
            bound variables of the surrounding element which acts as a
            filter defining the set of combinations of bvar values
            that the operator ranges over.

  Since domainofapplication is new, condition, interval, and the two 
limits covered its use in earlier versions, which leads to a need to 
deprecate the 'double entendre' usage of these older elements that are 
more clearly represented by domainofapplication.  I therefore recommend 
deprecating the following in order to clean up the spec:

      - the use of a condition qualifier that contains the representation
        of a set rather than a predicate -- markup that does this should
        simply replace the condition with the domainofapplication
        qualifier.

      - the interval qualifier (but not the constructor) -- markup that
        uses the interval qualifier should replace it with a
        domainofapplication qualifier that contains an interval
        constructor element with the same content as the original
        interval qualifier instead.

        (Incidentally, this deprecation also makes it easier again to
        understand how an interval *constructor* would need to be
        allowed to contain a single child element, a condition, because
        doing the same thing with an interval *qualifier* seemed rather
        spurious because condition should be used instead.)

With these changes in place, we have a clean separation between the uses 
of the condition and the domainofapplication qualifiers, as follows:

   -  the condition contains a predicate over all the bvars, and may in
      principle be used with an empty set of bvars (in which case it
      becomes a predicate over the free variables).  The predicate is
      in effect the characteristic predicate for a domainofapplication.

   -  the domainofapplication qualifier, by contrast, contains a set.
      In the current definition, there are two cases that need to be
      distinguished:

      a) the signature   op: (bvar,dom-of-app,any) -> any
      b) the signature   op: (dom-of-app,function) -> any

      Of these, a) means the same as

         op(bvar, condition(bvar \in dom-of-app), arg)

      while b) is equivalent to

         op(newvars, dom-of-app, apply(function, newvars))

      Note that in the latter case, the domain of application may
      actually be a cartesian product in order to assign ranges to
      several arguments of the argument function simultaneously.

      This is currently not true in a), but it could be specified
      in the MathML spec that domainofapplication should contain a
      matching-arity cartesian product in the case of multiple
      bvars with one domainofapplication (although the fact that it
      should contain a cartesian product semantically does not mean
      that it would necessarily contain one syntactically -- but if
      it does, its meaning is clear).

      Note also that a) means the same as

        op(dom-of-app,lambda(bvar,arg))

      while  a signature of  op(bvars, condition, arg) is equivalent to

        op(
          dom-of-app(set(bvars,condition,cartesianproduct(bvars)),
          lambda(bvars,arg)
         )

      and a signature like   op(bvars,arg)   means the same as

         op(lambda(bvars,arg))

For the signatures in appendix C (and the corresponding text in the main 
body), all of this translates to a few general rules:

  - lowlimit, uplimit, (deprecated interval), and domainofapplication
    belong in one and the same group of qualifiers, the most general
    of which is domainofapplication.  They are characterized by not
    specifying the names of the variables that they constrain.

  - the second group of qualifiers is the condition qualifier in its
    reduced definition that requires a predicate argument and disallows
    a set argument.  It is characterized by specifically naming the
    variables it constrains.

  - thus, only one of these groups of qualifiers may be used in an
    element, but not both.

  - in cases where operators usually take bvar qualifiers, as in
    int, diff, product, sum, exists, forall, lack of bvars on first
    sight would automatically mean that the single argument must be a
    function, as in apply(int,f) and in apply(diff,f).
    (Incidentally, this rule does not apply to the lambda constructor
    since it does not have an operator.)

    But in the case of n-ary constructors, this is problematic (this
    includes the set and matrix constructors and the min and max
    operators, all of which have versions without any qualifiers
    with an n-ary regular function interpretation, and a version with
    qualifiers interpreted like int and sum or lambda.)

    The problem is that in the current MathML spec, operators like min
    and max (and there are plenty of those) are implicitly 'lifted' to
    their 'big' version (like \wedge -> \bigwedge , i.e. or -> exists
    in a well known notation, or max -> sup):

      'small' signature (A*) -> B

       'big' signature  (function) -> B
                        (domainofapplication, function) -> B
                        (bvar,A) -> B
                        (bvar,domainofapplication,A) -> B
                        (bvar,condition,A) -> B

    In the future, a general-purpose piece of markup for this lifting
    could be useful, so that one could say, for example

        apply(big(max),my_real_fn)

    instead of

        apply(max,my_real_fn)

    which clearly doesn't do the same thing.

    Fortunately, by supplying an empty domainofapplication, this
    semantic lifting can be implemented in MathML in a systematic way
    without introducing the 'big' operator:

        apply(or,domainofapplication(),my_predicate_fn)

     can be defined consistently in MathML to have the required meaning.
     In other words, we can reconcile the 'big' and 'small' signatures
     into a single one by providing the following combined signature:

       'combi' signature  (A*) -> B
                          (domainofapplication,function) -> B
                          (bvar,A) -> B
                          (bvar,domainofapplication,A) -> B
                          (bvar,condition,A) -> B

       Replacing the 'function' with its signature, (A -> B), we get

       'combi' signature  (A*) -> B
                          (domainofapplication,(A->B)) -> B
                          (bvar,A) -> B
                          (bvar,domainofapplication,A) -> B
                          (bvar,condition,A) -> B

      which is unfortunately a bit different from the

          'big' signature ((A->B)) -> B
                          (domainofapplication,(A->B)) -> B
                          (bvar,A) -> B
                          (bvar,domainofapplication,A) -> B
                          (bvar,condition,A) -> B

       From the perspective of automatic, semantically correct processing
       of MathML-Content trees, the crucial question is how to reliably
       distinguish these two cases when all you have is a user-defined
       symbol as an operator (say).  Which of these classes does it
       belong to?  How do you interpret  op(a)  compared with
       op(bvar(x),b)  safely when you cannot know the signature?

       The practical solution is to always assume the 'combi' signature.
       This means that apply(product,a) is interpreted the same as
       apply(plus,a), and that  apply(plus,bvar(x),f)  is interpreted the
       same as apply(product,bvar(x),f), etc.

       This means that the indefinite integration and differentiation
       operators are the only ones that violate this pattern; user-
       defined operators of this sort should always be used with a
       possibly empty domainofapplication qualifier when they want to
       express application to a function (as in 'prime'):  the completely
       qualifier-free use of my_diff_op as in apply(my_diff_op,f) does
       *not* encode  f^\myprime.  Instead, I would have to use
       apply(my_diff_op,domainofapplication(),f) to get the desired
       meaning, unless one is certain that one never would want to
       use it with a bvar qualifier.  (Note that it is not necessary
       to change the signatures of the existing MathML operators, because
       for these, the signatures do exist.)

       This also means that the quantifiers, and the product and sum
       operators *do* have a reasonable meaning when used without any
       qualifiers, and can be assigned the appropriate signature copied
       from the appropriate logical connectives or from times and plus,
       resp.

       Conversely, it means that all n-ary operators have a reasonable
       meaning when used with these qualifiers, and their signatures
       in appendix C should reflect that fact.

       Last not least, it means that apply can be given a general
       signature, too:

         apply:   ((any*->any), any*)                       -> any
                  (((any->any)->any),dom-of-app,(any->any)) -> any
                  (((any->any)->any),bvar,any)              -> any
                  (((any->any)->any),bvar,dom-of-app,any)   -> any
                  (((any->any)->any),bvar,condition,any)    -> any

        where

          dom-of-app:: domainofapplication|uplimit|lowlimit
                          |lowlimit,uplimit

        (leaving out the questionable interval qualifier).

        Still, it might be useful to provide a type attribute
        in MathML to signal that a csymbol is meant to follow the
        pattern of the int and diff operators rather that of the
        max and min operators, just to be on the safe side.
Received on Tuesday, 13 May 2003 13:11:40 UTC