# Re: proposed extensions to content MathML

• From: Andreas Strotmann <Strotmann@rrz.uni-koeln.de>
• Date: Tue, 10 May 2005 16:55:13 +0200
• Message-ID: <4280CB51.6000705@rrz.uni-koeln.de>
```
RobertM@dessci.com wrote:
> Hello All.
>
> Kyle Siegrist, who created the Virtual Laboratories in Probability and
> Statistics web site <http://www.math.uah.edu/stat/>, recently
> suggested the following extensions to me, as good candidates for a
> MathML 3 update.  Prof. Siegrist writes:
>
>  "Here is what I would love to see added to Content MathML:
>
>    1. Binomial coefficient
>
>    2. Permutation coefficient:  n(n -1)...(n - k + 1), usually
>    rendered P(n, k) or nPk or (n)k.
>
>    3. A probability operator with an optional "given" construction
>    (for conditional probability).  Typical rendering would be
>       P(A, B, ...) (without conditioning) or  P(A, B, ... | C, D, ...)
>    (with conditioning).
>
>    4. An expected value operator with an optional "given" construction
>    (for conditional expected value).  Typical rendering would be E(A,
>    B, ...) (without conditioning) or  E(A, B, ... | C, D, ...) (with
>    conditioning).
...
>   If I had these extensions, I think that I could do just about
>   everything that I wanted without going over to Presentation MathML.
>
>   Items 3 and 4 (with the "given" construction) are really important in
>   probability, statistics, and stochastic processes; conditional
>   probability and expected value are central notions.  Ordinary
>   probability and expected value can be done with the usual function
>   ("apply") construction, but there is no way to do the conditioning
>   without adding Presentation MathML as a kludge.
...
> Anyone want to second these proposals?  Or take issue with them?

I agree that the "given" construction appears to be central to
statistics, a subject that is, indeed, sometimes covered as an optional
topic in German high school classes.

Like Prof. Siegrist, I do not immediately see how to implement that
concept with MathML-Content qualifiers as they currently stand, but I'm
not sure that I would give up without giving it deeper thought. Here are
a few thoughts.

First of all, I would look at how existing symbolic math systems
(if any) handle symbolic statistics in general, and this construct in
particular. That might give one an idea of how to do this.

Second, I would try to understand what the "given" construct actually
means in this context.  Not being a statistician myself, but having had
to teach the "given" concept to undergraduates once, my impression was
that it is quite a complex concept indeed.  Here is what I understand it
to mean:

- the concept of a statistical variable is actually fundamental, and
quite different from a "normal" variable.  It is my understanding that
such a variable ranges over the class of probability distributions (i.e.
a statistical variable has a distribution as a value).

- P(X,Y,Z) actually is a compound concept, consisting of (X,Y,Z), the
joint probability distribution of the three statistical variables X,Y,
and Z, and the probability measure function, P, which is actually
applied to the joint distribution, not the list of arguments.

- what the "given" construct then does is assign a different "value"
to the statistical variables that are "given" within the scope of the
surrounding parentheses, i.e. when constructing the joint probability
distribution it represents. In other words, a "given" Z gets assigned,
locally, the probability distribution meaning "known to be true" instead
of its original one outside the scope of the parentheses around the
"given" construct.

- it is possible to have Z "given" as false or true.

If this is a correct analysis, then we could come up with a reasonable
suggestion for representing it in MathML-Content:

<apply> <probability/>
  <apply>
    <jointdistribution/>
    <bvar><ci>Z</ci></bvar>
    <condition>
      <apply> <given/> <ci>Z</ci> <true/> </apply>
    </condition>
    <ci>X</ci>
    <ci>Y</ci>
    <ci>Z</ci>
  </apply>
</apply>

and render it as P(X,Y|Z).

This representation makes sure that the local reassignment of value to
the statistical variable (aka binding of the variable) is honored in the
representation, which is usually an important consideration when
creating MathML Content.

There are probably still problems with this particular suggestion, but
it might help understand the problems behind your assertion that "given"
can't be done in MathML.

Hope this helps just a little bit,

-- Andreas

>
> --Robert
>
> ------------------------------------------------------------------
> Dr. Robert Miner                                RobertM@dessci.com
> W3C Math Interest Group co-chair                      651-223-2883
> Design Science, Inc.   "How Science Communicates"   www.dessci.com
> ------------------------------------------------------------------
```
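
As a rough numerical illustration of the "given" analysis in the message above: conditioning on Z can be read as restricting a joint distribution to the outcomes where Z has the given value and then renormalizing. The sketch below assumes three binary variables X, Y, Z and a made-up joint distribution; the function name `given` and all of the weights are hypothetical and are not part of MathML or of the original proposal.

```python
from fractions import Fraction

# Hypothetical joint distribution over (X, Y, Z), each variable True/False.
# The weights are invented for illustration and sum to 1.
joint = {
    (True, True, True): Fraction(1, 8),
    (True, False, True): Fraction(1, 8),
    (False, True, True): Fraction(1, 4),
    (False, False, True): Fraction(0),
    (True, True, False): Fraction(1, 8),
    (True, False, False): Fraction(1, 8),
    (False, True, False): Fraction(1, 8),
    (False, False, False): Fraction(1, 8),
}


def given(joint, index, value):
    """Condition `joint` on the variable at position `index` having `value`.

    Returns the distribution of the remaining variables: outcomes that
    disagree with the condition are dropped, and the rest are rescaled
    by the probability of the condition.
    """
    mass = sum(p for outcome, p in joint.items() if outcome[index] == value)
    if mass == 0:
        raise ValueError("cannot condition on an event of probability zero")
    conditional = {}
    for outcome, p in joint.items():
        if outcome[index] == value:
            rest = outcome[:index] + outcome[index + 1:]
            conditional[rest] = conditional.get(rest, Fraction(0)) + p / mass
    return conditional


# P(X, Y | Z): Z is "given" as true (position 2 in each outcome tuple).
p_xy_given_z = given(joint, 2, True)

# Z can equally be "given" as false.
p_xy_given_not_z = given(joint, 2, False)
```

Using exact fractions keeps the renormalization easy to check by hand. The point of the sketch is only that Z behaves like a locally rebound variable: inside `given` it is fixed to one value, while X and Y keep ranging over their outcomes.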

Received on Tuesday, 10 May 2005 14:55:34 UTC