Re: Subscripts in Content MathML from General on 2005-10-06 (www-math@w3.org from October 2005)

From: General <maths@mathsonly.com>
Date: Thu, 6 Oct 2005 06:20:48 -0700 (PDT)
To: www-math@w3.org
Message-Id: <20051006062049.9F88808E@dm22.mta.everyone.net>
Dear David, Paul,

The problem with <ci>a_1</ci> is that it would be treated by our algebra engine as an object with name "a_1", and of course this is not strictly true. Rather, the object name is "a", but we are further *qualifying* this symbol identifier with a subscript. Furthermore, all standard XSLT sheets will convert <ci>a_1</ci> into <mi>a_1</mi> rather than <msub><mi>a</mi><mn>1</mn></msub> which is of course preferred.

As I said in my original e-mail, the DTD permits the following combination:

<ci><msub><ci>a</ci><ci>i</ci></msub></ci>    (*)

I've also checked with the following Schema documents, which collectively confirm that this is valid:

http://www.w3.org/Math/XMLSchema/mathml2/presentation/scripts.xml
http://www.w3.org/Math/XMLSchema/mathml2/common/math.xml

Following through, we've got <msub> which can have children of Schema type "Presentation-expr.class", which is declared to be either "ContentExpr" OR "PresentationExpr"... So, it is valid to embed content within presentation, and also presentation within content.

Now, if a processing engine were to evaluate a sum (for instance), then we would get output like this:

<ci><msub><ci>a</ci><cn>1</cn></msub></ci>
<ci><msub><ci>a</ci><cn>2</cn></msub></ci>
...
<ci><msub><ci>a</ci><ci>n</ci></msub></ci>

for a sum of a_i between i=1 and i=n. If the processor can be designed to ignore presentation elements (or any element it doesn't recognise) in the DOM, but still evaluate their children, then it is a simple matter to replace each <ci>i</ci> with the appropriate number (e.g. <cn>1</cn>).
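
As a rough sketch of the substitution just described (not our engine's actual code), the following Python uses the standard xml.etree.ElementTree module. The function names substitute_index and expand_terms are made up for illustration, and the bounds are assumed to be plain integers; a symbolic upper bound such as n would need separate handling.

```python
import copy
import xml.etree.ElementTree as ET


def substitute_index(term, index_name, value):
    """Return a copy of `term` with each leaf <ci>index_name</ci> turned into <cn>value</cn>."""
    result = copy.deepcopy(term)
    # Snapshot the matches first, since tags are rewritten during the loop.
    for node in list(result.iter("ci")):
        # A <ci> that wraps presentation markup has element children and is
        # left alone; iter() still descends through it (and through <msub>).
        if len(node) == 0 and (node.text or "").strip() == index_name:
            node.tag = "cn"
            node.text = str(value)
    return result


def expand_terms(term_xml, index_name, lower, upper):
    """Serialise the summand once for every integer index from lower to upper."""
    term = ET.fromstring(term_xml)
    return [
        ET.tostring(substitute_index(term, index_name, k), encoding="unicode")
        for k in range(lower, upper + 1)
    ]


terms = expand_terms("<ci><msub><ci>a</ci><ci>i</ci></msub></ci>", "i", 1, 3)
for t in terms:
    print(t)
```

The presentation element <msub> is never inspected; only its content descendants are touched, which is the "ignore but still evaluate the children" behaviour.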

Isn't this a good enough solution, both in being valid MathML and in being notationally good semantics?

The only problem is to design a content-to-presentation XSLT sheet which will transform the content in the above expression while leaving the presentation alone. In the case of <ci> containing an <msub>, we also would want to remove the <ci> while retaining the <msub>, and evaluate the contents of the <msub>. Thus (*) becomes:

<msub><mi>a</mi><mi>i</mi></msub>

Here the outer <ci> has been stripped (because it has presentation children), and the <ci>a</ci> and <ci>i</ci> have been transformed to <mi>a</mi> and <mi>i</mi> respectively.
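
The rule can be sketched outside XSLT as well. The Python below is only an illustration under some assumptions: the function name to_presentation and the LEAF_MAP table are hypothetical, only <ci> and <cn> leaves are mapped, and a <ci> with several element children is wrapped in an <mrow> (a choice of mine, not something the discussion above settles).

```python
import xml.etree.ElementTree as ET

# Assumed leaf mapping for this sketch: content token -> presentation token.
LEAF_MAP = {"ci": "mi", "cn": "mn"}


def to_presentation(node):
    """Convert content tokens to presentation tokens, keeping presentation markup."""
    children = list(node)
    if node.tag == "ci" and children:
        # <ci> wrapping presentation markup: drop the wrapper, keep what it holds.
        converted = [to_presentation(c) for c in children]
        if len(converted) == 1:
            return converted[0]
        row = ET.Element("mrow")
        row.extend(converted)
        return row
    if node.tag in LEAF_MAP and not children:
        # Leaf content token, e.g. <ci>a</ci> becomes <mi>a</mi>.
        out = ET.Element(LEAF_MAP[node.tag])
        out.text = node.text
        return out
    # Anything else (e.g. <msub>) is copied as-is and its children are converted.
    out = ET.Element(node.tag, dict(node.attrib))
    out.text = node.text
    out.extend([to_presentation(c) for c in children])
    return out


source = ET.fromstring("<ci><msub><ci>a</ci><ci>i</ci></msub></ci>")
print(ET.tostring(to_presentation(source), encoding="unicode"))
```

Applied to (*), this yields the <msub> form shown above; applied to the expanded terms, <cn>1</cn> becomes <mn>1</mn>.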



>How do you encode content-mathml, btw ?

Sorry, I don't quite understand what you mean by "encode" in this context - "encode" is a little vague, could you elaborate?



>aside of the encoding problem of a_i you have such problems
>as encoding the ellipsis a_1, ..., a_k,

True, the ellipsis did cause us some problems for a short time. What we decided to do was use <csymbol>...</csymbol> as our ellipsis, and to declare a definitionURL which points to an as-yet undecided address. The idea was that an ellipsis is as much content as it is presentation: it represents something elided from the content, upon which operations should still occur. Suppose we take an approximation to the sine series expansion (up to powers of 'n'):

sin[x] = x - x^3/3! + x^5/5! + ... + x^n/n!

Now suppose the algebraic engine is asked to differentiate:

cos[x] = 1 - x^2/2! + x^4/4! + ... + x^(n-1)/(n-1)!

The engine has left the <csymbol>...</csymbol> alone, as it knows that the derivative of <csymbol>...</csymbol> is most likely still another <csymbol>...</csymbol> (it ignores the case where it *might* be 0).
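
To make the behaviour concrete, here is a toy sketch in Python (not our engine). It assumes a series is held as a list of (coefficient, exponent) pairs with a sentinel ELLIPSIS standing in for <csymbol>...</csymbol>; the names ELLIPSIS and differentiate are hypothetical, and the symbolic final term x^n/n! is left out to keep the example short.

```python
from fractions import Fraction

ELLIPSIS = "..."  # stand-in for <csymbol>...</csymbol> in this sketch


def differentiate(terms):
    """Differentiate term by term; an ellipsis is passed through unchanged."""
    result = []
    for term in terms:
        if term == ELLIPSIS:
            # The derivative of the elided terms is assumed to be more elided terms.
            result.append(ELLIPSIS)
            continue
        coeff, exponent = term
        if exponent != 0:
            # d/dx (c * x^k) = (c * k) * x^(k-1); constants vanish.
            result.append((coeff * exponent, exponent - 1))
    return result


# x - x^3/3! + x^5/5! + ...
sine = [(Fraction(1), 1), (Fraction(-1, 6), 3), (Fraction(1, 120), 5), ELLIPSIS]
cosine = differentiate(sine)
print(cosine)
```

The result corresponds to 1 - x^2/2! + x^4/4! + ..., with the ellipsis still in place.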

Take another example; suppose we multiply the sin[x] series above by 0. Clearly anything multiplied by 0 will be 0, so any ellipses which occur will also evaluate to 0, and therefore:

0*sin[x] = 0*x - 0*x^3/3! + 0*x^5/5! + 0*(...) + 0*x^n/n!

In this case 0*(...) = 0, and this property is inherent in the ... symbol: when multiplied by 0 it is always 0, so the engine can replace it with <cn>0</cn> as it pleases.
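
A small sketch of that rule in Python, assuming for illustration that a series is reduced to a list of coefficients with a sentinel ELLIPSIS in place of <csymbol>...</csymbol> (the names ELLIPSIS and scale are hypothetical):

```python
from fractions import Fraction

ELLIPSIS = "..."  # stand-in for <csymbol>...</csymbol> in this sketch


def scale(coefficients, factor):
    """Multiply every coefficient by `factor`; an ellipsis collapses to 0 only when factor is 0."""
    result = []
    for c in coefficients:
        if c == ELLIPSIS:
            # 0 * (...) is always 0; for any other factor the terms stay elided.
            result.append(0 if factor == 0 else ELLIPSIS)
        else:
            result.append(c * factor)
    return result


# Coefficients of x - x^3/3! + x^5/5! + ...
sine_coefficients = [Fraction(1), Fraction(-1, 6), Fraction(1, 120), ELLIPSIS]
print(scale(sine_coefficients, 0))
print(scale(sine_coefficients, 2))
```

With a factor of 0 every entry, including the ellipsis, becomes 0; with any other factor the ellipsis survives untouched.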


Actually, this whole discussion of extending MathML and making the most out of the language has prompted us to find a few more 'holes' to discuss; I'll be posting them shortly.

Look forward to your opinions,
Charles Lyons.

Received on Thursday, 6 October 2005 13:21:20 UTC