Re: Formal query about WG role and MathML-FAQ from juanrgonzaleza@canonicalscience.com on 2006-03-14 (www-math@w3.org from March 2006)

From: <juanrgonzaleza@canonicalscience.com>
Date: Tue, 14 Mar 2006 08:45:07 -0800 (PST)
To: <www-math@w3.org>
Message-ID: <3148.217.124.88.157.1142354707.squirrel@webmail.canonicalscience.com>
David Carlisle said:
>
>> I find really interesting that XPath language developed by w3C has NOT
>> a XML syntax.
>
> yes there are times when a non xml syntax is good,

Indeed, this is the reason I am also evaluating some non-XML syntax.

> but a syntax that
> uses mixed markup, some XML and some characters is rather hard to handle
> as you can neither pass the entire text node over to an external parser
> (as you would with embedded tex or xpath) nor get the parse tree from
> the xml parse of the document, as you would from an XML syntax such as
> openmath or mathml.
>

1) I think that the machine should adapt to the human and not the inverse.
Therefore, I prefer to complicate the programmer's life (including my own)
rather than the author's.

2) The difficulty is rather relative. It is easier to parse one or two
non-XML tokens in an XML document than to parse an entire non-XML notation
such as IteX or ASCIIMath (one should see the ASCIIMath JavaScript!).

For example, the (verbose but not very complex) template for transforming
^ into <sup/> may be something like

<xsl:template name="ReplaceLaTex">
	<xsl:param name="string"/>
	<xsl:param name="LaTex"/>
	<!-- Split $string at the first occurrence of $LaTex and put <sup/> in between. -->
	<xsl:choose>
		<xsl:when test="contains($string,$LaTex)">
			<textnode>
				<xsl:value-of select="substring-before($string,$LaTex)"/>
			</textnode>
			<sup/>
			<textnode>
				<xsl:value-of select="substring-after($string,$LaTex)"/>
			</textnode>
		</xsl:when>
		<xsl:otherwise>
			<xsl:value-of select="$string"/>
		</xsl:otherwise>
	</xsl:choose>
</xsl:template>

<xsl:template match="CanonMath">
	<xsl:call-template name="ReplaceLaTex">
		<xsl:with-param name="string" select="text()"/>
		<xsl:with-param name="LaTex" select="'^'"/>
	</xsl:call-template>
</xsl:template>

or similar (depending on the final input syntax we choose). The template for
braces may be more complex because XSLT itself must be well-formed XML, but I
think it could still be done via two successive substitutions. In any case,
we could use the special grouping tag I already suggested and <sup/> instead
of ^; this eliminates the need for the above templates.
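
The same substitution can be sketched outside XSLT. The Python below is only
an illustration under assumptions of mine: the function name caret_to_sup is
hypothetical, and the element names textnode and sup are borrowed from the
template above. Unlike that template, it replaces every occurrence of the
marker, not only the first one.

```python
import xml.etree.ElementTree as ET


def caret_to_sup(source: str, marker: str = "^") -> ET.Element:
    """Build <CanonMath> with <textnode> pieces separated by empty <sup/> elements."""
    root = ET.Element("CanonMath")
    pieces = source.split(marker)
    for index, piece in enumerate(pieces):
        if index > 0:
            # Every marker found in the input becomes one empty <sup/> element.
            ET.SubElement(root, "sup")
        if piece:
            # Empty pieces (marker at the start or end) produce no <textnode>.
            ET.SubElement(root, "textnode").text = piece
    return root


if __name__ == "__main__":
    print(ET.tostring(caret_to_sup("E = mc^2"), encoding="unicode"))
```

Braces would need a real grouping step on top of this, which is the part I
expect to be harder.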

As for mixed-content markup, the case (see also the XML tree figures I
added to the website)

<CanonMath>
a+b<fraction/>2
</CanonMath>

can be transformed into pure element content

<CanonMath>
<textnode>a+b</textnode>
<fraction/>
<textnode>2</textnode>
</CanonMath>

via a template as simple as

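<!-- Wrap every non-blank text node; elements such as <fraction/> still
     need a separate identity (copy) template. -->
<xsl:template match="text()[normalize-space()]">
<textnode><xsl:value-of select="normalize-space()"/></textnode>
</xsl:template>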
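
For readers who prefer to see the whole round trip, here is a minimal Python
sketch of the same normalisation using only the standard library. The name
wrap_text_nodes is hypothetical, and the sketch assumes a flat CanonMath
element whose children are all empty operator elements, as in the example
above.

```python
import xml.etree.ElementTree as ET


def wrap_text_nodes(mixed: ET.Element) -> ET.Element:
    """Return a copy of `mixed` in which each non-blank text run is a <textnode> child."""
    pure = ET.Element(mixed.tag, dict(mixed.attrib))

    def add_text(text):
        # Skip runs that are missing or contain only whitespace (e.g. line breaks).
        if text and text.strip():
            ET.SubElement(pure, "textnode").text = text.strip()

    add_text(mixed.text)  # text before the first child element
    for child in mixed:
        # Copy the empty operator element itself, e.g. <fraction/>.
        ET.SubElement(pure, child.tag, dict(child.attrib))
        add_text(child.tail)  # text that follows this child element
    return pure


if __name__ == "__main__":
    source = "<CanonMath>\na+b<fraction/>2\n</CanonMath>"
    print(ET.tostring(wrap_text_nodes(ET.fromstring(source)), encoding="unicode"))
```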

The template for transforming something like “+” into <mo>+</mo> is even
simpler, whereas rearranging something like

<CanonMath>
<textnode>a+b</textnode>
<fraction/>
<textnode>2</textnode>
</CanonMath>

into something like

<CanonMath>
<fraction/>
<textnode>a+b</textnode>
<textnode>2</textnode>
</CanonMath>

can be achieved via XPath axes. You know very well that this content-like
syntax can be "easily" converted into a presentation one.
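
As a rough illustration of both steps (the rearrangement, and then the
conversion to presentation markup), here is a Python sketch. Everything in it
is an assumption of mine for the sake of the example: the function names, the
restriction to one operator with two operands, and the very naive tokeniser
that maps digits to <mn>, single letters to <mi>, and anything else to <mo>.
It uses child positions where an XSLT version would use the sibling axes.

```python
import re
import xml.etree.ElementTree as ET

# Numbers, single letters, or any other non-space character (treated as an operator).
TOKEN = re.compile(r"(\d+(?:\.\d+)?)|([A-Za-z])|(\S)")


def infix_to_prefix(pure: ET.Element) -> ET.Element:
    """Move the single operator element in front of its two <textnode> operands."""
    children = list(pure)
    tags = [child.tag for child in children]
    if len(children) != 3 or tags[0] != "textnode" or tags[2] != "textnode":
        raise ValueError("this sketch expects: textnode, operator, textnode")
    left, operator, right = children
    prefix = ET.Element(pure.tag)
    ET.SubElement(prefix, operator.tag)
    for operand in (left, right):
        ET.SubElement(prefix, "textnode").text = operand.text
    return prefix


def tokens_to_mrow(text: str) -> ET.Element:
    """Tokenise plain text into <mn>, <mi> and <mo> children of an <mrow>."""
    mrow = ET.Element("mrow")
    for number, letter, other in TOKEN.findall(text):
        if number:
            ET.SubElement(mrow, "mn").text = number
        elif letter:
            ET.SubElement(mrow, "mi").text = letter
        else:
            ET.SubElement(mrow, "mo").text = other
    return mrow


def prefix_to_presentation(prefix: ET.Element) -> ET.Element:
    """Render <fraction/> followed by two operands as <mfrac>."""
    operator, numerator, denominator = list(prefix)
    if operator.tag != "fraction":
        raise ValueError("only <fraction/> is handled in this sketch")
    mfrac = ET.Element("mfrac")
    mfrac.append(tokens_to_mrow(numerator.text))
    mfrac.append(tokens_to_mrow(denominator.text))
    return mfrac
```

Feeding it the normalised a+b<fraction/>2 example gives an <mfrac> whose
first <mrow> holds a, + and b and whose second holds 2.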

Et cetera

Moreover, there are situations where ASCIIMath, IteX, or other TeX/LaTeX
syntaxes are not sufficient. If I remember correctly, NAG has had many
difficulties translating TeX/LaTeX documents into presentation MathML
(especially with hand-written LaTeX mathematics).

We would not find that kind of problem with a more "rational" input.

> Your proposal appears to be essentially a variant of content mathml with
> a more infix syntax and allowing more operators with different
> presentation but the same semantics.

It could be seen that way. Many input syntaxes I know cover only the
presentation part of MathML. Instead of searching for two input syntaxes
(which was my first problem), I think that one single syntax could be
transformed into presentation or content markup when needed. I discussed
that on the website.

> It is also important to address the issues that presentation mathml aims
> to address, namely the ability to express the layout of mathematical
> expressions _without_ requiring the specification of the semantics
> (either because the semantics are unknown or too hard to express
> formally (in a given amount of time) or there are no semantics, for
> example educational examples of incorrect notation.  I can write
> <msup><mi>H</mi><mn>2</mn></msup> without having to specify anywhere
> what cohomology is.

Hmm... the lack of semantic content in presentation MathML is the root of
the main problems around that technology, e.g. the difficulties with
accessibility, which is one of the main pillars of the W3C.

Moreover, I think that you are missing the overall point. I am proposing a
semantics-oriented input markup, but the semantic part is optional.

Take the case of powers and superscripts as an illustration.

A theoretical chemist writing a research article could write

<CanonMath>
E = mc<power/>2
</CanonMath>

which adds semantic content to the markup, but an undergraduate student
doing homework could write

<CanonMath>
E = mc<sup/>2
</CanonMath>

which is just like MathML <msup></msup>.
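
The difference between the two inputs can be sketched in Python as follows.
This is a simplified illustration, not a specification: the function names
are hypothetical, and I assume that all the text before the marker is a
single identifier and all the text after it is a number. Both markers give
the same presentation output, but only <power/> says enough to produce
content markup.

```python
import xml.etree.ElementTree as ET


def split_script(source: str):
    """Return (base, marker tag, script) for input such as 'c<sup/>2'."""
    root = ET.fromstring(f"<CanonMath>{source}</CanonMath>")
    marker = root[0]
    return (root.text or "").strip(), marker.tag, (marker.tail or "").strip()


def to_presentation(source: str) -> ET.Element:
    """Both <sup/> and <power/> are drawn the same way, as <msup>."""
    base, _tag, script = split_script(source)
    msup = ET.Element("msup")
    ET.SubElement(msup, "mi").text = base
    ET.SubElement(msup, "mn").text = script
    return msup


def to_content(source: str):
    """Only <power/> carries enough meaning to produce content markup."""
    base, tag, script = split_script(source)
    if tag != "power":
        return None  # <sup/> says nothing about what the superscript means
    apply = ET.Element("apply")
    ET.SubElement(apply, "power")
    ET.SubElement(apply, "ci").text = base
    ET.SubElement(apply, "cn").text = script
    return apply
```

So to_content("c<sup/>2") returns None, while to_content("c<power/>2")
returns an <apply> element with <power/>, <ci>c</ci> and <cn>2</cn>.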

Your case above I could simply write as

<CanonMath>H<sup/>2</CanonMath>

and then it is unknown what I am writing: a Hamiltonian? A quantum operator?
The related matrix? Enthalpy? Entropy? Something else? And what about the
"2"? Is it a power? The second component of some vector H? Etc. In cases
where the semantics is unknown or very difficult to write, one can simply
write the "presentational" markup I have illustrated above.

Which is more natural: to use three syntaxes (presentation MathML, Content
MathML, and the Maple one for computation), or to use a single unified
syntax that could be used in all three situations?

Moreover, according to the MathML WG, the MathML language is designed as a
kind of low-level language like PostScript, whereas I am working on a
high-level language that can be authored by hand. Therefore I cannot really
understand some of the criticism and replies I am receiving.

>
> David

Juan R.

Center for CANONICAL |SCIENCE)
Received on Tuesday, 14 March 2006 16:45:33 UTC