Re: Exploring new vocabularies for HTML from James Graham on 2008-03-31 (public-html@w3.org from March 2008)

From: James Graham <jg307@cam.ac.uk>
Date: Mon, 31 Mar 2008 11:52:48 +0100
To: David Carlisle <davidc@nag.co.uk>
CC: public-html@w3.org, www-math@w3.org
Message-ID: <47F0C280.4050401@cam.ac.uk>

David Carlisle wrote:
>> I'm really uncertain why you think that running an HTML parser to 
>> construct an in-memory representation of the HTML in the same in memory 
>> format as that used for XML is the wrong way to import HTML content into 
>> an application that currently imports only XML.
> 
> The concern is importing mathml content.

If it's MathML-in-text/html, you will need an HTML 5 parser. If the product 
doesn't have one built in, you could use an HTML-to-XHTML converter based on an 
HTML 5 parser and an XML serializer, as Henri previously pointed out.
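
To make the shape of such a converter concrete, here is a minimal sketch in 
Python. It is an illustration under loud assumptions, not a real 
implementation: the standard library's html.parser stands in for a conforming 
HTML 5 parser (it does none of the HTML 5 error recovery and does not put 
MathML elements in the MathML namespace), and the names TreeBuilder, 
VOID_ELEMENTS and html_to_xml are made up for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take an end tag in HTML (a subset, for illustration).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class TreeBuilder(HTMLParser):
    """Builds an ElementTree from HTML. A stand-in for a real HTML 5 parser."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("body")  # hypothetical wrapper element
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; XML needs a string value.
        attributes = {name: value if value is not None else ""
                      for name, value in attrs}
        element = ET.SubElement(self.stack[-1], tag, attributes)
        if tag not in VOID_ELEMENTS:
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Close the nearest open element of this name; ignore stray end tags.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index].tag == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):
            # Text following a child element is stored as that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_xml(source):
    """Parse tag soup into a tree, then serialize the tree as XML."""
    builder = TreeBuilder()
    builder.feed(source)
    builder.close()
    return ET.tostring(builder.root, encoding="unicode")


if __name__ == "__main__":
    # Unclosed <p> and <br> go in; well-formed XML comes out.
    print(html_to_xml("<p>x<sup>2</sup> + 1<br>done"))
```

The point is only the pipeline: the consuming application sees the XML 
output and never has to know that the input was text/html.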

>> <wikimath>
> as I said before I have no objection to wiki syntax (I think it's a
> good thing) but I think it should be restricted to wikis.
> 
> Not everyone needs mathml, if you are just going to write x^2 + 1 in
> html you can now and may in the future, just go
> x<sup>2</sup> + 1

How does that address my concern about the difficulties of authoring a treatise 
on, say, the Maxwell equations in a text editor? As a point of comparison for 
those unfamiliar with just how verbose MathML is, I tried using Itex2MML to 
convert a TeX representation of one of these equations (in integral form) to 
MathML. The result may not be entirely idiomatic MathML but it gives an idea of 
the complexity:

The IteX:

\[ \oint_\text{loop} \mathbf{H} \cdot {d\mathbf{l}} = I_\text{free} + 
\int_\text{surface} \frac{\partial \mathbf{D}}{\partial t} \cdot d\mathbf{s} \]

The equivalent MathML:

<math xmlns='http://www.w3.org/1998/Math/MathML' display='block'>
  <msub>
    <mo>&conint;</mo>
    <mtext>loop</mtext>
  </msub>
  <mstyle fontweight="bold">
    <mrow>
      <mi>H</mi>
    </mrow>
  </mstyle>
  <mo>&sdot;</mo>
  <mrow>
    <mi>d</mi>
    <mstyle fontweight="bold">
      <mrow>
        <mi>l</mi>
      </mrow>
    </mstyle>
  </mrow>
  <mo>=</mo>
  <msub>
    <mi>I</mi>
    <mrow>
      <mtext>free</mtext>
    </mrow>
  </msub>
  <mo>+</mo>
  <msub>
    <mo>&Integral;</mo>
    <mtext>surface</mtext>
  </msub>
  <mfrac>
    <mrow>
      <mo>&PartialD;</mo>
      <mstyle fontweight="bold">
        <mrow>
          <mi>D</mi>
        </mrow>
      </mstyle>
    </mrow>
    <mrow>
      <mo>&PartialD;</mo>
      <mi>t</mi>
    </mrow>
  </mfrac>
  <mo>&sdot;</mo>
  <mi>d</mi>
  <mstyle fontweight="bold">
    <mrow>
      <mi>s</mi>
    </mrow>
  </mstyle>
</math>

> But to simultaneously try to introduce the benefits of using a common
> mathematical markup language, and to remove the necessity of using any
> markup at all just seems completely broken to me.

As far as I can tell, the net effect of using a markup language for serialising 
maths is negative, since it adds an unmanageable amount of verbosity. There are 
certainly advantages to the in-memory representation being tree-like, but that 
can be achieved without sacrificing any notion of a human-readable source.
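
As a rough illustration of that last point, the sketch below turns a compact, 
human-readable source string into a MathML-shaped tree in memory. The compact 
syntax here is a toy invented for this example (single-letter identifiers, 
numbers, the operators + - = and ^ for superscripts); it is neither itex nor 
any proposed wiki syntax, and parse_compact is a hypothetical name.

```python
import re
import xml.etree.ElementTree as ET

# One token: a number, a single-letter identifier, or an operator.
TOKEN = re.compile(
    r"\s*(?:(?P<num>\d+(?:\.\d+)?)|(?P<id>[A-Za-z])|(?P<op>[-+=^]))"
)
TAG_FOR_KIND = {"num": "mn", "id": "mi", "op": "mo"}


def parse_compact(source):
    """Build a MathML-like element tree from a toy compact syntax."""
    source = source.strip()
    math = ET.Element("math")
    row = ET.SubElement(math, "mrow")
    position = 0
    pending_sup = False  # True just after a '^' has been read
    while position < len(source):
        match = TOKEN.match(source, position)
        if match is None:
            raise ValueError(f"unexpected input at offset {position}")
        position = match.end()
        kind = match.lastgroup
        text = match.group(kind)
        if text == "^":
            if pending_sup or len(row) == 0 or row[-1].tag == "mo":
                raise ValueError("'^' needs a base to its left")
            pending_sup = True
            continue
        node = ET.Element(TAG_FOR_KIND[kind])
        node.text = text
        if pending_sup:
            if kind == "op":
                raise ValueError("'^' needs an exponent to its right")
            # Wrap the previous node and this one in <msup>base exponent</msup>.
            base = row[-1]
            row.remove(base)
            sup = ET.SubElement(row, "msup")
            sup.append(base)
            sup.append(node)
            pending_sup = False
        else:
            row.append(node)
    if pending_sup:
        raise ValueError("'^' needs an exponent to its right")
    return math


if __name__ == "__main__":
    source = "x^2 + 1"
    tree = parse_compact(source)
    serialized = ET.tostring(tree, encoding="unicode")
    print(serialized)
    print(len(source), "characters of source,", len(serialized), "of markup")
```

The application still gets its tree; the author only ever types the short 
form.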

-- 
"Eternity's a terrible thought. I mean, where's it all going to end?"
  -- Tom Stoppard, Rosencrantz and Guildenstern are Dead

Received on Monday, 31 March 2008 10:54:08 UTC