Re: Exploring new vocabularies for HTML

On Mon, Mar 31, 2008 at 5:43 PM, Ian Hickson <> wrote:

> <snip>
> I would imagine that we would go to some lengths to allow "Classic MathML"
> to be pasted into HTML5 and have it work, with a few caveats:
>  * no prefixes on the tag names
>  * only <mspace>, <malignmark>, <maligngroup>, <mglyph>, <none>, and
>   <mprescripts> use the empty element syntax
>  * no DTD internal subset

I think this statement will help to alleviate the great concern the MathML
WG had about MathML in HTML5.  Getting a clear statement like this will
hopefully allow the discussion to be more focused on the open issues.

One of the consequences of the above rule is that content MathML will not be
part of HTML5.  Speaking for myself, I can live with that as that has been
the case for Firefox for years and fits with the idea that users should
supply style sheets or other means to specify how to present the content.

One area that has been the focus of much discussion is semantics, et al.
I strongly recommend those tags be included.  There have been theoretical
arguments that it allows data to be out of sync, but practice has shown that
this is a minor concern at best.

As another data point, Mozilla's implementation of MathML initially left off
semantics -- this caused most MathML to fail in Mozilla, because most MathML
is generated by programs, not by hand, and most programs use it.  Its
omission was an oversight, due to semantics not being listed in the
presentation chapter.  It was added, and now Firefox happily accepts it.

Supporting semantics also means that if content MathML is served and
transformed via XSL, then the XSL can stick the content into the semantics
element so that the information is not lost.  The cost of supporting
semantics is minimal, and I hope you consider it part of "Classic MathML" as
it occurs in the majority of presentation MathML on the web.
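
To illustrate the wrapping step, here is a minimal sketch.  It is written in
Python rather than XSL purely for brevity, and the function name
wrap_in_semantics and the sample expressions are hypothetical.  It assumes
the presentation markup has already been produced by some transform, and it
uses the "MathML-Content" encoding value for annotation-xml.

```python
import xml.etree.ElementTree as ET


def wrap_in_semantics(presentation_xml: str, content_xml: str) -> str:
    """Return a <semantics> element holding both forms of one expression."""
    presentation = ET.fromstring(presentation_xml)
    content = ET.fromstring(content_xml)

    semantics = ET.Element("semantics")
    # The first child is the presentation tree that gets rendered.
    semantics.append(presentation)
    # The original content MathML rides along as an annotation.
    annotation = ET.SubElement(
        semantics, "annotation-xml", {"encoding": "MathML-Content"}
    )
    annotation.append(content)
    return ET.tostring(semantics, encoding="unicode")


# Hypothetical input: a + b in presentation and in content form.
result = wrap_in_semantics(
    "<mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>",
    "<apply><plus/><ci>a</ci><ci>b</ci></apply>",
)
print(result)
```

A renderer that only understands presentation MathML can display the first
child and skip the annotation, while the content form stays in the document.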

> Jacques' comments above have led me to consider a different approach to
> making the MathML-in-text/html syntax easier to write.
> It seems like the most unambiguous option is to focus on making end tags
> optional. This basically consists of defining when an end tag is implied.
> </mn>, </mo>, </mi> could be implied whenever a MathML start tag other
> than <mglyph> or <malignmark> is seen while the appropriate element is on
> the stack of open elements.
> </mfrac> could be implied when a start tag is seen when the element
> already has two children, and similarly with <mroot>, <msub>, etc.
> Almost any MathML close tag could be implied when an <mtr> start tag is
> seen when there's an <mtable> element on the stack but the current element
> isn't an <mtable>.
> So e.g. instead of:
> <math xmlns="">
>  <mi>x</mi> <mo>=</mo>
>  <mfrac>
>  <mrow>
>   <mo>-</mo> <mi>b</mi> <mo>&PlusMinus;</mo>
>   <msqrt>
>    <msup>
>     <mi>b</mi> <mn>2</mn>
>    </msup>
>    <mo>-</mo> <mn>4</mn> <mo>&InvisibleTimes;</mo> <mi>a</mi>
>    <mo>&InvisibleTimes;</mo> <mi>c</mi>
>   </msqrt>
>  </mrow>
>  <mrow>
>   <mn>2</mn> <mo>&InvisibleTimes;</mo> <mi>a</mi>
>  </mrow>
>  </mfrac>
> </math>
> ...we could have:
> <math>
>  <mi>x <mo>=
>  <mfrac>
>  <mrow>
>   <mo>- <mi>b <mo>&PlusMinus;
>   <msqrt>
>    <msup> <mi>b <mn>2
>    <mo>- <mn>4 <mo>&InvisibleTimes; <mi>a <mo>&InvisibleTimes; <mi>c
>   </msqrt>
>  </mrow>
>  <mrow>
>   <mn>2 <mo>&InvisibleTimes; <mi>a
> </math>
If you need any more examples of why parsing math is harder than it might
seem at first blush, let me know.  I know of probably a dozen off the top of
my head and could probably double that without a whole lot of work.
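
To make the quoted end-tag rule concrete, here is a minimal Python sketch of
just one part of it: implying </mi>, </mn>, and </mo> when another start tag
arrives, and closing any still-open elements when an explicit end tag is
seen.  The function name, the regex-based tokenizer, and the handling of
stray end tags are assumptions of this sketch, not part of the proposal, and
the child-counting rule for <mfrac>, <msub>, etc. is deliberately left out.

```python
import re

TOKEN_ELEMENTS = {"mi", "mn", "mo"}
# Elements the quoted message lists as using the empty-element syntax.
EMPTY_ELEMENTS = {"mspace", "malignmark", "maligngroup", "mglyph", "none",
                  "mprescripts"}
# Start tags that may appear inside a token element without closing it.
ALLOWED_IN_TOKEN = {"mglyph", "malignmark"}
TAG = re.compile(r"<(/?)([A-Za-z]+)[^<>]*>")


def add_implied_end_tags(source: str) -> str:
    """Insert the end tags implied by the token-element rule (sketch only)."""
    out = []
    stack = []  # open elements, innermost last
    pos = 0

    def close_top(pending_text: str) -> str:
        # Emit the end tag before any trailing whitespace of the pending text.
        body = pending_text.rstrip()
        out.append(body)
        out.append(f"</{stack.pop()}>")
        return pending_text[len(body):]

    for match in TAG.finditer(source):
        text = source[pos:match.start()]
        pos = match.end()
        is_end = match.group(1) == "/"
        name = match.group(2)

        if is_end:
            if name not in stack:
                out.append(text)  # stray end tag is dropped (assumption)
                continue
            while stack[-1] != name:
                text = close_top(text)  # imply end tags for inner elements
            out.append(text)
            stack.pop()
            out.append(match.group(0))
            continue

        # Start tag: close an open token element unless this tag is allowed.
        if stack and stack[-1] in TOKEN_ELEMENTS and name not in ALLOWED_IN_TOKEN:
            text = close_top(text)
        out.append(text)
        out.append(match.group(0))
        if name not in EMPTY_ELEMENTS and not match.group(0).endswith("/>"):
            stack.append(name)

    # End of input: close whatever is still open.
    text = source[pos:]
    while stack:
        text = close_top(text)
    out.append(text)
    return "".join(out)


print(add_implied_end_tags("<math><mi>x <mo>= <mn>2</math>"))
# prints: <math><mi>x</mi> <mo>=</mo> <mn>2</mn></math>
```

Because this sketch does not count children, a run such as
"<msup> <mi>b <mn>2 <mo>-" from the quoted example still leaves the <mo>
inside the <msup>; handling that needs the additional rules.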

One thing to note in your above example is that you have used two named
entities.  I believe that these have been ruled out for HTML5.  The lack of
such named entities will make it much tougher to hand author math (in any
form) in HTML5.  I use both WYSIWYG and smart text editors to create/edit
MathML, so this is not an issue for me.  However, for those of you who
insist on hand authoring, you should stop and think about how limiting this
will be and whether hand authoring is really going to be very useful to you.

One unfortunate thing about the discussion on hand authoring is that it has
mostly been devoid of facts.  Some *facts* on percentages of hand-authored
vs machine-authored HTML should be part of a reasoned discussion, but sadly
neither side has produced any such facts.  I hope someone can produce those
facts and that if they support one side or the other, the side whose
position they don't support has the integrity to acknowledge their position
is not based on usage.

Neil Soiffer
Senior Scientist
Design Science, Inc.
~ Makers of Equation Editor, MathType, MathPlayer and MathFlow ~

Received on Tuesday, 1 April 2008 04:58:59 UTC