HTML OK in <mtext> but not in <annotation-xml>, etc. [was: HTML5 with MathML has problem with numerical attribute values]

Joe Java <cop3252@yahoo.com>, 2010-06-11 23:02 -0700:

> Thank you for the changes you made to the validator.nu backend at
> http://qa-dev.w3.org:8888/ 
> 
> I think everything is working correctly with the validator now.
> 
> I got all the pages to compile OK except for
> https://eyeasme.com/Joe/MathML/HTML5/basics.html
> and
> https://eyeasme.com/Joe/MathML/HTML5/extras.html
> 
> All the other pages required changes in them to compile cleanly.
> 
> The two pages that do not validate use the "semantics" tag in a
> non-validating manner.  The validator correctly points out the
> incorrect use of the tag, and a careful reading of the MathML specs
> confirms the validator's results.

I think you can get those pages to validate if you replace the
instances of <semantics><annotation-xml> with just <mtext>; e.g.:

  <mtext>
    <img xmlns="http://www.w3.org/1999/xhtml"
    src="http://www.mozilla.org/images/logo-star.gif" alt="star logo" />
  </mtext>

The document-conformance rules for where HTML elements are allowed
to occur in MathML-in-HTML content are currently not specified;
but they will eventually be added to the HTML5 spec. I think the
rules will at a minimum allow HTML “phrasing content” elements
within the MathML <mtext> element, so that is what I have
experimentally implemented on http://qa-dev.w3.org:8888/ for now.
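
As a rough sketch of what I mean (the particular elements and the
file name here are just made up for illustration, and the rules
could still change), markup along these lines is the kind of thing
the experimental checking is meant to allow:

  <math>
    <mi>x</mi>
    <mo>=</mo>
    <mtext>
      about <em>three</em> stars
      <img src="star.gif" alt="star logo">
    </mtext>
  </math>

Here <em> and <img> are both HTML phrasing content, and both occur
only inside <mtext>.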

It is possible that the rules may end up being a bit more liberal;
for example, they might also allow HTML phrasing content within the
MathML <mi>, <mn>, <mo> and <ms> elements. If you have some
thoughts on what the rules should be, I encourage you to post them
as comments to this bug:

  http://www.w3.org/Bugs/Public/show_bug.cgi?id=9859

Note that a big problem with using the <annotation-xml> element in
your .html (text/html) documents is this: The HTML5 parsing
algorithm (which the validator.nu backend uses) currently does not
expect to find HTML content within the <annotation-xml> element.
It instead expects only MathML or SVG content within
<annotation-xml>.

When a conformant HTML5 parser finds HTML content within an
<annotation-xml> element (e.g., in the places in your existing
document where there are <img> elements), that basically puts the
parser into an error state that it doesn’t recover from well.
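
For illustration only (I am guessing at the general shape here
rather than quoting your pages, and the file name is made up), the
problem case looks something like this:

  <math>
    <semantics>
      <mi>x</mi>
      <annotation-xml>
        <img src="star.gif" alt="star logo">
      </annotation-xml>
    </semantics>
  </math>

The <img> start tag is what the parser is not prepared to see at
that point.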

That is arguably a bug/deficiency in the HTML spec that should be
corrected. It’s not one I personally feel strongly about either
way, but I think we need some kind of definitive resolution on it,
so I’ve gone ahead and filed a spec bug for it:

  http://www.w3.org/Bugs/Public/show_bug.cgi?id=9887

> I know of no errors in the validator at http://qa-dev.w3.org:8888/ 
> with regard to HTML5 and MathML.
> 
> The production W3C validator should be updated with these changes.

It will probably be a few weeks yet before I push the changes from 
http://qa-dev.w3.org:8888/ to the production HTML5 facet of the
W3C validator. When I do, I’ll definitely try to remember to let
you know.

In the meantime, if you have more questions or suggestions
related to how the validator should handle MathML content, please
do let me know.

  --Mike

-- 
Michael(tm) Smith
http://people.w3.org/mike

Received on Monday, 14 June 2010 01:42:57 UTC