Re: MathML entities don't degrade gracefully

On Apr 25, 2008, at 13:59 , David Carlisle wrote:

>> Using the MathML entities in XML requires a doctype, because
>> otherwise the document would be ill-formed.
> Yes and no. The HTML5 spec could state that when processing
> application/xhtml+xml documents that the application should
> (effectively) use a catalog that supplies DTD entity definitions for
> the HTML5 entities (it may make sense to do this regardless of whether
> the "html5 entity set" ends up being the html4 names or html4+mathml
> names).

The HTML 5 spec could indeed specify precisely what kind of entity
resolver needs to be supplied to a vanilla XML 1.0 parser when parsing
application/xhtml+xml without having to fork XML. If we do that, I
suggest standardizing Gecko's catalog of two special DTDs and the
particular public IDs that map to these.
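
As a rough illustration of the resolver approach, here is a minimal
sketch using Python's stock expat bindings. The catalog contents are
assumptions made for the example: the single public ID and the
one-entity DTD are placeholders, not Gecko's actual catalog, and
parse_text is a hypothetical helper name. The parser itself is
unmodified; it is only handed a resolver that answers from a local
table instead of fetching the system ID.

```python
from xml.parsers import expat

# Hypothetical catalog: public ID -> DTD text that only declares entities.
# A real catalog would carry the full entity set; one entry is enough here.
CATALOG = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": '<!ENTITY phi "&#966;">',
}


def parse_text(document):
    """Parse the document and return its concatenated character data."""
    chunks = []
    parser = expat.ParserCreate()
    # Ask expat to report the external DTD subset to our handler.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)

    def resolve(context, base, system_id, public_id):
        dtd = CATALOG.get(public_id)
        if dtd is None:
            return 1  # unknown public ID: skip the external subset
        # Feed the catalog's DTD to a sub-parser; the network is never touched.
        sub = parser.ExternalEntityParserCreate(context)
        sub.Parse(dtd, True)
        return 1

    parser.ExternalEntityRefHandler = resolve
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


WITH_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    "<html><p>&phi;</p></html>"
)
WITHOUT_DOCTYPE = "<html><p>&phi;</p></html>"

resolved = parse_text(WITH_DOCTYPE)  # entity comes from the catalog DTD

try:
    parse_text(WITHOUT_DOCTYPE)
    failure = None
except expat.ExpatError as err:
    # No doctype means no external subset, so the resolver is never asked.
    failure = expat.ErrorString(err.code)
```

With the doctype present, the reference is resolved from the catalog.
Without it, the same parser stops with a well-formedness error,
because nothing ever asks the resolver for a DTD.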

> <!DOCTYPE html>
> <html>
> <p>&phi;</p>
> </html>
> or even just
> <html>
> <p>&phi;</p>
> </html>
> is well formed (but not valid) if the parser is using a catalog that
> says (for example) that any document with document element "html"
> should use a dtd that (just) defines some set of html5 entities.

This, on the other hand, would mean forking XML and creating something  
that's almost XML but not quite--thereby making it incompatible with  
deployed browsers and the existing XML toolchain. If we went that  
route, I think we should do it the right way the first time and have  
only one major discontinuity point. In that case, instead of fixing  
one XML design flaw at a time, we should go all the way to "XML5" on  
the first try, specifying non-Draconian streamable error handling,
adding MathML entities as built-in, removing *all* restrictions on  
what characters can appear in a Name and removing DTDs all in the same  

Henri Sivonen

Received on Friday, 25 April 2008 12:22:19 UTC