Re: Exploring new vocabularies for HTML

On Mar 31, 2008, at 11:44, David Carlisle wrote:
>> HTML5 parsers are being developed
>
> An XML application might switch parsers, but is an XSL:FO or DocBook or
> DAISY or ... application going to want to switch in an html parser?

Those would continue to use XML parsers and, therefore, the XML  
serialization of MathML.

If you want to take a formula from text/html and put it into DocBook,  
*in the general case* you will have to run a text/html to XML  
converter (HTML5 parser connected to an XML serializer) *anyway*. It  
makes no sense to burden HTML authors in general with special cases just  
to make it seem as if a text/html to XML converter were not  
needed when reusing text/html content in XML.
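
As a rough illustration of the "parser connected to an XML serializer"
pipeline described above, here is a minimal Python sketch. It is built on
assumptions: Python's standard-library html.parser stands in for a real
HTML5 parser (it does not implement the HTML5 parsing algorithm), the
recovery rule for missing end tags is a simplification invented for this
example, and the names TreeBuildingParser and html_fragment_to_xml are
hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


class TreeBuildingParser(HTMLParser):
    """Lenient stand-in for an HTML5 parser (NOT the HTML5 algorithm)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("root")   # synthetic container element
        self.stack = [self.root]         # currently open elements

    def handle_starttag(self, tag, attrs):
        attrib = {name: (value or "") for name, value in attrs}
        # text/html markup carries no namespace declaration, so add the
        # MathML one when the <math> root is seen.
        if tag == "math" and "xmlns" not in attrib:
            attrib["xmlns"] = MATHML_NS
        self.stack.append(ET.SubElement(self.stack[-1], tag, attrib))

    def handle_endtag(self, tag):
        # Simplified recovery: close everything up to the nearest matching
        # open element; silently ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):                  # text after a closed child
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:                            # text before any child
            parent.text = (parent.text or "") + data


def html_fragment_to_xml(source):
    """Parse leniently, then reserialize the result as well-formed XML."""
    parser = TreeBuildingParser()
    parser.feed(source)
    parser.close()
    return "".join(ET.tostring(el, encoding="unicode") for el in parser.root)


if __name__ == "__main__":
    # The </math> end tag is missing; the output is still well-formed XML.
    print(html_fragment_to_xml("<p>Let <math><mi>x</mi><mo>+</mo><mn>1</mn></p>"))
```

The point of the sketch is only that the XML consumer never sees the
lenient syntax: whatever the author typed, the reserialized output is what
gets pasted into DocBook or any other XML vocabulary.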

The same thing applies for reusing HTML today as XHTML. It will also  
apply to reusing SVG-in-text/html in XML.

>> XML 1.0 uses Draconian error handling. Non-Draconian error handling is
>> a salient feature of text/html. One missing end tag somewhere, and the
>> syntax is no longer XML.
>
> Yes, use in text/html implies some things but it doesn't imply that
> 1+2+3 gets parsed as three elements, and it doesn't imply that a
> fraction with three children gets silently fixed or corrupted to only
> having two children and being displayed as a normal fraction with no
> indication that anything is wrong.

Those are a matter of definition and not necessarily part of the  
essence of text/html (whereas non-Draconian error handling is part of the  
very essence of text/html, particularly when contrasted with XML).
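
The contrast in error handling can be seen in a small Python sketch. It
assumes the standard-library xml.etree.ElementTree as the XML parser and
html.parser as a lenient tokenizer (again, not a conforming HTML5 parser);
the helper names xml_accepts and lenient_start_tags are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# One missing end tag: the <b> element is never closed.
SOURCE = "<p>One <b>missing end tag</p>"


def xml_accepts(source):
    """Return True if the XML parser accepts the input, False otherwise."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        # Draconian handling: any well-formedness error is fatal.
        return False


class TagCollector(HTMLParser):
    """Records start tags; the lenient tokenizer never aborts on tag soup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def lenient_start_tags(source):
    """Return the start tags the lenient tokenizer reports for the input."""
    collector = TagCollector()
    collector.feed(source)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print(xml_accepts(SOURCE))
    print(lenient_start_tags(SOURCE))
```

The XML parser rejects the whole input, while the lenient tokenizer keeps
going and leaves it to the tree builder to decide how to recover.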

I'm very skeptical about MathML tag inference, but I have no illusions  
that MathML-in-text/html or SVG-in-text/html could be reused in XML  
using source copy&paste in the general case without reserializer  
software in between. (If all goes well, I intend to write the needed  
software for that step, so I'm not worried about whether such software  
will exist.)

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Monday, 31 March 2008 08:59:51 UTC