Re: Supporting MathML and SVG in text/html, and related topics from Henri Sivonen on 2008-04-16 (www-math@w3.org from April 2008)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Wed, 16 Apr 2008 12:14:50 +0300
To: Paul Libbrecht <paul@activemath.org>
Cc: David Carlisle <davidc@nag.co.uk>, jirka@kosek.cz, whatwg@whatwg.org, public-html@w3.org, www-math@w3.org, www-svg@w3.org
Message-Id: <C8365DB0-6EDA-460D-A39C-57C5F7F853F1@iki.fi>

On Apr 16, 2008, at 10:47, Paul Libbrecht wrote:
> I would like to put a grain of salt here and would love HTML5  
> enthusiasts to answer:
>
> why is the whole HTML5 effort not a movement towards a really  
> enhanced parser instead of trying to redefine fully HTML successors?

text/html has immense network effects both from the deployed base of  
text/html content and the deployed base of software that deals with  
text/html. Failing to plug into this existing network would be an  
extremely bad strategy. In fact, the reason why the proportion of Web  
pages that get parsed as XML is negligible is that the XML approach  
totally failed to plug into the existing text/html network effects  
(except for Appendix C, which lacks a migration strategy to actual XML  
and amounts to the emperor's new clothes).

> Being an enhanced parser (that would use a lot of context info to be  
> really hand-author supportive) it would define how to parse better  
> an XHTML 3 page, but also MathML and SVG as it does currently... It  
> has the ability to specify very readable encodings of these pages.
>
> It could serve as a model for many other situations where XML  
> parsing is useful but its strictness bites some.

Anne has been working on XML5, but being able to parse any well-formed  
stream to the same infoset as an XML 1.0 parser and being able to  
parse existing text/html content in a backwards-compatible way are  
mutually conflicting requirements. Hence, XML5 parsing won't be  
suitable for text/html.

> Currently HTML5 defines at the same time parsing and the model and  
> this is what can cause us to expect that XML is getting weaker. I  
> believe that the whole model-definition work of XML is rich, has  
> many libraries, has empowered a lot of great developments and it is  
> a bad idea to drop it instead of enriching it.

The dominant design of non-browser HTML5 parsing libraries is exposing  
the document tree using an XML parser API. The non-browser HTML5  
libraries, therefore, plug into the network of XML libraries. For  
example, Validator.nu's internals operate on SAX events that look like  
SAX events for an XHTML5 document. This allows Validator.nu to use  
libraries written for XML, such as oNVDL and Saxon.
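
To illustrate the general pattern (not Validator.nu's actual code, which
is Java and uses a real HTML5 parser), here is a minimal Python sketch
of an HTML tokenizer driving a SAX ContentHandler. Assumptions: Python's
standard html.parser stands in for an HTML5 parser even though it does
not implement the HTML5 tree construction algorithm, the names HtmlToSax
and html_to_xml are hypothetical, the void element list is abbreviated,
and namespaces are ignored. XMLGenerator plays the role of "a library
written for XML" on the receiving end.

```python
import io
from html.parser import HTMLParser
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl

# Abbreviated list of elements that never take an end tag in HTML.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class HtmlToSax(HTMLParser):
    """Forward html.parser callbacks to any SAX ContentHandler."""

    def __init__(self, handler):
        super().__init__(convert_charrefs=True)
        self.handler = handler

    def handle_starttag(self, tag, attrs):
        # HTML allows valueless attributes; SAX expects string values.
        sax_attrs = AttributesImpl(
            {name: (value if value is not None else "") for name, value in attrs}
        )
        self.handler.startElement(tag, sax_attrs)
        if tag in VOID_ELEMENTS:
            # Close void elements right away so the event stream
            # looks like one produced from well-formed XML.
            self.handler.endElement(tag)

    def handle_endtag(self, tag):
        # Void elements were already closed in handle_starttag.
        if tag not in VOID_ELEMENTS:
            self.handler.endElement(tag)

    def handle_data(self, data):
        self.handler.characters(data)


def html_to_xml(html_text):
    """Serialize HTML-ish input through an XML-only consumer."""
    out = io.StringIO()
    generator = XMLGenerator(out, encoding="utf-8")
    generator.startDocument()
    parser = HtmlToSax(generator)
    parser.feed(html_text)
    parser.close()
    generator.endDocument()
    return out.getvalue()


result = html_to_xml("<p class=note>Hi<br>there &amp; more</p>")
print(result)
```

The consumer only ever sees startElement, endElement, and characters
calls, so it cannot tell whether the events came from text/html or from
an XML parser. Any other ContentHandler could be swapped in for
XMLGenerator without touching the adapter.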

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Wednesday, 16 April 2008 09:16:00 UTC