- From: Paul Libbrecht <paul@activemath.org>
- Date: Wed, 16 Apr 2008 11:58:11 +0200
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: David Carlisle <davidc@nag.co.uk>, jirka@kosek.cz, whatwg@whatwg.org, public-html@w3.org, www-math@w3.org, www-svg@w3.org
- Message-Id: <0EDFDAEB-F96B-4CEF-855C-B9E1371CEE82@activemath.org>
On 16 Apr 08 at 11:14, Henri Sivonen wrote:

> On Apr 16, 2008, at 10:47, Paul Libbrecht wrote:
>
>> Why is the whole HTML5 effort not a movement towards a really
>> enhanced parser, instead of trying to fully redefine HTML
>> successors?
>
> text/html has immense network effects, both from the deployed base
> of text/html content and from the deployed base of software that
> deals with text/html. Failing to plug into this existing network
> would be extremely bad strategy.

I'm not saying that it should fail, nor that an enhanced parser should not care about that; it certainly should.

> In fact, the reason why the proportion of Web pages that get parsed
> as XML is negligible is that the XML approach totally failed to
> plug into the existing text/html network effects[...]

My hypothesis is that this is mostly a parsing problem, not a model problem; HTML5 mixes the two. There are tools today that convert a great many text/html pages (whose compliance is user-defined as "it works in my browser") to an XML stream; NekoHTML is one of them. The goal would be to formalize this parsing, and just this parsing.

>> Being an enhanced parser (one that would use a lot of context
>> information to be really supportive of hand-authors), it would
>> define how to better parse an XHTML 3 page, but also MathML and
>> SVG, as it does currently... It has the ability to specify very
>> readable encodings of these pages.
>>
>> It could serve as a model for many other situations where XML
>> parsing is useful but its strictness bites some.
>
> Anne has been working on XML5, but being able to parse any
> well-formed stream to the same infoset as an XML 1.0 parser and
> being able to parse existing text/html content in a
> backwards-compatible way are mutually conflicting requirements.
> Hence, XML5 parsing won't be suitable for text/html.

I think it should be possible to minimize the conflicts if such parsing is well contextualized.
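[The "tag soup to XML stream" conversion Paul mentions (as done by tools like NekoHTML) can be sketched as follows. This is a minimal illustration using Python's lenient stdlib `html.parser`, not an HTML5-compliant parser; the class and function names are hypothetical, and a real tool applies far richer error-recovery rules.]

```python
# Sketch: turn "tag soup" into a well-formed XML serialization.
# NOT HTML5-compliant recovery -- it only illustrates the idea of
# emitting an XML tree from real-world text/html.
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

VOID = {"br", "img", "hr", "meta", "link", "input"}  # subset of HTML void elements

class SoupToXml(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = ET.Element("html")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. <input disabled>) arrive as None.
        el = ET.SubElement(self.stack[-1], tag,
                           dict((k, v or "") for k, v in attrs))
        if tag not in VOID:              # void elements never get children
            self.stack.append(el)

    def handle_endtag(self, tag):
        # Error recovery: pop until the matching open tag, silently
        # closing any elements left open in between; ignore unmatched
        # end tags entirely.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        el = self.stack[-1]
        if len(el):
            el[-1].tail = (el[-1].tail or "") + data
        else:
            el.text = (el.text or "") + data

def soup_to_xml(html: str) -> str:
    p = SoupToXml()
    p.feed(html)
    p.close()
    return ET.tostring(p.root, encoding="unicode")

# The unclosed <b> and the void <br> both come out well-formed:
print(soup_to_xml("<p>one<b>bold</p><br>"))
# -> <html><p>one<b>bold</b></p><br /></html>
```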
XML5 tastes like a generic attempt at making generic XML parsing more flexible, and that is clearly too little flexibility.

>> Currently HTML5 defines parsing and the model at the same time,
>> and this is what can lead us to expect that XML is getting weaker.
>> I believe that the whole model-definition work around XML is rich,
>> has many libraries, has empowered a lot of great developments, and
>> that it is a bad idea to drop it instead of enriching it.
>
> The dominant design of non-browser HTML5 parsing libraries is to
> expose the document tree using an XML parser API. The non-browser
> HTML5 libraries, therefore, plug into the network of XML libraries.
> For example, Validator.nu's internals operate on SAX events that
> look like SAX events for an XHTML5 document. This allows
> Validator.nu to use libraries written for XML, such as oNVDL and
> Saxon.

So, except for needing yet another XHTML version to accommodate all wishes, I think it would be much saner for browsers' implementations and related specifications to rely on an XML-based model of HTML (as the DOM is), instead of a coupled parsing-and-modelling specification which has different interpretations in different places.

paul
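[The design Henri describes above, a lenient HTML parser exposing its parse through an XML parser API so that XML-aware consumers work unchanged, can be sketched as follows. Validator.nu does this for real in Java with SAX; this Python analogue uses hypothetical class names and the stdlib `xml.sax.handler` interface.]

```python
# Sketch: forward lenient-HTML parse events to an xml.sax
# ContentHandler, so any consumer written against the XML API
# can process text/html without knowing where the events came from.
import xml.sax.handler
from html.parser import HTMLParser

class SaxBridge(HTMLParser):
    """Bridge html.parser events to a SAX ContentHandler."""
    def __init__(self, handler: xml.sax.handler.ContentHandler):
        super().__init__()
        self.handler = handler

    def handle_starttag(self, tag, attrs):
        self.handler.startElement(tag, dict((k, v or "") for k, v in attrs))

    def handle_endtag(self, tag):
        self.handler.endElement(tag)

    def handle_data(self, data):
        self.handler.characters(data)

class Collector(xml.sax.handler.ContentHandler):
    """Stand-in for any XML-aware consumer; this one just logs events."""
    def __init__(self):
        self.events = []
    def startElement(self, name, attrs):
        self.events.append(("start", name))
    def endElement(self, name):
        self.events.append(("end", name))
    def characters(self, content):
        self.events.append(("chars", content))

c = Collector()
SaxBridge(c).feed("<p>hi</p>")
print(c.events)   # [('start', 'p'), ('chars', 'hi'), ('end', 'p')]
```

The point of the design is on the consumer side: `Collector` could be any SAX-based tool (a validator, an XSLT driver), and it needs no changes to accept events produced from tag soup.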
Attachments
- application/pkcs7-signature attachment: smime.p7s
Received on Wednesday, 16 April 2008 09:59:10 UTC