- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Wed, 7 Mar 2007 21:50:15 +0200
On Mar 7, 2007, at 21:04, Elliotte Harold wrote:

> There's just no one obvious way to fix all the broken markup that's
> out there. TagSoup picks one approach. HTML 5 picks another. Both
> will surprise people a lot of the time. At the parser level that
> can't be helped.

Actually, it can. The HTML5 spec allows non-browser apps to halt on a parse error. If you opt to make your HTML5 parser halt and catch fire on the first parse error, you have achieved a similar level of parser-level predictability as you could achieve by using an XML parser.

> However at the document level it can be helped. When the document
> author takes the care to generate a well-formed document, they are
> rarely surprised by the resulting tree the parser builds.

This is true for error-free HTML5 as well except for optional tags.

> Does the HTML 5 fixup algorithm ever change the *tree* for a
> well-formed (but invalid) document?

Yes.

> By contrast with a real XML parser, I can tell you what's going to
> happen without cracking open the spec.

Only because you are familiar with the XML spec.

> HTML5, TagSoup, and XML parse trees are all deterministic and thus
> predictable; but only the XML tree is *obvious*.

HTML5 with halting at first error is just about as obvious as XML.

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/
Received on Wednesday, 7 March 2007 11:50:15 UTC