- From: David Carlisle <davidc@nag.co.uk>
- Date: Tue, 28 Feb 2012 15:49:08 +0000
- To: Innovimax W3C <innovimax+w3c@gmail.com>
- Cc: "public-xml-er@w3.org Community Group" <public-xml-er@w3.org>
I think the simple example won't really distinguish between systems that "fix up" markup, as they will all pretty much just close the stack of open elements and give the same result. To tell them apart, it's worth looking at something a bit less like well-formed XML, say

<math><one<two<three</one><two></tree></math>

Using <math> as the outer element has the advantage that you can test with an html5 parser (the <math> puts html5 into its "foreign content" XML-like mode, where /> means what it is supposed to mean). One desirable property of XML-ER would be that it isn't totally unlike the behaviour of HTML5 on such content.

Using V.nu's parser you can see the result of parsing the above:

http://livedom.validator.nu/?%3C!DOCTYPE%20html%3E%0A%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E

Removing the html, head and body elements implied in the html context results in a parse tree of

<math><oneU00003CtwoU00003CthreeU00003C one=""><two></two></oneU00003CtwoU00003CthreeU00003C></math>

which is what it is. I don't think it matters too much what the parse tree is. That is, I don't think it's worth trying to argue about any meaning implied by the original markup. The important thing is that html5 specifies a deterministic algorithm that returns a tree. Unless there is some overwhelming objection, I think XML-ER should return the same tree.

(To be honest, I haven't yet checked what Anne's draft spec would make of this.)

David
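For anyone wondering where the odd element name in that parse tree comes from, here is a rough Python sketch of the relevant part of HTML5 tokenization. It is only an illustration under simplifying assumptions, not the full algorithm: it ignores character data, attribute values, comments and end-of-file handling, and the names tokenize_tags and coerce_name are made up for this example. The point it shows is that, once inside a tag name, only whitespace, "/" and ">" end the name, so a "<" is simply appended to it. The U00003C in the tree appears to be the serializer's XML-safe replacement for "<", which coerce_name approximates.

```python
WHITESPACE = " \t\n\f\r"


def tokenize_tags(text):
    """Return (kind, name, attribute_names) tuples for the tags in text.

    Simplified sketch: character data, attribute values and EOF inside
    a tag are not handled the way a real HTML5 tokenizer handles them.
    """
    tokens = []
    i, n = 0, len(text)
    while i < n:
        if text[i] != "<" or i + 1 >= n:
            i += 1  # character data is ignored in this sketch
            continue
        j = i + 1
        kind = "start"
        if text[j] == "/":
            kind = "end"
            j += 1
        if j >= n or not (text[j].isascii() and text[j].isalpha()):
            i += 1  # not a tag opener, so skip the "<"
            continue
        # Tag name: only whitespace, "/" and ">" terminate it, so a "<"
        # is appended to the name like any other character.
        start = j
        while j < n and text[j] not in WHITESPACE + "/>":
            j += 1
        name = text[start:j].lower()
        # Attribute names: collect them until the closing ">".
        attrs = []
        while j < n and text[j] != ">":
            if text[j] in WHITESPACE + "/=":
                j += 1
                continue
            start = j
            while j < n and text[j] not in WHITESPACE + "/>=":
                j += 1
            attrs.append(text[start:j].lower())
        if j >= n:
            break  # input ended inside a tag; nothing is emitted
        tokens.append((kind, name, attrs))
        i = j + 1
    return tokens


def coerce_name(name):
    """Replace characters that cannot appear in an XML name by UXXXXXX.

    Assumption: only ASCII letters, digits and "-", "_", ".", ":" are
    treated as allowed, which is narrower than the real XML Name rule.
    """
    allowed = "-_.:"
    return "".join(
        c if (c.isascii() and c.isalnum()) or c in allowed
        else "U%06X" % ord(c)
        for c in name
    )


sample = "<math><one<two<three</one><two></tree></math>"
for kind, name, attrs in tokenize_tags(sample):
    print(kind, coerce_name(name), attrs)
```

The "/" after "three<" ends the tag name without closing the tag, so the following "one" is read as an attribute name, which is why the tree shows one="". Deciding what to do with the unmatched </tree> is the tree builder's job and is outside this sketch.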
Received on Tuesday, 28 February 2012 15:49:39 UTC