- From: John Cowan <cowan@ccil.org>
- Date: Thu, 19 Nov 2009 03:21:14 -0500
- To: Simon Pieters <simonp@opera.com>
- Cc: John Cowan <cowan@ccil.org>, Lachlan Hunt <lachlan.hunt@lachy.id.au>, Liam Quin <liam@w3.org>, public-html@w3.org, public-xml-core-wg@w3.org
Simon Pieters scripsit:

> Why would one need to reverse engineer an XML parser? It is defined in XML
> 1.0 what is an error, so one can just read the XML 1.0 spec and modify the
> XML5 algorithm accordingly.

Sure, it's possible, but it's about equivalent in complexity to writing
a parser, which has already been done repeatedly. Wake me up when it's
finished.

> It's not clear to me that that is a goal. It would be possible by making
> up a bogus root element, but that seems just bogus. :-)

Fair enough, but then there needs to be some kind of restriction on what
documents can and cannot be repaired.

> I see "DOCTYPE internal subset state" and in total 38 tokenizer states
> dedicated to handling the internal subset in
> http://xml5.googlecode.com/svn/trunk/specification/Overview.html

Yes, it skips the internal subset all right, but there's no indication
that it uses the information to, for example, correctly implement
attribute value normalization. Whitespace characters are added to
attribute values just like any other characters.

-- 
Mos Eisley spaceport.  You will never           John Cowan
see a more wretched hive of scum and            cowan@ccil.org
villainy --unless you watch the                 http://www.ccil.org/~cowan
Jerry Springer Show.   --georgettesworld.com
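
For readers unfamiliar with the attribute value normalization point above, here is a minimal sketch of the rule in XML 1.0 section 3.3.3. It is an illustration only, not code from XML5 or from any parser. The function name and the `declared_type` parameter are hypothetical, and the sketch assumes the raw value contains no entity or character references and that line ends have already been normalized.

```python
# Hypothetical sketch of XML 1.0 attribute-value normalization (section 3.3.3).
# Assumption: `raw` contains no entity or character references.

XML_WHITESPACE = "\t\n\r "


def normalize_attribute_value(raw, declared_type="CDATA"):
    """Return the normalized value of an attribute.

    `declared_type` stands for the type given by an ATTLIST declaration,
    for example in the internal subset. Attributes with no declaration
    are treated as CDATA.
    """
    # Step 1: every literal white space character becomes a single space.
    value = "".join(" " if ch in XML_WHITESPACE else ch for ch in raw)

    # Step 2: for any type other than CDATA, drop leading and trailing
    # spaces and collapse each run of spaces to one space.
    if declared_type != "CDATA":
        value = " ".join(part for part in value.split(" ") if part)

    return value
```

The second step depends on the declared type, which is why a parser that skips the internal subset without recording its ATTLIST declarations cannot apply it: it has to treat every attribute as CDATA.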
Received on Thursday, 19 November 2009 08:21:54 UTC