- From: Shane McCarron <shane@aptest.com>
- Date: Mon, 20 Feb 2012 21:55:39 -0600
- To: public-xml-er@w3.org
On 2/20/2012 8:17 PM, Noah Mendelsohn wrote:
>
> I don't think so. I think we want to distinguish content that is
> correct or preferred from that which is tolerated. For the moment, I
> would assume that the "correct" content is well-formed XML. We might
> loosen that a bit to include some additional constructs like unquoted
> attributes, or perhaps names that use other than XML name characters.
> In general, though, I think we do want to identify a class of correct
> input, and I think that will be very close in spirit, if not
> necessarily in all details, to XML.

I don't want to presuppose any solution here. Surely if the goal is that any input, regardless of how broken, is going to produce a tree, then anyone is going to be able to create a bad example. I don't think we really need to worry about whether examples are "good" or not. I would rather focus upon the details of how bad input is predictably transformed and let these "bad" chips fall where they may.

<rant>
On the other hand, I actually don't think it is a great idea to transform any input, regardless of how broken. Some things are just NOT XML. Those things are probably NOT XML-ER either. For example, the string "The quick brown fox jumped over the lazy dog." is NOT XML, and I can't imagine that it is XML-ER either. It wouldn't make any sense to me if the XML-ER rules said that a document consisting of that string is transformed into a tree by saying it is a text node that is enclosed in an anonymous element node. I would prefer that an XML-ER parser that was handed something really broken fail predictably. Encouraging the parsing of stuff that is really broken is how HTML got so messed up in the first place.
</rant>

-- 
Shane McCarron
Managing Director, Applied Testing and Technology, Inc.
+1 763 786 8160 x120
Received on Tuesday, 21 February 2012 03:56:10 UTC