
Re: Draft

From: Shane McCarron <shane@aptest.com>
Date: Mon, 20 Feb 2012 21:55:39 -0600
Message-ID: <4F4315BB.8070306@aptest.com>
To: public-xml-er@w3.org

On 2/20/2012 8:17 PM, Noah Mendelsohn wrote:
> I don't think so. I think we want to distinguish content that is 
> correct or preferred from that which is tolerated. For the moment, I 
> would assume that the "correct" content is well-formed XML. We might 
> loosen that a bit to include some additional constructs like unquoted 
> attributes, or perhaps names that use other than XML name characters. 
> In general, though, I think we do want to identify a class of correct 
> input, and I think that will be very close in spirit, if not 
> necessarily in all details, to XML.

I don't want to presuppose any solution here.  Surely if the goal is 
that any input, regardless of how broken, is going to produce a tree, 
then anyone is going to be able to create a bad example.  I don't think 
we really need to worry about whether examples are "good" or not.  I 
would rather focus upon the details of how bad input is predictably 
transformed and let these "bad" chips fall where they may.


On the other hand, I actually don't think it is a great idea to 
transform any input, regardless of how broken.  Some things are just NOT 
XML.  Those things are probably NOT XML-ER either.

For example, the string "The quick brown fox jumped over the lazy dog." 
is NOT XML, and I can't imagine that it is XML-ER either.  It wouldn't 
make any sense to me if the XML-ER rules said that a document consisting 
of that string is transformed into a tree by saying it is a text node 
that is enclosed in an anonymous element node.  I would prefer that an 
XML-ER parser that was handed something really broken fail predictably.  
Encouraging the parsing of stuff that is really broken is how HTML got 
so messed up in the first place.
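
To make "fail predictably" concrete, here is a minimal sketch in Python. 
It uses the standard library's strict XML parser as a stand-in, since no 
XML-ER parser exists to call; the helper name parse_or_fail is made up 
for this illustration. The point is only that the plain-text string is 
rejected with a well-defined error rather than wrapped in an invented 
element.

```python
import xml.etree.ElementTree as ET


def parse_or_fail(text):
    """Return the root element, or raise ValueError if text is not well-formed XML.

    Hypothetical helper: a strict parser standing in for the "fail
    predictably" behaviour, not an XML-ER implementation.
    """
    try:
        return ET.fromstring(text)
    except ET.ParseError as err:
        # Report the failure instead of guessing at a tree.
        raise ValueError(f"not well-formed XML: {err}") from err


# Well-formed input produces a tree.
root = parse_or_fail("<p>The quick brown fox</p>")
print(root.tag)

# Plain text with no markup is rejected rather than repaired.
try:
    parse_or_fail("The quick brown fox jumped over the lazy dog.")
except ValueError as err:
    print("rejected:", err)
```

Where exactly an XML-ER parser should draw the line between "repairable" 
and "rejected" is the open question; the sketch only shows the rejected 
end of that range.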


Shane McCarron
Managing Director, Applied Testing and Technology, Inc.
+1 763 786 8160 x120
Received on Tuesday, 21 February 2012 03:56:10 UTC
