- From: Noah Mendelsohn <nrm@arcanedomain.com>
- Date: Mon, 20 Feb 2012 21:17:38 -0500
- To: David Carlisle <davidc@nag.co.uk>
- CC: public-xml-er@w3.org
On 2/20/2012 8:22 PM, David Carlisle wrote:

> I agree that the input shouldn't be described as "XML" but it needn't
> purport to be XML either. If I choose to parse "<foo>a</bar>" with this
> parser I don't need to (or get the document to ) purport that is XML, I
> just want an XML-compatible result so I can bash it with XSLT (typically)

Are you sure you want to do that with your example? It's really not clear what a user intended here. Most likely XML-ER will produce some tree out of this input, but if the author intended anything like what we know as XML, the results of any fixup have at least a 50/50 chance of not being "correct" (did the user mean a "foo" element, a "bar" element, or something else?).

Of course, once the XML-ER spec is written, there will be some answer. Let's say the answer it gives is to assume that the </bar> was meant to be a </foo>. OK, do we really want to tell users to write <foo>a</bar> as a first-class way of getting a <foo> element? I don't think so. I think we want to distinguish content that is correct or preferred from content that is merely tolerated.

For the moment, I would assume that the "correct" content is well-formed XML. We might loosen that a bit to include some additional constructs such as unquoted attributes, or perhaps names that use characters other than XML name characters. In general, though, I think we do want to identify a class of correct input, and I think that will be very close in spirit, if not necessarily in all details, to XML.

Noah
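
To make the correct-versus-tolerated distinction concrete, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree as the strict XML parser. The classify helper and its two labels are hypothetical illustrations and are not part of any XML-ER draft; the sketch only detects that input falls outside well-formed XML and does not attempt any fixup.

```python
import xml.etree.ElementTree as ET


def classify(text: str) -> str:
    """Hypothetical helper: label input as 'correct' (well-formed XML)
    or 'needs-recovery' (anything a strict XML parser rejects)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # A strict parser gives up here; an error-recovering parser
        # would have to guess what the author meant.
        return "needs-recovery"
    return "correct"


if __name__ == "__main__":
    for sample in ("<foo>a</foo>", "<foo>a</bar>", "<foo bar=baz>a</foo>"):
        print(repr(sample), "->", classify(sample))
```

Under the looser definition floated above, the unquoted-attribute sample might eventually move into the "correct" class, while <foo>a</bar> would presumably stay in the merely tolerated one.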
Received on Tuesday, 21 February 2012 02:18:05 UTC