- From: Noah Mendelsohn <nrm@arcanedomain.com>
- Date: Sat, 03 Mar 2012 14:48:33 -0500
- To: David Carlisle <davidc@nag.co.uk>
- CC: public-xml-er@w3.org
On 3/3/2012 1:20 PM, David Carlisle wrote:

> It would preclude any implementation of xml-er that used any kind of
> parsing. It restricts you to essentially the kind of fixup that you can
> do with regular expressions, just keeping the textual document but never
> parsing it.

Hmm. There's something a bit circular about this. Certainly your input can't be in the form of something like a DOM, or else how could it represent things like poorly nested tags, which are exactly the sorts of things we are trying to fix up? So, I assume it's OK to assume that the input is a string of text that may or may not prove to be well formed?

OK, so you're definitely going to run some sort of parse on it, do some error checking while you go, and prepare the output as you go. I infer that you're interested in the case where your preferred output is, say, a DOM, and you're building it as you go. You're losing track of, e.g., whether an attribute was single or double quoted. No problem.

Please look again at the requirement I proposed. It was not that you be capable of reserializing the original document. Rather it was:

"...when XML-ER is used on well formed input to produce (take your pick of {DOM, XML-DM, Infoset, text file}), the results should be the same as if a (non-XML-ER) tool was used."

...and that's almost surely what you're going to do: you're going to build the same DOM you would have if you weren't prepared to do error recovery. The fact that, if you tried to re-serialize the DOM, you wouldn't remember what quoting was used is no problem. That's why I stated the requirement that way.

I think you can do exactly what you want with the requirement as I phrased it. Am I missing something?

Thanks.

Noah
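
The requirement quoted above can be read as a testable property. What follows is only a minimal sketch in Python, assuming the standard-library ElementTree parser stands in for the non-XML-ER tool. The names `recovering_parse` and `same_tree` are hypothetical, and the recovery path itself is deliberately left out, since nothing here says how it should behave.

```python
import xml.etree.ElementTree as ET


def same_tree(a, b):
    """Compare two elements by tag, attributes, text, tail, and children."""
    if (a.tag, a.attrib, a.text, a.tail) != (b.tag, b.attrib, b.text, b.tail):
        return False
    if len(a) != len(b):
        return False
    return all(same_tree(x, y) for x, y in zip(a, b))


def recovering_parse(text):
    """Hypothetical stand-in for an XML-ER processor (an assumption).

    Well-formed input takes the ordinary strict path. The recovery
    path is intentionally not implemented in this sketch.
    """
    try:
        return ET.fromstring(text)
    except ET.ParseError as err:
        raise NotImplementedError("error recovery is outside this sketch") from err


single = "<doc id='a'><p>text</p></doc>"
double = '<doc id="a"><p>text</p></doc>'

# The proposed requirement: on well-formed input, the result matches
# what a strict (non-recovering) parser builds.
print(same_tree(recovering_parse(single), ET.fromstring(single)))

# Quote style is not part of the tree, so the processor does not need
# to remember it in order to satisfy the requirement.
print(same_tree(ET.fromstring(single), ET.fromstring(double)))
```

Both comparisons are made on the parsed tree rather than on the source text, which is why a processor that parses, and so forgets lexical details such as attribute quoting, can still meet the requirement.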
Received on Saturday, 3 March 2012 19:48:59 UTC