- From: Noah Mendelsohn <nrm@arcanedomain.com>
- Date: Fri, 02 Mar 2012 22:04:22 -0500
- To: David Lee <David.Lee@marklogic.com>
- CC: David Carlisle <davidc@nag.co.uk>, "public-xml-er@w3.org" <public-xml-er@w3.org>
On 3/2/2012 9:36 AM, David Lee wrote:
> Suggestion: we say at an XML-ER parser produces an abstract data model
> ... then its up to which one ... INFOSET ? XDM ? ... Probably INFOSET as
> XDM drops several artifacts of XML that might be useful to upstream
> parser . But then that does exclude cases such as supporting invalid XML
> unicode codepoints.
>
> But then I havent read the INFOSET specs in years so I am going on old
> man braincells here ...

I still think it may be better to define the equivalence at the level of text. I am >not< necessarily saying that any particular processor must produce or go through an intermediate state that involves fixed-up text. What I am suggesting we consider is a model that builds on the XML Recommendation, since that's what we're trying to "fix up to". The XML Recommendation defines XML as text.

Therefore, if we can show in the XML-ER specification what the equivalent well-formed XML text is that corresponds to (the fixup of) each non-well-formed input, then all the struggles about abstract data models and choosing one just go away. Let's use the shorthand EWXML to refer to that equivalent text.

Any particular processor can produce a DOM, an (API over an) XML DM, or even the serialized XML. To prove you'd done your job right, you'd have to show that the DOM or DM or whatever is the same as the one you'd get by parsing the EWXML.

As a reasonably trivial example, for the non-well-formed input:

    <e a=3 />

the EWXML might be specified to be:

    <e a="3" />

This seems to me conceptually simpler and also more robust than picking a favorite among DOM, Infoset and XML DM. The XML Recommendation doesn't directly deal in any of these.

Noah
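The check described in the message (compare whatever a processor produced with the result of parsing the EWXML) might be sketched roughly as follows. This is only an illustration in Python using the standard xml.etree.ElementTree module; the names same_tree and matches_ewxml are hypothetical, the "produced" tree is built by hand to stand in for the output of some XML-ER processor, and the notion of "same" used here (tags, attributes, text, children) is an assumption rather than anything the thread has agreed on.

```python
import xml.etree.ElementTree as ET


def same_tree(a, b):
    """Recursively compare two elements: tag, attributes, text, tail, children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "") != (b.text or "") or (a.tail or "") != (b.tail or ""):
        return False
    if len(a) != len(b):
        return False
    return all(same_tree(x, y) for x, y in zip(a, b))


def matches_ewxml(produced, ewxml_text):
    """True if `produced` equals the tree obtained by parsing the EWXML text."""
    return same_tree(produced, ET.fromstring(ewxml_text))


# Hand-built stand-in for what a (hypothetical) XML-ER processor
# might produce for the non-well-formed input: <e a=3 />
produced = ET.Element("e", {"a": "3"})

# The EWXML that a specification might assign to that input.
ewxml = '<e a="3" />'

print(matches_ewxml(produced, ewxml))
```

The processor is free to build its result however it likes; only the comparison against the parsed EWXML matters, which is why no fixed-up intermediate text appears on the processor side of the sketch.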
Received on Saturday, 3 March 2012 03:04:48 UTC