- From: David Carlisle <davidc@nag.co.uk>
- Date: Sun, 26 Feb 2012 21:16:07 +0000
- To: public-xml-er@w3.org
On 26/02/2012 16:16, David Lee wrote:
> So I'd like to discuss: What is the expected purpose/use case of an
> implementation of XML-ER?
>
> Possible answers ?
>
> A) XML-ER is a 'Processor'

I suspect it has to be this.

> 1) A drop-in replacement for an XML parser.
>
> --> Implies: It must do *everything* an XML parser does (plus the ER stuff)

Well no, it implies that it has to be a full parser spec. We could decide that it didn't do everything an XML parser would do in all circumstances. (We could define, for example, that it never fetched external entities (compatible with a configuration of XML 1.x) and always skipped entity definitions even in a local subset (not compatible with XML 1.x).)

> --> Output: An API ? An abstract data model ? (INFOSET)

I'd say any kind of abstract tree model. The current draft uses the terminology of the DOM, which isn't my favourite tree model, but if we think DOM-based browsers are a likely user of this spec, then using the terminology of the DOM (but saying somewhere that any tree model is OK) makes some kind of sense to me.

> 2) A pre-processor for an XML parser.
>
> --> Input : "Stuff" (TBD)
>
> --> Output : Well-formed XML - defined as an abstract data model?
>
> --> Implies an XML parser then may be used to fill in the stuff that
> ER-XML doesn't define,

It only implies that if we constrain the "fixup" that can be performed to fixup that doesn't require XML parsing. If a document is not well formed because entities or parameter entities are messed up, then either you need a full XML parse (more or less) to untangle the entities and fix what needs fixing, or you unconditionally remove all DTD references, or you can't guarantee the output is well formed. I'm not necessarily opposed to this model of spec, but currently can't see how it would work.

> For example: parsing DTDs, external entities etc.

See above: if you don't parse the DTD during the fixup stage, do you remove the DTD, or just leave it unfixed?

> ...
> --> Example: The Namespaces spec doesn't define a 'processor'.

True, although many of the problems of namespaces are arguably due to that fact. It tries to layer itself a layer above the XML parse (like a schema validator), but naming rules are fundamental, and that layering violation shows through in all the rough edges around namespace declarations looking like attributes and being attributes in some models (DOM) but not in others (XDM).

> IMHO we need to clarify exactly what the XML-ER specification is
> intended for before we can make much more progress.

yes I guess so:-)

David

To make things concrete: what would you _want_ the output from this to be?

    <!DOCTYPE foo [
    <!ENTITY a "a">
    <!ENTITY b "<b>">
    ]>
    <foo> &a;&b; </foo>

My suggestion is that doctype declarations only be parsed to the extent that they be skipped, and that the only entity references used are the HTML/MathML ones. So I'd suggest the output (whether you think of this as fixup giving the XML document, or as a representation of the output tree of an XML-ER processor)

    <foo> &a;&b; </foo>

Note that I would suggest getting the _same_ result if the input were

    <!DOCTYPE foo [
    <!ENTITY a "a">
    <!ENTITY b "b">
    ]>
    <foo> &a;&b; </foo>

which, unlike the first document, is well formed.
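The suggestion above can be sketched in Python as a rough illustration; it is not taken from any XML-ER draft. The function names `skip_doctype`, `keep_unknown_refs_as_text` and `process` are hypothetical. Two assumptions are made: Python's `html.entities.html5` table stands in for "the HTML/MathML entities", and an unrecognised reference is serialised as literal text with its ampersand escaped, which is only one reading of the suggested output. The DOCTYPE scanner tracks brackets and quoted literals only, so DTD comments containing quote characters would confuse it.

```python
import re
from html.entities import html5  # HTML5 named references, keyed like "amp;"

ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def skip_doctype(text: str) -> str:
    """Drop the first DOCTYPE declaration without interpreting its contents."""
    start = text.find("<!DOCTYPE")
    if start == -1:
        return text
    i = start + len("<!DOCTYPE")
    depth = 0     # > 0 while inside the [ ... ] internal subset
    quote = None  # the open quote character while inside a literal
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == ">" and depth <= 0:
            return text[:start] + text[i + 1:]
        i += 1
    # Unterminated declaration: this sketch simply drops the remainder.
    return text[:start]


def keep_unknown_refs_as_text(text: str) -> str:
    """Leave predefined references alone; turn any other &name; into text."""
    def repl(match):
        if match.group(1) + ";" in html5:
            return match.group(0)
        return "&amp;" + match.group(1) + ";"
    return ENTITY_REF.sub(repl, text)


def process(doc: str) -> str:
    return keep_unknown_refs_as_text(skip_doctype(doc)).strip()


FIRST = '<!DOCTYPE foo [ <!ENTITY a "a"> <!ENTITY b "<b>"> ]> <foo> &a;&b; </foo>'
SECOND = '<!DOCTYPE foo [ <!ENTITY a "a"> <!ENTITY b "b"> ]> <foo> &a;&b; </foo>'

if __name__ == "__main__":
    print(process(FIRST))
    print(process(SECOND))
```

Because the internal subset is never read, the two inputs differ only in text that is discarded, so both yield the same result regardless of whether the declared entities would have made the document well formed.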
Received on Sunday, 26 February 2012 21:16:28 UTC