- From: Falk, Alexander <falk@icon.at>
- Date: Mon, 12 Jun 2000 20:10:36 +0200
- To: "'xmlschema-dev@w3.org'" <xmlschema-dev@w3.org>, xml-dev@xml.org
- Cc: "'www-xml-schema-comments@w3.org'" <www-xml-schema-comments@w3.org>
This message is a question concerning the use of entities to emulate namespace prefixes in a DTD - a technique that is used by the normative DTD from the April 7 XML Schema draft. I'm therefore intentionally posting this to both xml-dev and xmlschema-dev, as it is relevant to both discussion forums.

Recently one of our customers reported an entity-resolution tech-support issue in our "XML Spy" product ( http://www.xmlspy.com ). It has resulted in a very interesting internal discussion regarding the addition of leading and trailing spaces in the resolution of parameter entities (section 4.4.8 of the XML 1.0 specification - see http://www.w3.org/TR/REC-xml#as-PE ).

The problem is this - section 4.4.8 explicitly says:

    When a parameter-entity reference is recognized in the DTD and
    included, its replacement text is enlarged by the attachment of one
    leading and one following space (#x20) character; the intent is to
    constrain the replacement text of parameter entities to contain an
    integral number of grammatical tokens in the DTD.

To the best of our knowledge, this section has never been corrected by any errata, and the annotated XML spec doesn't say anything about it other than pointing at the SGML history issues.

Now we've already seen many DTDs that use parameter entities to make themselves pseudo-namespace-aware - and, interestingly, the normative XML Schema DTD from the April 7 draft (see http://www.w3.org/TR/xmlschema-1/#normative-schemaDTD ) is one of them. The trick most commonly used is to define a prefix entity and a suffix entity that can be overridden in the internal subset of any document using the DTD, and then to use the prefix entity when defining every element name via a further entity.

Here is an example from the normative XML Schema DTD:

    <!ENTITY % p ''>
    <!ENTITY % s ''>
    <!-- if %p is defined (e.g. as foo:) then you must also define %s as
         the suffix for the appropriate namespace declaration (e.g. :foo) -->
    <!ENTITY % nds 'xmlns%s;'>

    <!-- Define all the element names, with optional prefix -->
    <!ENTITY % schema "%p;schema">
    <!ENTITY % complexType "%p;complexType">
    <!ENTITY % element "%p;element">
    <!ENTITY % unique "%p;unique">
    ...

So by defining %p as 'xsd:' and %s as ':xsd' you can validate, using this DTD, any XML Schema that uses xmlns:xsd to refer to the XML Schema namespace, because all Schema elements end up being defined as xsd:schema, xsd:complexType, xsd:element, etc.

Or so it seems. But this is where section 4.4.8 comes into play! If the XML Schema DTD defines an entity %schema using

    <!ENTITY % schema "%p;schema">

and we assume %p has already been defined as 'xsd:', then section 4.4.8 tells us that %schema will actually be defined as " xsd: schema", which is certainly not a valid qualified name and certainly not the behavior intended by the authors of the normative DTD.

So the real question is: is this use of pseudo-namespace prefixes in a DTD really XML 1.0 compatible? And how should XML toolmakers interpret section 4.4.8 in the light of such use in new W3C drafts?

Sincerely,
Alexander Falk

P.S. Last time we had a tricky XML specification question, my colleague wrote "please answer only, if you are absolutely sure" - and we received only one answer, from Tim Bray, which was right to the point. So I wonder if I shouldn't add such a restriction this time as well ;)

... Icon Information-Systems
... ALEXANDER FALK
... President, CEO
... http://www.icon-is.com/falk
=========================================================================
XML Spy 3.0 - the first true Integrated Development Environment for XML
Visit http://www.xmlspy.com/ to download a free 30-day evaluation version
To get a demonstration, come see us at XML DevCon in New York, June 26+27
If you like our product, please vote for us at http://www.xmlspy.com/vote
=========================================================================
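For illustration, the two candidate expansions of the entity definitions discussed in this message can be sketched in a few lines of Python. This is only a toy string substitution, not an XML parser: the `expand` function, its `pad` flag, and the `PE_REF` pattern are hypothetical names introduced here, and the sketch takes no position on which behaviour the specification actually requires inside an entity value.

```python
import re

# Simplified pattern for a parameter-entity reference such as %p;
# (assumption: ASCII-only names, which is enough for this example).
PE_REF = re.compile(r"%([A-Za-z_][\w.\-]*);")


def expand(value, entities, pad):
    """Replace each %name; reference in `value` with its replacement text.

    With pad=True every replacement is wrapped in one leading and one
    trailing space, which is the reading of section 4.4.8 described in
    the message. With pad=False the replacement is inserted as-is.
    """
    def substitute(match):
        replacement = entities[match.group(1)]
        return f" {replacement} " if pad else replacement

    return PE_REF.sub(substitute, value)


# Overrides as they might appear in a document's internal subset.
entities = {"p": "xsd:", "s": ":xsd"}

for literal in ("%p;schema", "xmlns%s;"):
    print(repr(expand(literal, entities, pad=True)),
          repr(expand(literal, entities, pad=False)))
```

With padding, "%p;schema" becomes " xsd: schema", the problematic value described above; without padding it becomes "xsd:schema", which is what the authors of the DTD evidently intended.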
Received on Monday, 12 June 2000 14:10:45 UTC