Re: [xml-dev] RE: Is schemaLocation just a hint in schema import? from C. M. Sperberg-McQueen on 2006-10-16 (xmlschema-dev@w3.org from October 2006)

From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
Date: Mon, 16 Oct 2006 15:25:26 -0600
To: Dan Vint <dvint@dvint.com>
Cc: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, "Michael Kay" <mike@saxonica.com>, "'Antoli, Leo'" <Leo.Antoli@Misys.com>, <xml-dev@lists.xml.org>, <xmlschema-dev@w3.org>
Message-Id: <0F91E4A7-25BA-4789-9065-37575D518BDC@acm.org>
On 15 Oct 2006, at 13:08 , Dan Vint wrote:

 > This seems another instance where the schema folks changed the
 > path of the XML standard. In a DTD, it is the last definition
 > read that becomes the definition - seems like the import rules
 > should treat schemas in the same way.

Either your memory is playing tricks on you, or mine is playing
tricks on me.  I believe the rule in SGML and XML DTDs is
normally either that multiple declarations are illegal (you
mustn't declare the same element twice, for example) or the first
one wins (entities, attributes).

Sec. 4.2 of the XML spec says

     If the same entity is declared more than once, the first
     declaration encountered is binding; at user option, an XML
     processor MAY issue a warning if entities are declared
     multiple times.

Sec. 3.3 says

     When more than one definition is provided for the same
     attribute of a given element type, the first declaration is
     binding and later declarations are ignored.
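
A minimal sketch of the two first-wins rules quoted above, assuming
the expat-based parser bundled with Python's standard library.  The
document, the entity name "who" and the attribute name "lang" are
made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document whose internal subset declares the same entity
# and the same attribute default twice.
DOC = """<!DOCTYPE note [
  <!ENTITY who "first declaration">
  <!ENTITY who "second declaration">
  <!ATTLIST note lang CDATA "en">
  <!ATTLIST note lang CDATA "fr">
]>
<note>&who;</note>"""

note = ET.fromstring(DOC)

# Entity reference expanded from the binding (first) declaration.
print(note.text)
# Attribute default taken from the binding (first) definition.
print(note.get("lang"))
```

A conforming processor should report the replacement text and the
default from the first declarations, not the second ones.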

But the parallel you are drawing seems off to me: an import
functions in part like a declaration (it declares that references
to components in the imported namespace should be legal in the
schema document containing the xsd:import element), and in part
like an xsd:include (in the sense that the user will frequently
want the processor to go find components for the imported
namespace, possibly using the hint given as to location).

To the extent that xsd:import acts like a declaration, the
question of first-wins vs. last-wins makes no sense.  Both
declarations do exactly the same thing - they make references to
components in that namespace legal.  So there's no question of a
conflict.

If the two carry different schemaLocation hints, then in a
situation where we are following the location hints, the correct
analogy seems to me to be with entity declarations, where XML
Schema 1.0 follows the same first-wins rule as XML.
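
The two roles of xsd:import described above can be sketched as
follows.  This is one possible way a processor might treat repeated
imports, not something the spec mandates: the function name
scan_imports, the example.org namespaces and the file names are all
hypothetical.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema document importing one namespace twice,
# with two different location hints.
SCHEMA_DOC = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://example.org/b">
  <xsd:import namespace="http://example.org/a" schemaLocation="a-v1.xsd"/>
  <xsd:import namespace="http://example.org/a" schemaLocation="a-v2.xsd"/>
</xsd:schema>
"""


def scan_imports(schema_text):
    """Return (declared namespaces, first location hint per namespace)."""
    declared = set()
    hints = {}
    root = ET.fromstring(schema_text)
    for imp in root.findall(f"{{{XSD_NS}}}import"):
        namespace = imp.get("namespace")
        # Declaration role: repeating it changes nothing, so no conflict.
        declared.add(namespace)
        # Hint role: keep the first hint, ignore later ones (first-wins).
        location = imp.get("schemaLocation")
        if location is not None and namespace not in hints:
            hints[namespace] = location
    return declared, hints


declared, hints = scan_imports(SCHEMA_DOC)
print(declared)
print(hints)
```

The set of declared namespaces is the same however often the import
is repeated; only the choice of hint depends on document order.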

You may perhaps be thinking of the last-wins rule of XSLT 1.0.
Sec. 5.5 of XSLT 1.0 says

     It is an error if this leaves more than one matching template
     rule. An XSLT processor may signal the error; if it does not
     signal the error, it must recover by choosing, from amongst
     the matching template rules that are left, the one that
     occurs last in the stylesheet.

 > The only place that seems like it needs special handling is
 > with redefine in the situation that I just got caught in.

 > I have Schema A that is imported into Schema B, I then redefine
 > Schema A and import Schema B. This also seems to be an
 > implementation-dependent problem and the solution that was
 > proposed but I haven't tested was to remove the schema location
 > on the import of Schema A into schema B. I find it strange that
 > somehow the physical file location might control part of the
 > triggering of this rule. Seems to me that the namespace should
 > be the equivalent of specifying a file location, but maybe it
 > isn't.

This seems to be an instance similar to the one described in
Bugzilla issue 2577
(http://www.w3.org/Bugs/Public/show_bug.cgi?id=2577), about what
happens when the same schema document is both included and
redefined (normally, as in your description, via different
routes).

Your processor may be taking the simple and often unproblematic
approach of handling a schema location hint on an import by
retrieving the schema document indicated and adding the relevant
components to the schema being assembled -- i.e., treating the
import a lot like an 'include'.  If so, you can see that it's
running into problems, and why.  You import B, and when it sees
the import with a hint pointing to schema document A, it goes and
reads A and adds the components defined in A to the schema it's
collecting.  Then it sees your redefine, and it builds components
from schema document A, modifying them as specified in the
redefine.  For each thing you redefine in A, it's now got two
copies, one the original and one the modified copy.  Not a legal
schema.
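
The failure mode just described can be modelled with a toy
simulation.  It is only a sketch of the naive strategy, not of any
real processor: schema documents are stood in for by dictionaries,
and the file names, component names and function names are all
hypothetical.

```python
from collections import Counter

# Toy stand-ins for schema documents.  "driver.xsd" redefines A and
# imports B; B imports A with a location hint.
DOCS = {
    "A.xsd": {"defines": ["AddressType", "NameType"],
              "imports": [], "redefines": {}},
    "B.xsd": {"defines": ["OrderType"],
              "imports": ["A.xsd"], "redefines": {}},
    "driver.xsd": {"defines": [],
                   "imports": ["B.xsd"],
                   "redefines": {"A.xsd": ["AddressType"]}},
}


def assemble_naively(docs, name, components=None):
    """Collect (component, variant) pairs, treating import like include."""
    components = set() if components is None else components
    entry = docs[name]
    # Follow every location hint and pull in whatever it defines.
    for hinted in entry["imports"]:
        assemble_naively(docs, hinted, components)
    # Build the redefined document's components, modifying the named ones.
    for target, changed in entry["redefines"].items():
        for comp in docs[target]["defines"]:
            variant = "redefined" if comp in changed else "original"
            components.add((comp, variant))
    for comp in entry["defines"]:
        components.add((comp, "original"))
    return components


def conflicts(components):
    """Names present in more than one variant: not a legal schema."""
    counts = Counter(comp for comp, _ in components)
    return sorted(comp for comp, n in counts.items() if n > 1)


print(conflicts(assemble_naively(DOCS, "driver.xsd")))

# The proposed (untested) workaround: drop the hint on B's import of A.
without_hint = {**DOCS, "B.xsd": {**DOCS["B.xsd"], "imports": []}}
print(conflicts(assemble_naively(without_hint, "driver.xsd")))
```

In this model only the redefined component ends up with two copies,
and removing the location hint from B's import makes the clash
disappear, which would be consistent with the workaround you were
given.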

 > How about a push to simplify XML schema instead of changing the
 > syntax and features of the core XML spec?

The Working Group is working as hard as we can to make XML Schema
1.1 clearer than 1.0, to fix bugs, and to add useful
functionality.  Simplification in the form of eliminating
features has proven to be a very hard sell -- almost everyone
agrees that there are a lot of features no one would miss, but
the lists people give don't converge.  Just this morning I had a
conversation in which someone suggested that the easiest way to
simplify the description of schema composition would be to
eliminate xsd:redefine entirely.  They are probably right, as far
as it goes, but if you are using redefine you might feel that
that would be going about simplification the wrong way.

best regards,

--C. M. Sperberg-McQueen
   World Wide Web Consortium
Received on Monday, 16 October 2006 21:25:46 UTC