- From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
- Date: Tue, 24 Oct 2006 10:58:46 -0600
- To: Stan Kitsis <skits@microsoft.com>
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, "Moog, Thomas H" <thomas.h.moog@intel.com>, "xmlschema-dev@w3.org" <xmlschema-dev@w3.org>
On 23 Oct 2006, at 23:24, Stan Kitsis wrote:

>> Now, if you changed your example, so that in alpha the 'b'
>> element was defined as having type xsd:gYear, for example,
>> and then brought back in gamma with the type xsd:anyURI,
>> then in principle conforming processors should reject it.
>
> Can you explain why? The following schema seems valid to me and
> the .NET 2.0 processor agrees with me.

Certainly. Two levels of explanation may be useful: first, at the
level of the words in the spec which formulate the rule, and second,
at the level of the design rationale.

Clause 1.5 of Schema Component Constraint: Derivation Valid
(Extension) says that it must in principle be possible to derive
the complex type definition in two steps, the first an extension
and the second a restriction (possibly vacuous), from that type
definition among its ancestors whose {base type definition} is
the ur-type definition.

> <xs:complexType name="alpha">
>   <xs:sequence>
>     <xs:element name="a" />
>     <xs:element name="b" type="xs:gYear" minOccurs="0" />
>   </xs:sequence>
> </xs:complexType>

The type alpha is "that type definition ... whose {base type
definition} is the ur-type definition". The point of the clause
is to ensure the truth of some assumptions one might plausibly
want to make about instances of type alpha and of any type
descended from it. Prominent among them would be propositions
like:

  1 There will always be a child named 'a'.
  2 When there is a child named 'a', its type will be
    xsd:anyType, or something derived from it.
  3 There may or may not be a child named 'b'.
  4 When there is a child named 'b', its type will be
    xsd:gYear, or something derived from it.

Other invariants are in fact guaranteed, but these will do to
go on with.
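To make propositions 1-4 concrete, here is a minimal Python sketch that checks them against a much simplified picture of an instance. Everything in it is an assumption made for illustration: an element's children are modelled as a list of (name, type) pairs, the BASE table covers only the handful of built-in types mentioned in this thread, and the function names are hypothetical. It is not how a schema processor actually validates.

```python
# Illustrative only: a tiny slice of the built-in type hierarchy.
# Each type maps to its base type; xsd:anyType has no base here.
BASE = {
    "xsd:gYear": "xsd:anySimpleType",
    "xsd:anyURI": "xsd:anySimpleType",
    "xsd:anySimpleType": "xsd:anyType",
}


def derives_from(type_name, ancestor):
    """True if type_name is ancestor or is derived from it via BASE."""
    current = type_name
    while current is not None:
        if current == ancestor:
            return True
        current = BASE.get(current)
    return False


def satisfies_alpha_invariants(children):
    """Check propositions 1-4 for a list of (child name, type) pairs."""
    names = [name for name, _ in children]
    if "a" not in names:  # proposition 1: 'a' is always present
        return False
    for name, type_name in children:
        if name == "a" and not derives_from(type_name, "xsd:anyType"):
            return False  # proposition 2
        if name == "b" and not derives_from(type_name, "xsd:gYear"):
            return False  # proposition 4
    return True  # proposition 3: 'b' is optional, so nothing to check


print(satisfies_alpha_invariants([("a", "xsd:anyType"), ("b", "xsd:gYear")]))
print(satisfies_alpha_invariants([("a", "xsd:anyType"), ("b", "xsd:anyURI")]))
```

Under these assumptions, an instance whose 'b' child has type xsd:anyURI fails proposition 4, which is the kind of surprise the clause is meant to rule out for descendants of alpha.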
> <xs:complexType name="beta" >
>   <xs:complexContent>
>     <xs:restriction base="alpha" >
>       <xs:sequence>
>         <xs:element name="a" />
>       </xs:sequence>
>     </xs:restriction>
>   </xs:complexContent>
> </xs:complexType>
>
> <xs:complexType name="gamma" >
>   <xs:complexContent>
>     <xs:extension base="beta" >
>       <xs:sequence>
>         <xs:element name="b" type="xs:anyURI"/>
>       </xs:sequence>
>     </xs:extension>
>   </xs:complexContent>
> </xs:complexType>

Clause 1.5 requires that the effective content model of gamma
be expressible as the result of (1) extending alpha, possibly
vacuously, and then (2) restricting that extension, possibly
vacuously. If we can construct such a two-step derivation that
goes from alpha to gamma, then you are right that gamma should
be legal. I believe we cannot.

My reasoning may be clearer if we consider an example. Consider
this possible derivation. The type delta extends alpha by adding
a 'b' element with a type of xs:anyURI:

  <xs:complexType name="delta" >
    <xs:complexContent>
      <xs:extension base="alpha" >
        <xs:sequence>
          <xs:element name="b" type="xs:anyURI"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

The effective content model of delta is thus:

  <xs:sequence>
    <xs:element name="a" />
    <xs:element name="b" type="xs:gYear" minOccurs="0" />
    <xs:element name="b" type="xs:anyURI"/>
  </xs:sequence>

If delta is legal, then we could derive a type epsilon from it
by restriction, with an effective content model that is the same
as that of gamma:

  <xs:complexType name="epsilon" >
    <xs:complexContent>
      <xs:restriction base="delta" >
        <xs:sequence>
          <xs:element name="a" />
          <xs:element name="b" type="xs:anyURI"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

The problem with this derivation is that type delta is not a
legal type: it has two element declarations which map the same
expanded name ('b') to different types (xs:gYear and xs:anyURI),
which violates the Element Declarations Consistent constraint.
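The conflict in delta can be shown with a short Python sketch. It rests on loud simplifications that are mine, not the spec's: a content model is flattened to a list of (element name, type name) pairs, occurrence constraints and namespaces are ignored, and the helper name edc_conflicts is hypothetical. The real Element Declarations Consistent constraint is defined over schema components rather than over lists like these.

```python
def edc_conflicts(content_model):
    """Return element names that are mapped to more than one type.

    content_model is a flattened list of (element name, type name)
    pairs; this is a simplification of a real content model.
    """
    first_type = {}
    conflicts = []
    for name, type_name in content_model:
        if name in first_type and first_type[name] != type_name:
            if name not in conflicts:
                conflicts.append(name)
        first_type.setdefault(name, type_name)
    return conflicts


# An element declared without a type attribute defaults to xs:anyType.
alpha = [("a", "xs:anyType"), ("b", "xs:gYear")]

# Extension appends the new particles after the base type's content.
delta = alpha + [("b", "xs:anyURI")]

# The content model that gamma (and epsilon) would end up with.
epsilon = [("a", "xs:anyType"), ("b", "xs:anyURI")]

print(edc_conflicts(alpha))    # no conflicting names
print(edc_conflicts(delta))    # 'b' is mapped to two different types
print(edc_conflicts(epsilon))  # no conflicting names
```

Note that epsilon's content model is consistent on its own terms. The difficulty lies in the intermediate step: any extension of alpha that introduces 'b' as xs:anyURI still carries alpha's 'b' as xs:gYear.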
I think it's clear that any attempt to derive gamma from alpha
by means of first an extension step and then a restriction step
must fail. The intermediate type must contain a 'b' element with
type xs:anyURI, in order for gamma to get it from there. It must
also contain a 'b' element with type xs:gYear, since alpha has
one, and the intermediate type is an extension of alpha, and
extensions cannot get rid of things in their base type. That
means that the intermediate type must have two 'b' elements with
different types, and thus that the intermediate type must violate
the Element Declarations Consistent rule.

Clause 1.5 turns out to make it legal (as the original example
from Thomas Moog shows) to take an element or attribute away,
and then put it back. When we drafted the Note that said nothing
taken away can be put back, we failed to foresee that as a
possibility, so the Note implies that it's not possible.

When the WG came to consider the discrepancy between the actual
rule in clause 1.5 and the characterization in the Note, we
concluded that the point of the rule is to preserve propositions
like those numbered 1-4 earlier in this email. The rule implicit
in the Note would have implied further that, in the derivation
graph showing all the descendants of alpha, those types which
possess a 'b' must form a connected subgraph. That didn't seem a
particularly important or useful property, so the WG elected to
change the non-normative Note rather than the normative rule.

I hope I've answered your question about why this example should
not be accepted by conforming schema processors, both at the
level of "where does it say that in the spec?" and at the level
of "why should it say that in the spec?".

--C. M. Sperberg-McQueen
  World Wide Web Consortium / MIT CSAIL
Received on Tuesday, 24 October 2006 16:59:09 UTC