Re: block="substitution" from C. M. Sperberg-McQueen on 2007-01-26 (xmlschema-dev@w3.org from January 2007)

From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
Date: Thu, 25 Jan 2007 18:16:52 -0700
To: Zafar Abbas <Zafar.Abbas@microsoft.com>, Michael Kay <mike@saxonica.com>
Cc: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, xmlschema-dev@w3.org
Message-Id: <2AECE44C-BBA0-4177-BAEB-D17748533282@acm.org>

On 19 Jan 2007, at 14:58 , Zafar Abbas wrote:

> I agree with your analysis. ... But you are right that this looks more
> like an issue which is better detected at schema compilation time as
> no one will ever be able to substitute E with F.
>
> -----Original Message-----
> From: xmlschema-dev-request@w3.org [mailto:xmlschema-dev-request@w3.org]
> On Behalf Of Michael Kay
> Sent: Saturday, December 09, 2006 8:18 AM
> To: xmlschema-dev@w3.org
> Subject: block="substitution"
>
>
> As far as I can see the following schema is valid:
>
> <xsd:element name="E" type="A" block="substitution" />
>
> <xsd:element name="F" type="A" substitutionGroup="E" />
>
> The effect of the "block" is not to make the declaration of F invalid:
> instead, it effectively causes the attribute substitutionGroup="E" to
> be ignored, so that an instance that attempts to use F in place of E
> will be invalid.
>
> Is this analysis correct? On the surface, this seems to be detecting
> an error at run time that would be better detected at compile time.
>
> Can someone explain the rationale?

I can try.

The basic idea (not terribly clear in the text of either 1.0 or 1.1,
I'm afraid) is that what gets blocked by 'block' is actual
substitution of another element type in the instance, not substitution
group affiliation.

Consider the following schema:

   <xsd:element name="E1" type="A" />
   <xsd:element name="E2" substitutionGroup="E1" block="substitution" />
   <xsd:element name="E3" substitutionGroup="E2" />

E2 is in the substitution group of E1.

E3 is not in the substitution group of E2, because E2 has
block="substitution".  But because E3 has substitutionGroup="E2", it
is a "potential member" of E2's substitution group, and it is
therefore transitively also a potential member of the substitution
group of E1.  Since E1 doesn't block substitution, E3 is thus not only
a potential member but also an actual member of E1's substitution
group.

E1's substitution group is the set {E2, E3}, and that is also the set
of potential members of its substitution group.

E2's (actual) substitution group is the empty set, but the set
of potential members is the singleton set {E3}.

Thus, when a content model calls for E1, we may write E1, E2, or E3.
When a content model calls for E2, we may write E2 and only E2.

To know what elements are in the substitution group of a particular
element E, and thus can be substituted for it, you must look, of
course, first at the elements which name E as their substitution group
head.  Then you must add -- not the members of their substitution
groups, but the *potential* members of their substitution groups.  It
is not substitution group membership which is transitive, but
potential substitution group membership.
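
For readers who find it easier to follow as code, the rule just
described can be sketched roughly as below.  This is only an
illustration of the explanation in this message, not the normative
algorithm of the spec: the DECLS table and the function names are
made up for the example, the only blocking modelled is
block="substitution" on the head, and everything else a real
processor must check (type derivation, abstract, final) is ignored.

```python
# Hypothetical model of the three declarations above:
#   element name -> (substitutionGroup head or None, has block="substitution")
DECLS = {
    "E1": (None, False),
    "E2": ("E1", True),
    "E3": ("E2", False),
}


def potential_members(head, decls):
    """Potential members of head's substitution group.

    These are the elements that name `head` as their substitution group
    head, plus (transitively) the *potential* members of those elements'
    groups.  Blocking is deliberately ignored at this stage.
    Assumes the affiliations contain no cycles.
    """
    members = set()
    for name, (affiliation, _blocked) in decls.items():
        if affiliation == head:
            members.add(name)
            members |= potential_members(name, decls)
    return members


def actual_members(head, decls):
    """Actual members: empty if head blocks substitution, else the potential set."""
    _affiliation, blocked = decls[head]
    return set() if blocked else potential_members(head, decls)


def allowed_where_called_for(head, decls):
    """Element names that may be written where a content model calls for head."""
    return {head} | actual_members(head, decls)
```

With the DECLS above, potential_members("E2", DECLS) is {"E3"} while
actual_members("E2", DECLS) is empty, and
allowed_where_called_for("E1", DECLS) comes out as E1, E2 and E3.
The point of the sketch is that the recursion runs over potential
membership, so E2's block affects only what may stand in for E2
itself.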

There were some WG members, when this design was proposed, who
ventured to suggest that readers and users might find it a little
confusing.  But those with more robust faith in the patience and
ability of the reader to navigate arbitrarily subtle distinctions ended
by carrying the day.

Oh.  Dear me, I promised I'd try to explain the rationale.  Er.  Ah.
Those who favored the current design argued that it would be more
useful this way, and that it would be inconvenient to prevent E3 from
being substituted for E1 merely because E2 blocked substitution.

I hope this helps.

--C. M. Sperberg-McQueen

Received on Friday, 26 January 2007 01:17:00 UTC