Re: How to scope the note about D and override(E,D)

On Mar 14, 2011, at 8:24 AM, Henry S. Thompson wrote:

> [second of two replies]
> C. M. Sperberg-McQueen writes:
> ...
>> As for the more general case, I think it is covered by the general
>> rule specified in 4.2.3 concerning 'include' elements:
>> 
>>    If [two include elements] specify different schema locations, 
>>    then they refer to different schema documents, unless the 
>>    implementation is able to determine that the two URIs are 
>>    references to the same resource.
>> 
>> Here the spec explicitly gives permission to a processor to determine
>> that two schema-document references are references to the same schema
>> document.
> 
> I thought we added this to cover the redirect case! 

And?

We added it.  It says that if a schema processor can determine
that X and Y are the same resource, then the processor may
regard X and Y as being the same schema document.

That rule has consequences.  Those consequences include
(a) the possibility of different results from different processors
and (b) the possibility that a processor may detect identity
of resources in cases where the inclusion in question is 
pointing to the output of the override transform.
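
Consequence (a) can be illustrated with a minimal sketch in Python.
The two comparison strategies and the example URIs below are
assumptions made for illustration; the spec does not prescribe either
strategy, and a real processor might use something else entirely
(following redirects, for instance).

```python
from urllib.parse import urlsplit
import posixpath


def same_by_string(x, y):
    # Hypothetical processor 1: two schema locations denote the same
    # document only if the URI strings are identical.
    return x == y


def same_by_normalization(x, y):
    # Hypothetical processor 2: compare URIs after lowercasing the
    # scheme and host and removing "." and ".." path segments.
    def norm(uri):
        parts = urlsplit(uri)
        return (parts.scheme.lower(),
                parts.netloc.lower(),
                posixpath.normpath(parts.path or "/"))
    return norm(x) == norm(y)


# Illustrative URIs only.
x = "http://example.org/schemas/base.xsd"
y = "HTTP://Example.ORG/schemas/./base.xsd"

literal_result = same_by_string(x, y)
normalized_result = same_by_normalization(x, y)
```

Under these assumptions the first processor treats x and y as two
schema documents and the second treats them as one, so the two may
legitimately produce different results.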

> 
>> I think it follows automatically that a processor is allowed to
>> detect a situation where (a) a schema document D contains an
>> override element E, and the target set of E includes not
>> override(E,D) but override(E',D), and (b) the resource denoted by
>> override(E',D) is "the same resource" as D.  (At this point, the
>> ineffable mysteries of web architecture block us from further
>> reasoning.)
> 
> I think that's asking a lot of our readers . . .

Our spec does ask a lot of our readers.  

I wish it were easier to follow and that reading it required less
skill in casuistry.  But nothing we do short of starting work on 
XSD 2.0 from a blank sheet of paper can possibly change that
in any serious way. 

I'm happy to make some of the necessary implications of 
our rule about schema document identity more explicit, if
members of the WG or other readers of this list will help by
making clear which of the implications of that rule are (a) not
clear on relatively careful reading and (b) useful in understanding
how cycles of schema document reference
are to be handled.

> 
> Let's look how algorithm O [corrected version attached] deals with the
> problem of detecting the relation between E' and E, that is, between
> two possibly identical or overlapping sets of overrides.
> 
> O articulates an additional level of detail in what your algorithm
> (call it F) calls schema document designators.  Rather than the pair
> of (document URI, override element), packaged as "override(E,D)" which
> F uses, O uses (document URI, a _set_ of what it calls 'markers').
> Markers amount to SCDs for the _children_ of an override element.  This
> allows O to not only detect and handle cases where, as in the examples
> discussed already in this thread, "E = E'", but also cases of subset,
> superset and non-empty intersection between E and E'.


Does this work in the general case?  I would expect that it would not
suffice to know that type T is being overridden; it would also be
necessary to know what the new definition is.


> This is because a marker is just a tuple of strings, and so can be
> compared for equality with another marker without invoking any theory
> of component identity.

I don't see any appeal to component identity in the current
design.  At most there is an appeal to element equivalence.


> 
> O exploits this in two ways:
> 
> 1) If an override needs to be processed whose target has already been
>    processed with superset of the required markers, the override can
>    be ignored;

I think there are two problems with this behavior.

1 If documents A and B each override C, with elements E1 and E2
respectively, and E1 and E2 provide declarations for type 
T (and nothing else), the rule you just stated suggests we can ignore
the override of C by B, if we saw A first.  And we can ignore the
override of C by A, if we saw B first.

But the spec provides no rules saying in what order schema documents
must be processed; the explicit order dependencies it already has are
a mistake, and adding new ones now is not a good idea.

2 Whenever the declarations E1 and E2 provide for T are in
conflict, the rule just stated resolves the conflict in favor of one
or the other.  I think the correct answer (certainly, the answer we
chose in our phase-1 discussions of bug 6021) is that an error
should result.
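
A minimal sketch in Python may make the two problems concrete.  It
models only the paraphrase quoted above, not algorithm O itself.  The
data model (a marker as a tuple of strings, a definition as a string)
and the function names are assumptions made for illustration.

```python
def process_skip_if_covered(overrides):
    # Paraphrased rule 1: ignore an override whose target has already
    # been processed with a superset of the required markers.
    seen = {}     # target document -> markers already applied to it
    applied = {}  # (target, marker) -> definition that took effect
    for source, target, decls in overrides:
        markers = set(decls)
        if markers <= seen.get(target, set()):
            continue  # the whole override is ignored
        seen.setdefault(target, set()).update(markers)
        for marker, definition in decls.items():
            applied.setdefault((target, marker), definition)
    return applied


def process_error_on_conflict(overrides):
    # Alternative: conflicting definitions for one marker are an error.
    applied = {}
    for source, target, decls in overrides:
        for marker, definition in decls.items():
            key = (target, marker)
            if key in applied and applied[key] != definition:
                raise ValueError("conflicting overrides for %r" % (key,))
            applied[key] = definition
    return applied


T = ("type", "T")
from_a = ("A", "C", {T: "definition from E1"})
from_b = ("B", "C", {T: "definition from E2"})

a_first = process_skip_if_covered([from_a, from_b])
b_first = process_skip_if_covered([from_b, from_a])
```

In this model a_first and b_first disagree about which definition of
T applies to C, which is the order dependency of point 1, and neither
run reports the conflict of point 2.  The second function raises an
error on the same input regardless of order.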


> 
> 2) If a given schema document D is involved via more than one path of
>    overrides, actually constructing schema(D) can be done efficiently
>    by taking the union of all the markers which apply to it.

That suggests that if A and B each override C, with E1 providing a 
new declaration of type T1 and E2 providing a new declaration of T2,
then the result should be the same as if they agreed and both
E1 and E2 overrode both T1 and T2, in the same way.  

That would be a dramatic change from the status quo design, in 
which A is taken to want C's original declaration of T2 and B is
taken to want C's original declaration of T1.

I don't see the motivation for such a rule.
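
The difference between the two readings can be sketched in Python.
The dictionaries standing in for schema documents and the function
names are assumptions made for illustration; neither function is
taken from algorithm O or from the spec.

```python
def union_view(original, overrides):
    # Paraphrased rule 2: build one version of C with the union of
    # all overrides applied to it.
    merged = dict(original)
    for decls in overrides.values():
        merged.update(decls)
    return merged


def per_path_views(original, overrides):
    # Status quo as described above: each overriding document sees C
    # with only its own override applied.
    return {source: {**original, **decls}
            for source, decls in overrides.items()}


original_c = {"T1": "C's T1", "T2": "C's T2"}
overrides = {
    "A": {"T1": "E1's T1"},  # A overrides only T1
    "B": {"T2": "E2's T2"},  # B overrides only T2
}

merged = union_view(original_c, overrides)
views = per_path_views(original_c, overrides)
```

Here merged carries both new declarations, while views["A"] keeps C's
original T2 and views["B"] keeps C's original T1.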

It's possible that the paraphrases you've just given glide over some
details of the algorithm and that the problems lie not in the algorithm
but in the paraphrases.  I will try to study the algorithm later, but
so far I'm stuck on the first sentence, which says that our goal is
to construct a tree with no duplicate leaves.   The implicit suggestion
that a tree might have duplicate leaves persuades me that I must
be missing something here -- either 'tree' or 'duplicate' must mean
something I don't understand.  

> 
> As a litmus test of the alleged value of this, I'm curious what you
> believe to be the case wrt the test over024, ...
> A third example follows in the next message.
> 

I'll respond to these separately.

Michael

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com 
* http://cmsmcq.com/mib                 
* http://balisage.net
****************************************************************

Received on Monday, 14 March 2011 17:57:49 UTC