Re: How to scope the note about D and override(E,D)

[second of two replies]
C. M. Sperberg-McQueen writes:

> But in the general case, if we have a document cycle D1, D2, ... Dn,
> D1 where D1 contains an override element E1, and we are trying to
> calculate schema(D1), it will often happen that the target set of E1
> will include not override(E1, D1) but override(E', D1).  This will
> happen if any of the override elements in D2 ... Dn have children
> distinct from every child of E1.  E' will be the result of overlaying
> E2 with E1, overlaying E3 with that result, overlaying E4 with that,
> etc. until you have overlaid the override element of Dn with the
> result of all the previous overlays.

I think it's important that we address this case explicitly.
> . . .

> As for the more general case, I think it is covered by the general
> rule specified in 4.2.3 concerning 'include' elements:
>
>     If [two include elements] specify different schema locations, 
>     then they refer to different schema documents, unless the 
>     implementation is able to determine that the two URIs are 
>     references to the same resource.
>
> Here the spec explicitly gives permission to a processor to determine
> that two schema-document references are references to the same schema
> document.

I thought we added this to cover the redirect case! 

> I think it follows automatically that a processor is allowed to
> detect a situation where (a) a schema document D contains an
> override element E, and the target set of E includes not
> override(E,D) but override(E',D), and (b) the resource denoted by
> override(E',D) is "the same resource" as D.  (At this point, the
> ineffable mysteries of web architecture block us from further
> reasoning.)

I think that's asking a lot of our readers . . .

> And once I have detected that D and override(E', D) are "the same
> resource", I am allowed to snip the cycle, by (a) the rules governing
> include and (b) the statement in the spec that the meaning of an
> override of a schema document SD is the same as that of an include for
> a different schema document override(E,SD).

I would prefer to be unequivocal about all this in an explicit way.
Algorithm O achieves this, and I think we're now close to being able to
backport it, as it were, to the kind of algebraic formulation you
would prefer.

Let's look at how algorithm O [corrected version attached] deals with the
problem of detecting the relation between E' and E, that is, between
two possibly identical or overlapping sets of overrides.

O articulates an additional level of detail in what your algorithm
(call it F) calls schema document designators.  Rather than the pair
of (document URI, override element), packaged as "override(E,D)" which
F uses, O uses (document URI, a _set_ of what it calls 'markers').
Markers amount to SCDs for the _children_ of an override element.  This
allows O to not only detect and handle cases where, as in the examples
discussed already in this thread, "E = E'", but also cases of subset,
superset and non-empty intersection between E and E'.

This is because a marker is just a tuple of strings, and so can be
compared for equality with another marker without invoking any theory
of component identity.
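
To make that concrete, here is a rough sketch in Python. It is an
illustration only, not the text of algorithm O: the three-field marker
shape (component kind, namespace, local name), the helper name
'relation' and the sample namespace are all assumptions made for the
example.

```python
from typing import FrozenSet, Tuple

# Assumed marker shape: (component kind, target namespace, local name).
# The only property relied on is that a marker is a tuple of strings.
Marker = Tuple[str, str, str]


def relation(e: FrozenSet[Marker], e_prime: FrozenSet[Marker]) -> str:
    """Classify how two marker sets relate, using plain set comparison."""
    if e == e_prime:
        return "equal"
    if e < e_prime:
        return "subset"
    if e > e_prime:
        return "superset"
    if e & e_prime:
        return "overlap"
    return "disjoint"


ns = "http://example.org/ns"  # made-up namespace for the sketch
E = frozenset({("simpleType", ns, "zuluDate")})
E_prime = frozenset({("simpleType", ns, "zuluDate"), ("element", ns, "doc")})

print(relation(E, E_prime))  # subset
```

Everything here is ordinary tuple and set comparison on strings; no
schema components are constructed or compared.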

O exploits this in two ways:

 1) If an override needs to be processed whose target has already been
    processed with a superset of the required markers, the override can
    be ignored;

 2) If a given schema document D is involved via more than one path of
    overrides, actually constructing schema(D) can be done efficiently
    by taking the union of all the markers which apply to it.
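
The two points above might be sketched as follows, again in Python and
again only as an illustration under assumptions: the class
'OverrideTracker', its method names, the document name 'd2.xsd' and
the marker tuples are hypothetical and are not taken from algorithm O.

```python
from typing import Dict, Iterable, Set, Tuple

Marker = Tuple[str, ...]  # assumed: a marker is a tuple of strings


class OverrideTracker:
    """Hypothetical bookkeeping: document URI -> markers applied so far."""

    def __init__(self) -> None:
        self.applied: Dict[str, Set[Marker]] = {}

    def can_ignore(self, uri: str, markers: Iterable[Marker]) -> bool:
        # Point 1: the target was already processed with a superset
        # of the markers this override requires.
        return uri in self.applied and set(markers) <= self.applied[uri]

    def record(self, uri: str, markers: Iterable[Marker]) -> None:
        # Point 2: accumulate the union of markers reaching this
        # document along every override path.
        self.applied.setdefault(uri, set()).update(markers)


tracker = OverrideTracker()
a = ("simpleType", "", "zuluDate")
b = ("element", "", "doc")

tracker.record("d2.xsd", [a])        # reached via one path
tracker.record("d2.xsd", [b])        # reached again via another path
print(tracker.can_ignore("d2.xsd", [a]))      # True
print(tracker.can_ignore("d2.xsd", [a, ("complexType", "", "t")]))  # False
```

A document that has never been reached is never ignorable in this
sketch, even for an empty marker set, since it still has to be
processed once.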

As a litmus test of the alleged value of this, I'm curious what you
believe to be the case wrt the test over024, which its contributor
Mike Kay classifies as 'invalid'.  Algorithm O classifies it as valid
(as far as override is concerned, that is - I'm assuming zuludate was
meant to be zuluDate, but that's irrelevant as regards the issue at
hand), with two components:

 element 'doc' of type zuludate
 simpleType 'zuluDate' a restriction of xs:time

I think I know what algorithm F says, but I'd like to hear your take
first. . .

A third example follows in the next message.

ht
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]

Received on Monday, 14 March 2011 14:26:45 UTC