RE: Escalation mechanism for different interpretation of W3C XML-Schema specification ? from noah_mendelsohn@us.ibm.com on 2009-10-16 (xmlschema-dev@w3.org from October 2009)

From: <noah_mendelsohn@us.ibm.com>
Date: Fri, 16 Oct 2009 14:59:13 -0400
To: "Michael Kay" <mike@saxonica.com>
Cc: "'Henry S. Thompson'" <ht@inf.ed.ac.uk>, "'XMLSchema at XML4Pharma'" <XMLSchema@XML4Pharma.com>, xmlschema-dev@w3.org
Message-ID: <OFF825BCFE.365D602F-ON85257651.0057AAD1-85257651.00685031@lotus.com>
Michael Kay writes:

> Because the common components have been copied into three different
> namespaces, a redefine occurring in one namespace does not affect copies
> of the component in a different namespace. That's Saxon's interpretation,
> anyway. You could take the view that the "pervasiveness" of redefinition
> makes it transcend the renaming done by the chameleon include, but I
> don't.

FWIW, I don't either.  More specifically, if you'd asked me not about the 
text of the Recommendation as it came out (which I think we've established 
is somewhat self-contradictory and thus interpreted differently by 
different readers), but about what I thought we were trying to say as we 
all wrote that, I agree with you.  While my ACSOOD proposal remains very 
incomplete and has a variety of problems, I think it does more or less 
signal my thinking about questions like this [1]. 

BTW: this proposal gets referenced from time to time, and as far as I can
tell the only copy remains in member-only space.  If there's a way to do
it without inconveniencing the chair or working group members, I will try
to get permission to post another public copy, perhaps in the W3C public
archives.  I think it's useful for references like this to be public when
practical.  (I am not proposing to start active discussion of it now, but
every few years an email thread like this pops up, and I find it
inconvenient to have links that can't be read outside of W3C.)  I'll ask
on the working group's list, which is the right place to do it.  In the
meantime, apologies to readers outside the WG who can't see it.  FYI, we
were (in 2004!) making a major effort to clarify the composition story for
XSD 1.1.  The paper referenced at [1] was an experimental attempt by me to
create a design that would capture what I thought we intended in XSD 1.0.
I think there are some good ideas in it, but it's also incomplete and has
a number of problems.  Other important proposals were made by other
working group members, and many months were spent trying to extract from
all of this a story that would garner consensus as a better explanation
than the one we had in XSD 1.0; to a significant degree we failed.  So, I
refer to this only because with respect to issues like the ones Mike
raises, some of my thinking is captured in more detail at [1].

Noah

[1] 
http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Jul/att-0004/CompositionArchitecture.html


--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------








"Michael Kay" <mike@saxonica.com>
Sent by: xmlschema-dev-request@w3.org
10/16/2009 09:29 AM
 
        To:     "'Henry S. Thompson'" <ht@inf.ed.ac.uk>
        cc:     "'XMLSchema at XML4Pharma'" <XMLSchema@XML4Pharma.com>,
                <xmlschema-dev@w3.org>, (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        RE: Escalation mechanism for different
                interpretation of W3C XML-Schema specification ?


> Hmmm -- let's leave redefine aside, as we've agreed to differ 
> on that before, but I'm surprised you recommend against 
> chameleon include.  I find it hugely useful (for those little 
> bits that you use all the time but aren't worth putting in a 
> namespace) and am not aware of any interop problems with it. . .

I've been working on a bug report relating to a Dutch government schema
published at 

http://standaarden.overheid.nl/vac/1.1/xsd/vac.xsd

which makes extensive use of redefines and chameleon include. I've fixed
the bug that stopped this working under Saxon 9.2, but I would make some
observations on the schema, which are I think rather pertinent to this
thread.

There's a cluster of three no-namespace schema documents (overheid-types,
overheid-classes, overheid-schemes) that are chameleon-included into three
different namespaces (short names /vac/, /dc/terms/, and /owms/terms/).
The effect of this is to create three near-identical copies of each of the
components defined in these three schema documents, one in each namespace.
This means that as far as XSLT and XQuery are concerned,
elements/attributes defined in terms of these types will be unrelated to
each other in the type hierarchy, which means that writing schema-aware
stylesheets and queries is likely to be very confusing. This is probably
one of the main reasons I'm not a fan of chameleon include.
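
The copying effect described above can be sketched in a few lines of Python. This is only a toy model, not how any schema processor represents components: the component name `codeType` is hypothetical (the real names in the overheid documents are not given here), and `chameleon_include` is an illustrative helper, not a real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QName:
    """A component name: target namespace plus local name."""
    namespace: str
    local: str


def chameleon_include(local_names, target_namespace):
    """Model a chameleon include: every no-namespace component is
    copied into the namespace of the including schema document."""
    return {QName(target_namespace, name) for name in local_names}


# Hypothetical component standing in for something defined in COMMON.
common = ["codeType"]
namespaces = ["/vac/", "/dc/terms/", "/owms/terms/"]

copies = set()
for ns in namespaces:
    copies |= chameleon_include(common, ns)

# One copy per including namespace, and no two copies are the same component.
distinct = len(copies)
same_type = QName("/vac/", "codeType") == QName("/dc/terms/", "codeType")
```

Under these assumptions `distinct` is 3 and `same_type` is `False`: the copies share a local name and a definition but are separate components, which is why they are unrelated in the type hierarchy.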

But that's not all: the schema also makes heavy use of redefines.
Specifically, if we call this no-namespace cluster COMMON, we have the
structure (slightly simplified to capture the essence):

Namespace /vac/
vac.xsd includes COMMON
vac.xsd imports owms-classes-redef.xsd
vac.xsd imports overheid-classes-redef.xsd

Namespace /dc/terms/
owms-classes-redef.xsd redefines dcterms-elem.xsd
dcterms-elem.xsd includes COMMON

Namespace /owms/terms/
overheid-classes-redef.xsd redefines owms.xsd
owms.xsd includes COMMON
owms.xsd imports dcterms-elem.xsd

Note that dcterms-elem.xsd is reachable from vac.xsd via one route that
contains a "redefines" step, and by another route that omits this step (but
which does contain a different redefines step). This is where the
interpretation of "pervasiveness" is critical: Saxon takes the view that
all references to components that have been redefined are references to the
post-redefinition component. In fact the rule introduced in Saxon 9.2
(whose incorrect implementation caused the bug) is that every component has
a redefinition level, so if A redefines B and B redefines C then a given
component may have redefinition levels of 2, 1, and 0; all references to a
component name are taken as references to the highest available
redefinition level, and if there are two different components at the
highest redefinition level, it's an error (for example, A redefines C, and
B also redefines C). There's nothing at all in the spec to justify these
rules, but it's the only way I could find of handling complex redefinition
lattices that seemed to make sense.
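
A minimal Python sketch of the redefinition-level rule as stated above, for illustration only. It is not Saxon's code: the flat list of `(name, level, document)` tuples, the `resolve` function, and the `AmbiguousRedefinition` exception are all assumptions made for the sketch, and the component name `T` is hypothetical.

```python
class AmbiguousRedefinition(Exception):
    """Two different components share the highest redefinition level."""


def resolve(name, versions):
    """Return the source document of the version that references to `name` use.

    `versions` is a list of (component_name, redefinition_level,
    source_document) tuples; this representation is an assumption.
    """
    candidates = [v for v in versions if v[0] == name]
    if not candidates:
        raise KeyError(name)
    # References always bind to the highest available redefinition level.
    top = max(level for _, level, _ in candidates)
    winners = [doc for _, level, doc in candidates if level == top]
    if len(winners) > 1:
        raise AmbiguousRedefinition(f"{name}: level {top} defined in {winners}")
    return winners[0]


# A redefines B, and B redefines C: levels 2, 1 and 0.
chain = [("T", 0, "C.xsd"), ("T", 1, "B.xsd"), ("T", 2, "A.xsd")]
chosen = resolve("T", chain)

# A redefines C, and B also redefines C: two candidates at level 1.
fork = [("T", 0, "C.xsd"), ("T", 1, "A.xsd"), ("T", 1, "B.xsd")]
try:
    resolve("T", fork)
    fork_is_error = False
except AmbiguousRedefinition:
    fork_is_error = True
```

In the chain case the reference binds to the level-2 version from A.xsd; in the fork case the two level-1 versions conflict and the sketch raises an error, matching the example in the paragraph above.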

But the chameleon includes interfere with this (perhaps deliberately).
Because the common components have been copied into three different
namespaces, a redefine occurring in one namespace does not affect copies
of the component in a different namespace. That's Saxon's interpretation,
anyway. You could take the view that the "pervasiveness" of redefinition
makes it transcend the renaming done by the chameleon include, but I
don't.

In experimenting further with this schema, I discovered that if the two
imports from vac.xsd are reversed in order, the import of
owms-classes-redef.xsd has no effect, because it is then importing a
namespace that is already known to the processor; Saxon ignores the
schemaLocation URI in this case. So the schema document
owms-classes-redef.xsd, and the redefinitions that it contains, are simply
ignored. This makes the situation very fragile: reversing the order of
imports does not make the processing fail, it just silently compiles a
different schema. I think this reinforces Henry's argument that if you're
going to redefine, then there should be one redefining document for each
namespace, which acts as a gateway to that namespace, and no other
includes/imports/redefines from elsewhere in the schema should bypass this
gateway. This schema breaks this rule, and gets away with it only because
the gateway document is encountered before the bypassing document.
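
The order dependence can be reproduced with a toy model of the simplified structure listed earlier. This is a sketch under stated assumptions, not Saxon's implementation: the only rule taken from the text is that an import for an already-known namespace is skipped; the `loaded_documents` function and its tuple layout are illustrative, and the includes of COMMON are left out.

```python
def loaded_documents(vac_imports):
    """Return the schema documents read, in order, starting from vac.xsd.

    Each entry maps a document to (target namespace, redefines, imports).
    """
    docs = {
        "vac.xsd": ("/vac/", [], vac_imports),
        "owms-classes-redef.xsd": ("/dc/terms/", ["dcterms-elem.xsd"], []),
        "dcterms-elem.xsd": ("/dc/terms/", [], []),
        "overheid-classes-redef.xsd": ("/owms/terms/", ["owms.xsd"], []),
        "owms.xsd": ("/owms/terms/", [], ["dcterms-elem.xsd"]),
    }
    known, loaded = set(), []

    def load(name):
        namespace, redefines, imports = docs[name]
        known.add(namespace)
        loaded.append(name)
        for target in redefines:
            load(target)  # a redefined document is always read
        for target in imports:
            # Assumed rule: if the namespace is already known, the
            # schemaLocation is ignored and the document is never read.
            if docs[target][0] not in known:
                load(target)

    load("vac.xsd")
    return loaded


original = loaded_documents(
    ["owms-classes-redef.xsd", "overheid-classes-redef.xsd"])
reversed_order = loaded_documents(
    ["overheid-classes-redef.xsd", "owms-classes-redef.xsd"])
redef_dropped = "owms-classes-redef.xsd" not in reversed_order
```

With the original order all five documents are read. With the imports reversed, owms.xsd pulls in dcterms-elem.xsd first, /dc/terms/ becomes known, and owms-classes-redef.xsd is skipped without any error being raised.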

Michael Kay
Saxonica
Received on Friday, 16 October 2009 19:00:04 UTC