RE: Escalation mechanism for different interpretation of W3C XML-Schema specification ?

> Hmmm -- let's leave redefine aside, as we've agreed to differ 
> on that before, but I'm surprised you recommend against 
> chameleon include.  I find it hugely useful (for those little 
> bits that you use all the time but aren't worth putting in a 
> namespace) and am not aware of any interop problems with it. . .

I've been working on a bug report relating to a Dutch government schema
published at 

http://standaarden.overheid.nl/vac/1.1/xsd/vac.xsd

which makes extensive use of redefines and chameleon include. I've fixed the
bug that stopped this working under Saxon 9.2, but I would make some
observations on the schema, which are I think rather pertinent to this
thread.

There's a cluster of three no-namespace schema documents (overheid-types,
overheid-classes, overheid-schemes) that are chameleon-included into three
different namespaces (short names /vac/, /dc/terms/, and /owms/terms/). The
effect of this is to create three near-identical copies of each of the
components defined in these three schema documents, one in each namespace.
This means that as far as XSLT and XQuery are concerned, elements/attributes
defined in terms of these types will be unrelated to each other in the type
hierarchy, which means that writing schema-aware stylesheets and queries is
likely to be very confusing. This is probably one of the main reasons I'm
not a fan of chameleon include.
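
To make the copying effect concrete, here is a minimal sketch in Python. It
is only a toy model of expanded names, not how any schema processor is
actually implemented, and the local name "codeType" is a hypothetical
stand-in for a component defined in the no-namespace cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QName:
    """An expanded name: namespace plus local name."""
    namespace: str
    local: str


def chameleon_include(local_names, target_namespace):
    """Return the names that no-namespace components acquire when they are
    chameleon-included into target_namespace."""
    return {QName(target_namespace, local) for local in local_names}


# Hypothetical component from the no-namespace cluster.
COMMON = ["codeType"]

# The three including namespaces (short names as used in this message).
NAMESPACES = ["/vac/", "/dc/terms/", "/owms/terms/"]

schema_components = set()
for ns in NAMESPACES:
    schema_components |= chameleon_include(COMMON, ns)
```

One source declaration ends up as three distinct components, one per
namespace, and nothing in the type hierarchy links them to each other.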

But that's not all: the schema also makes heavy use of redefines.
Specifically, if we call this no-namespace cluster COMMON, we have the
structure (slightly simplified to capture the essence):

Namespace /vac/
vac.xsd includes COMMON
vac.xsd imports owms-classes-redef.xsd
vac.xsd imports overheid-classes-redef.xsd

Namespace /dc/terms/
owms-classes-redef.xsd redefines dcterms-elem.xsd
dcterms-elem.xsd includes COMMON

Namespace /owms/terms/
overheid-classes-redef.xsd redefines owms.xsd
owms.xsd includes COMMON
owms.xsd imports dcterms-elem.xsd

Note that dcterms-elem.xsd is reachable from vac.xsd via one route that
contains a "redefines" step, and by another route that omits this step (but
which does contain a different redefines step). This is where the
interpretation of "pervasiveness" is critical: Saxon takes the view that all
references to components that have been redefined are references to the
post-redefinition component. In fact the rule introduced in Saxon 9.2 (whose
incorrect implementation caused the bug) is that every component has a
redefinition level, so if A redefines B and B redefines C then a given
component may have redefinition levels of 2, 1, and 0; all references to a
component name are taken as references to the highest available redefinition
level, and if there are two different components at the highest redefinition
level, it's an error (for example, A redefines C, and B also redefines C).
There's nothing at all in the spec to justify these rules, but it's the only
way I could find of handling complex redefinition lattices that seemed to
make sense.
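
The redefinition-level rule can be sketched as follows. This is an
illustration of the rule as stated above, not Saxon's actual code; the
function name, the exception class, and the component labels ("T in A" and
so on) are all made up for the example.

```python
class AmbiguousRedefinition(Exception):
    """Two different components share the highest redefinition level."""


def resolve(name, candidates):
    """Pick the component that every reference to `name` binds to.

    `candidates` is a list of (redefinition_level, component) pairs for
    one component name.
    """
    top = max(level for level, _ in candidates)
    winners = {component for level, component in candidates if level == top}
    if len(winners) > 1:
        raise AmbiguousRedefinition(
            f"{name}: {sorted(winners)} all at level {top}")
    return winners.pop()


# A redefines B and B redefines C: levels 2, 1 and 0.
chain = [(0, "T in C"), (1, "T in B"), (2, "T in A")]

# A redefines C, and B also redefines C: two components at level 1.
lattice = [(0, "T in C"), (1, "T in A"), (1, "T in B")]
```

With these inputs, resolve("T", chain) selects the level-2 component, while
resolve("T", lattice) raises AmbiguousRedefinition.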

But the chameleon includes interfere with this (perhaps deliberately).
Because the common components have been copied into three different
namespaces, a redefine occurring in one namespace does not affect copies of
the component in a different namespace. That's Saxon's interpretation,
anyway. You could take the view that the "pervasiveness" of redefinition
makes it transcend the renaming done by the chameleon include, but I don't.

In experimenting further with this schema, I discovered that if the two
imports from vac.xsd are reversed in order, the import of
owms-classes-redef.xsd has no effect, because it is then importing a
namespace that is already known to the processor; Saxon ignores the
schemaLocation URI in this case. So the schema document
owms-classes-redef.xsd, and the redefinitions that it contains, are simply
ignored. This makes the situation very fragile: reversing the order of
imports does not make the processing fail, it just silently compiles a
different schema. I think this reinforces Henry's argument that if you're
going to redefine, then there should be one redefining document for each
namespace, which acts as a gateway to that namespace, and no other
includes/imports/redefines from elsewhere in the schema should bypass this
gateway. This schema breaks this rule, and gets away with it only because the
gateway document is encountered before the bypassing document.
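
The order dependence can be reproduced with a toy model of the simplified
structure given earlier. This is a sketch under assumptions, not Saxon's
implementation: it assumes documents are processed depth-first in document
order, that an import of an already-known namespace is skipped, and that
includes and redefines are always followed. The COMMON includes are omitted
because they do not affect the outcome.

```python
# document -> (target namespace, [(kind, referenced document), ...])
DOCS = {
    "vac.xsd": ("/vac/", [
        ("import", "owms-classes-redef.xsd"),
        ("import", "overheid-classes-redef.xsd"),
    ]),
    "owms-classes-redef.xsd": ("/dc/terms/",
                               [("redefine", "dcterms-elem.xsd")]),
    "dcterms-elem.xsd": ("/dc/terms/", []),
    "overheid-classes-redef.xsd": ("/owms/terms/",
                                   [("redefine", "owms.xsd")]),
    "owms.xsd": ("/owms/terms/", [("import", "dcterms-elem.xsd")]),
}


def load(docs, root):
    """Return the schema documents actually read, in the order read."""
    loaded = []
    known_namespaces = set()

    def visit(name):
        if name in loaded:
            return
        namespace, refs = docs[name]
        loaded.append(name)
        known_namespaces.add(namespace)
        for kind, target in refs:
            # An import of a namespace that is already known is skipped:
            # the schemaLocation is never dereferenced.
            if kind == "import" and docs[target][0] in known_namespaces:
                continue
            visit(target)

    visit(root)
    return loaded


def reverse_imports(docs, name):
    """Copy of docs with the references in document `name` reversed."""
    namespace, refs = docs[name]
    changed = dict(docs)
    changed[name] = (namespace, list(reversed(refs)))
    return changed
```

In this model, load(DOCS, "vac.xsd") reads all five documents, whereas
load(reverse_imports(DOCS, "vac.xsd"), "vac.xsd") never reads
owms-classes-redef.xsd: by the time that import is reached, /dc/terms/ is
already known via owms.xsd, so the redefinitions are dropped without any
error.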

Michael Kay
Saxonica

Received on Friday, 16 October 2009 13:30:19 UTC