- From: Michael Kay <mike@saxonica.com>
- Date: Fri, 16 Oct 2009 14:29:34 +0100
- To: "'Henry S. Thompson'" <ht@inf.ed.ac.uk>
- Cc: "'XMLSchema at XML4Pharma'" <XMLSchema@XML4Pharma.com>, <xmlschema-dev@w3.org>
> Hmmm -- let's leave redefine aside, as we've agreed to differ
> on that before, but I'm surprised you recommend against
> chameleon include. I find it hugely useful (for those little
> bits that you use all the time but aren't worth putting in a
> namespace) and am not aware of any interop problems with it.

I've been working on a bug report relating to a Dutch government schema published at http://standaarden.overheid.nl/vac/1.1/xsd/vac.xsd which makes extensive use of redefines and chameleon include. I've fixed the bug that stopped this working under Saxon 9.2, but I would make some observations on the schema, which I think are rather pertinent to this thread.

There's a cluster of three no-namespace schema documents (overheid-types, overheid-classes, overheid-schemes) that are chameleon-included into three different namespaces (short names /vac/, /dc/terms/, and /owms/terms/). The effect of this is to create three near-identical copies of each of the components defined in these three schema documents, one in each namespace. This means that as far as XSLT and XQuery are concerned, elements/attributes defined in terms of these types will be unrelated to each other in the type hierarchy, so writing schema-aware stylesheets and queries is likely to be very confusing. This is probably one of the main reasons I'm not a fan of chameleon include.

But that's not all: the schema also makes heavy use of redefines.
Specifically, if we call this no-namespace cluster COMMON, we have the structure (slightly simplified to capture the essence):

    Namespace /vac/
        vac.xsd includes COMMON
        vac.xsd imports owms-classes-redef.xsd
        vac.xsd imports overheid-classes-redef.xsd

    Namespace /dc/terms/
        owms-classes-redef.xsd redefines dcterms-elem.xsd
        dcterms-elem.xsd includes COMMON

    Namespace /owms/terms/
        overheid-classes-redef.xsd redefines owms.xsd
        owms.xsd includes COMMON
        owms.xsd imports dcterms-elem.xsd

Note that dcterms-elem.xsd is reachable from vac.xsd via one route that contains a "redefines" step, and by another route that omits this step (but which does contain a different redefines step). This is where the interpretation of "pervasiveness" is critical: Saxon takes the view that all references to components that have been redefined are references to the post-redefinition component.

In fact the rule introduced in Saxon 9.2 (whose incorrect implementation caused the bug) is that every component has a redefinition level, so if A redefines B and B redefines C then a given component may have redefinition levels of 2, 1, and 0. All references to a component name are taken as references to the highest available redefinition level, and if there are two different components at the highest redefinition level, it's an error (for example, A redefines C, and B also redefines C). There's nothing at all in the spec to justify these rules, but it's the only way I could find of handling complex redefinition lattices that seemed to make sense.

But the chameleon includes interfere with this (perhaps deliberately). Because the common components have been copied into three different namespaces, a redefine occurring in one namespace does not affect copies of the component in a different namespace. That's Saxon's interpretation, anyway. You could take the view that the "pervasiveness" of redefinition makes it transcend the renaming done by the chameleon include, but I don't.
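The redefinition-level rule and its interaction with chameleon include can be sketched roughly as follows. This is an illustrative model only, not Saxon's implementation: the `Component` class, the `resolve` function, and the component name `someType` are all hypothetical, and components are simply keyed by (namespace, local name).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    namespace: str   # target namespace after any chameleon renaming
    local_name: str
    level: int       # 0 = original, 1 = redefined once, 2 = redefined twice, ...
    source: str      # schema document that supplied this version


class AmbiguousRedefinition(Exception):
    """Two different components share the highest redefinition level."""


def resolve(components, namespace, local_name):
    """Return the version of a component that a reference binds to.

    Every reference binds to the highest available redefinition level;
    two distinct components at that level are reported as an error.
    """
    matches = {c for c in components
               if c.namespace == namespace and c.local_name == local_name}
    if not matches:
        raise KeyError((namespace, local_name))
    top = max(c.level for c in matches)
    winners = [c for c in matches if c.level == top]
    if len(winners) > 1:
        raise AmbiguousRedefinition((namespace, local_name, top))
    return winners[0]


# Hypothetical type "someType" from COMMON, chameleon-included into three
# namespaces, then redefined once in /dc/terms/ only.
components = [
    Component("/vac/", "someType", 0, "vac.xsd"),
    Component("/dc/terms/", "someType", 0, "dcterms-elem.xsd"),
    Component("/dc/terms/", "someType", 1, "owms-classes-redef.xsd"),
    Component("/owms/terms/", "someType", 0, "owms.xsd"),
]

print(resolve(components, "/dc/terms/", "someType").source)
print(resolve(components, "/vac/", "someType").level)
```

In this model the redefinition in /dc/terms/ wins for references in that namespace, while the /vac/ and /owms/terms/ copies stay at level 0, which mirrors the interpretation that a redefine does not reach across the renaming done by a chameleon include.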
In experimenting further with this schema, I discovered that if the two imports from vac.xsd are reversed in order, the import of owms-classes-redef.xsd has no effect, because it is then importing a namespace that is already known to the processor; Saxon ignores the schemaLocation URI in this case. So the schema document owms-classes-redef.xsd, and the redefinitions that it contains, are simply ignored. This makes the situation very fragile: reversing the order of imports does not make the processing fail, it just silently compiles a different schema.

I think this reinforces Henry's argument that if you're going to redefine, then there should be one redefining document for each namespace, which acts as a gateway to that namespace, and no other includes/imports/redefines from elsewhere in the schema should bypass this gateway. This schema breaks this rule, and gets away with it only because the gateway document is encountered before the bypassing document.

Michael Kay
Saxonica
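The order sensitivity described above can be reproduced with a small simulation. It is a sketch under simplifying assumptions, not a description of Saxon's code: the COMMON includes are left out, `build_docs` and `load_order` are hypothetical helpers, and the only rule modelled is "an import of an already-known namespace is skipped, while redefine always loads its target".

```python
def build_docs(vac_imports):
    """Map each schema document to (target namespace, directives).

    vac_imports is the ordered list of documents imported by vac.xsd.
    """
    return {
        "vac.xsd": ("/vac/", [("import", d) for d in vac_imports]),
        "owms-classes-redef.xsd": ("/dc/terms/",
                                   [("redefine", "dcterms-elem.xsd")]),
        "dcterms-elem.xsd": ("/dc/terms/", []),
        "overheid-classes-redef.xsd": ("/owms/terms/",
                                       [("redefine", "owms.xsd")]),
        "owms.xsd": ("/owms/terms/", [("import", "dcterms-elem.xsd")]),
    }


def load_order(docs, root):
    """Return the documents actually loaded, in the order they are read."""
    known_namespaces = set()
    loaded = []

    def load(name):
        if name in loaded:
            return
        namespace, directives = docs[name]
        known_namespaces.add(namespace)
        loaded.append(name)
        for kind, target in directives:
            # Assumed rule: an import whose namespace is already known is
            # ignored, so its schemaLocation is never dereferenced.
            if kind == "import" and docs[target][0] in known_namespaces:
                continue
            load(target)

    load(root)
    return loaded


original = load_order(
    build_docs(["owms-classes-redef.xsd", "overheid-classes-redef.xsd"]),
    "vac.xsd")
reversed_order = load_order(
    build_docs(["overheid-classes-redef.xsd", "owms-classes-redef.xsd"]),
    "vac.xsd")

print("owms-classes-redef.xsd" in original)        # True
print("owms-classes-redef.xsd" in reversed_order)  # False
```

With the imports reversed, owms.xsd pulls in dcterms-elem.xsd first, so /dc/terms/ is already known by the time vac.xsd tries to import owms-classes-redef.xsd, and the gateway document is never read.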
Received on Friday, 16 October 2009 13:30:19 UTC