Re: Wrapped around the axle: Refining and extending structures across namespaces

Hi Mark,

> The two major difficulties I encounter are 1) keeping the refinement
> and extension appropriately separate and 2) keeping namespace
> prefixing in the instance document simple and manageable.

You raise some very interesting issues. In particular, this
demonstrates the tension between wanting to maintain a neat type
hierarchy and wanting to validate neat instance documents.

My view is that your first priority should be getting the instance
document to look the way that you want it to look. Most applications
that deal with XML do not care about the schema and the neat type
hierarchy that you might use within it. They do care, very much, about
the namespaces of the elements in the document.

The instance document:

<Document xmlns="ns2" xmlns:ns1="ns1"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="ns2
                              RefineExtend-NS2-3.xsd">
  <Header>
    <ns1:CreationDate>1967-08-13</ns1:CreationDate>
    <LastModifiedDate>1967-08-13</LastModifiedDate>
  </Header>
  <Line>
    <ns1:LineNumber>2</ns1:LineNumber>
  </Line>
  <Footer>Text</Footer>
</Document>

is very different, to any namespace-aware processor, from, for example:

<Document xmlns:ns2="ns2" xmlns="ns1"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="ns2
                              RefineExtend-NS2-3.xsd">
  <Header>
    <CreationDate>1967-08-13</CreationDate>
    <ns2:LastModifiedDate>1967-08-13</ns2:LastModifiedDate>
  </Header>
  <Line>
    <LineNumber>2</LineNumber>
  </Line>
  <ns2:Footer>Text</ns2:Footer>
</Document>

In most cases, I'd imagine that you'd like a processor that can deal
with documents described by the ns1 schema to be able to process
instance documents that also contain elements from ns2. That means, I
think, that the second instance document is the one that you'll be
aiming for.
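
To illustrate why the namespaces matter rather than the prefixes,
here is a sketch of that second document with different prefix
choices (the prefix 'a' is arbitrary, and I've left off the
xsi:schemaLocation attribute for brevity). A namespace-aware
processor sees exactly the same element names here as in the second
document above, because each element is still in the same namespace:

<a:Document xmlns:a="ns1" xmlns="ns2">
  <a:Header>
    <a:CreationDate>1967-08-13</a:CreationDate>
    <LastModifiedDate>1967-08-13</LastModifiedDate>
  </a:Header>
  <a:Line>
    <a:LineNumber>2</a:LineNumber>
  </a:Line>
  <Footer>Text</Footer>
</a:Document>

So prefixing is a matter of taste for whoever writes the instance;
which namespace each element belongs to is what you need to settle.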

Once you've decided what you want your instance document to look like,
your choices are clearer.

First option: since you cannot redefine the content of an element in
the ns1 namespace from within a schema whose target namespace is the
ns2 namespace, if you want to change the possible content of the
Document and Header elements, you can create an 'adapter' schema with
ns1 as the target namespace. Within the adapter schema, you can
redefine the basic ns1 schema, extending the various types as
required. So you can do:

<xs:schema targetNamespace="ns1" xmlns="ns1" xmlns:ns2="ns2"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

<xs:import namespace="ns2"
           schemaLocation="RefineExtend-NS2.xsd" />
           
<xs:redefine schemaLocation="RefineExtend-NS1.xsd">

  <xs:complexType name="DocumentType">
    <xs:complexContent>
      <xs:extension base="DocumentType">
        <xs:sequence>
          <xs:element ref="ns2:Footer" />
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="HeaderType">
    <xs:complexContent>
      <xs:extension base="HeaderType">
        <xs:sequence>
          <xs:element ref="ns2:LastModifiedDate" />
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:redefine>
           
</xs:schema>

The schema for ns2 then simply declares the Footer and
LastModifiedDate elements.
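
As a minimal sketch, that ns2 schema (RefineExtend-NS2.xsd, the file
imported above) might look something like the following. The element
types are my assumptions, guessed from the sample values in the
instance documents; use whatever types you actually need:

<xs:schema targetNamespace="ns2"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

<!-- type="xs:string" and type="xs:date" are assumptions based on
     the sample instance values -->
<xs:element name="Footer" type="xs:string" />
<xs:element name="LastModifiedDate" type="xs:date" />

</xs:schema>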

In a way this *does* maintain the 'neat type hierarchy' - the type of
the Document used in documents containing ns2 is an extension of the
type of the Document used in documents that only contain ns1.

Second option: you can implicitly 'extend' the content of an element
if you place within it a placeholder for the extensions. If you don't
need a lot of control, you can do so with wildcards; if you want more
control, you can use substitution groups. For example, in the schema
for ns1, you could use abstract element declarations to act as
placeholders for extended content:

<xs:schema targetNamespace="ns1" xmlns="ns1"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">
          
<xs:element name="Document" type="DocumentType"/>

<xs:complexType name="DocumentType">
  <xs:sequence>
    <xs:element name="Header" type="HeaderType"/>
    <xs:element name="Line" type="LineType" maxOccurs="unbounded"/>
    <xs:element ref="DocumentExtension"
                minOccurs="0" maxOccurs="unbounded" />
  </xs:sequence>
</xs:complexType>

<xs:element name="DocumentExtension" abstract="true" />

<xs:complexType name="HeaderType">
  <xs:sequence>
    <xs:element name="CreationDate" type="xs:date"/>
    <xs:element ref="HeaderExtension"
                minOccurs="0" maxOccurs="unbounded" />
  </xs:sequence>
</xs:complexType>

<xs:element name="HeaderExtension" abstract="true" />

<xs:complexType name="LineType">
  <xs:sequence>
    <xs:element name="LineNumber" type="xs:positiveInteger"/>
  </xs:sequence>
</xs:complexType>

</xs:schema>

In the schema for ns2, you can then declare the elements that get
added to the types in ns1 as belonging to the relevant substitution
groups:

<xs:element name="Footer"
            substitutionGroup="ns1:DocumentExtension"
            ... />
<xs:element name="LastModifiedDate"
            substitutionGroup="ns1:HeaderExtension"
            ... />
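
Put together, a sketch of the whole ns2 schema for this option could
look like the following. The one thing it must do is import ns1, so
that the abstract elements can be named as the heads of the
substitution groups. The type attributes, which the fragment above
leaves elided, and the location of the ns1 schema are assumptions
here:

<xs:schema targetNamespace="ns2" xmlns:ns1="ns1"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

<!-- ns1 is imported so that ns1:DocumentExtension and
     ns1:HeaderExtension can be referenced; file name is assumed -->
<xs:import namespace="ns1"
           schemaLocation="RefineExtend-NS1.xsd" />

<!-- type="xs:string" and type="xs:date" are assumptions -->
<xs:element name="Footer" type="xs:string"
            substitutionGroup="ns1:DocumentExtension" />
<xs:element name="LastModifiedDate" type="xs:date"
            substitutionGroup="ns1:HeaderExtension" />

</xs:schema>

Because the abstract elements in ns1 are declared without a type,
they default to xs:anyType, so elements of any type can substitute
for them.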

This has the advantage that you don't have to make new 'adapter'
schemas every time you want to add new elements to the content models.
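
For comparison, the wildcard variant that I mentioned would replace
the reference to the abstract element with an xs:any. This is only a
sketch of how HeaderType might look; the choice of "##other" and
"lax" is an assumption about how much checking you want:

<xs:complexType name="HeaderType">
  <xs:sequence>
    <xs:element name="CreationDate" type="xs:date"/>
    <!-- allows any number of elements from namespaces other than
         ns1; "lax" validates them only if a declaration is found -->
    <xs:any namespace="##other" processContents="lax"
            minOccurs="0" maxOccurs="unbounded" />
  </xs:sequence>
</xs:complexType>

This is less work in ns2, but the ns1 schema no longer controls which
elements are allowed to turn up in that position.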

Note that neither of these methods needs to use restriction to force
Document elements to hold Header elements with an extended type.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

Received on Wednesday, 19 December 2001 18:48:26 UTC