Schema Design: Composition vs Subclassing

[Curt, I vaguely recall you having some thoughts on this topic a long
time ago.  Please chime in.]

As I sit here at my desk analyzing a schema with a huge hierarchy chain,
I seriously begin to question the value of schema type hierarchies,
especially schemas containing hierarchies with many levels. I ponder
ways to break the chain and simplify the schema.  I envision a schema
design whereby independent, decoupled components are simply assembled
together.  

It dawns on me that this is the old Object-Oriented issue of
design-by-subclassing versus design-by-composition, now rearing its head
in the design of XML Schemas.

Let's compare the two approaches as they apply to XML Schemas:
 . design-by-subclassing (i.e., type hierarchies) 
      versus 
 . design-by-composition (i.e., bundling together element groups).

---------------------------------------------------------------
** Design-by-subclassing **

To compare design approaches consider this type hierarchy:

<xsd:complexType name="C1">
    <xsd:sequence>
        <xsd:element name="E1" type="..."/>
        <xsd:element name="E2" type="..."/>
        <xsd:element name="E3" type="..."/>
    </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="C2">
    <xsd:complexContent>
        <xsd:extension base="C1">
            <xsd:sequence>
                <xsd:element name="E4" type="..."/>
            </xsd:sequence>
        </xsd:extension>
    </xsd:complexContent>
</xsd:complexType>

<xsd:complexType name="C3">
    <xsd:complexContent>
        <xsd:extension base="C2">
            <xsd:sequence>
                <xsd:element name="E5" type="..."/>
                <xsd:element name="E6" type="..."/>
            </xsd:sequence>
        </xsd:extension>
    </xsd:complexContent>
</xsd:complexType>

<xsd:element name="root" type="C3"/>

Here we see that the <root> element is of type C3.  C3 extends type
C2, so to understand type C3 you must understand C2.  But to understand
C2 you must understand type C1.  Already it is getting very difficult to
understand the <root> element (and this is a short hierarchy).  Further,
if any type along the hierarchy changes (i.e., we add a new element
and/or delete an element) then every type derived from it changes too,
and instance documents written against those types may no longer
validate.
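
To make the cost of following the chain concrete, here is a sketch of
what a reader has to assemble mentally to see the content of <root>.
The type name C3Flattened is hypothetical (it does not appear in the
schema above); it simply writes out in one place the elements that C3
inherits from C1 and C2 plus its own additions.  The type="..."
placeholders are carried over unchanged.

```xml
<!-- Hypothetical, for illustration only: the elements C3 ends up with
     once the C1 -> C2 -> C3 extension chain is followed. -->
<xsd:complexType name="C3Flattened">
    <xsd:sequence>
        <!-- contributed by C1 -->
        <xsd:element name="E1" type="..."/>
        <xsd:element name="E2" type="..."/>
        <xsd:element name="E3" type="..."/>
        <!-- appended by C2 -->
        <xsd:element name="E4" type="..."/>
        <!-- appended by C3 -->
        <xsd:element name="E5" type="..."/>
        <xsd:element name="E6" type="..."/>
    </xsd:sequence>
</xsd:complexType>
```

An instance along these lines should be accepted by <root>, assuming
the schema has no target namespace and the elided types allow simple
text content (the "..." content is a placeholder, not real data):

```xml
<root>
    <E1>...</E1>
    <E2>...</E2>
    <E3>...</E3>
    <E4>...</E4>
    <E5>...</E5>
    <E6>...</E6>
</root>
```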

CONCLUSION

Design-by-subclassing yields highly coupled, brittle schemas.

---------------------------------------------------------------
** Design-by-composition **

Let's contrast the above design approach with a composition design.  In
this approach we create independent (off-the-shelf) group components.
The <root> element is declared by simply assembling together the desired
components:

<xsd:group name="G1">
    <xsd:sequence>
        <xsd:element name="E1" type="..."/>
        <xsd:element name="E2" type="..."/>
        <xsd:element name="E3" type="..."/>
    </xsd:sequence>
</xsd:group>

<xsd:group name="G2">
    <xsd:sequence>
        <xsd:element name="E4" type="..."/>
    </xsd:sequence>
</xsd:group>

<xsd:group name="G3">
    <xsd:sequence>
        <xsd:element name="E5" type="..."/>
        <xsd:element name="E6" type="..."/>
    </xsd:sequence>
</xsd:group>

<xsd:element name="root">
    <xsd:complexType>
        <xsd:sequence>
            <xsd:group ref="G1"/>
            <xsd:group ref="G2"/>
            <xsd:group ref="G3"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:element>

Again, as we see, the creation of the <root> element is simply a matter
of assembling together the desired pieces.

With this approach: 
 . it is much easier to understand the schema since you can 
   focus on each component one at a time,
 . each component is independent and decoupled.  Any changes to 
   one component will not impact the other components.  
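
As a sketch of the "off-the-shelf" point, the same groups can be
assembled into a second element without touching <root> and without
defining any new derived type.  The element name "summary" is
hypothetical and not part of the schema above; it reuses only G1 and G3.

```xml
<!-- Hypothetical second element, for illustration only: it picks up
     groups G1 and G3 and leaves G2 out. -->
<xsd:element name="summary">
    <xsd:complexType>
        <xsd:sequence>
            <xsd:group ref="G1"/>
            <xsd:group ref="G3"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:element>
```

In the hierarchy as declared earlier, C3 always carries E4 along,
because C3 is reached through C2.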

CONCLUSION

Design-by-composition yields scalable, robust schemas.

---------------------------------------------------------------
What are your thoughts on this?  /Roger

Received on Tuesday, 2 April 2002 17:24:25 UTC