RE: Schema Design: Composition vs Subclassing

I like Jeni's suggested modeling of Person, with alternative extensions
based on the circumstances. In general, I've found derivation by extension
to be easier to specify, explain, and maintain.

For the OAGIS case, though, it comes down to a question of combinatorics:
how many child elements are there, and how many different combinations of
required and optional elements are needed? 

An OAGIS Noun (a construct that carries much of a transaction's content) can
have many child elements and many possible combinations of required and
optional elements, depending upon the context of the Noun's use. E.g., in
one transaction, only the Noun's Id element is required and all other
content is optional, while in other cases, all of the Noun's child elements
are required, etc. This results in a large number of "replicants"
(PersonForTransaction1, PersonForTransaction2, PersonForTransaction3,
PersonForTransaction4, ...), each differing only in its child elements'
cardinalities.
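
As a rough illustration of what two such replicants might look like in
XML Schema (Name and Skills are borrowed from Paul's example and Id from
the case above; the types and cardinalities are made up for illustration
and are not taken from OAGIS):

<xs:complexType name="PersonForTransaction1">
  <xs:sequence>
    <!-- only Id is required in this context -->
    <xs:element name="Id" type="xs:string" />
    <xs:element name="Name" type="xs:string" minOccurs="0" />
    <xs:element name="Skills" type="SkillsType" minOccurs="0" />
  </xs:sequence>
</xs:complexType>

<xs:complexType name="PersonForTransaction2">
  <xs:sequence>
    <!-- same children, same order, but all required -->
    <xs:element name="Id" type="xs:string" />
    <xs:element name="Name" type="xs:string" />
    <xs:element name="Skills" type="SkillsType" />
  </xs:sequence>
</xs:complexType>

The two definitions are identical apart from minOccurs. With n children
that can each be either required or optional, there are up to 2^n such
variants.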

Also to be considered is change: as more child elements are added to a Noun
over time, they must also be added to each replicant (in the same order),
something that can be daunting when there are large numbers of replicants
(PersonForTransaction1, PersonForTransaction2, PersonForTransaction3,
PersonForTransaction4, ...).

And then there's the issue of extensibility: when the users of the schema
need to extend Person (e.g., in another namespace), do they extend Person?
PersonForTransaction1?, ...? That could be fairly straightforward, but gets
more complicated with the use of groups (which reminds me, are groups
extensible? How does one do so?)

What we recognized is that each use of a particular Noun shares a common
structure (same child elements, same order), and that they only differ in
the cardinalities of their child elements. That's why we factored out the
cardinalities from the definition of structure: we define the structure of a
thing once, and we define the possible combinations of the cardinalities of
its parts separately. 
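
One way to picture that factoring, purely as a sketch: the structure is
declared once with every child optional, and each transaction's
cardinalities live in a separate set of rules. Schematron is used below
only as an example of a rule language; nothing here is meant to describe
the mechanism OAGIS actually uses, and the element names and the pattern
id are hypothetical.

<!-- Structure, defined once: the children and their order, all optional -->
<xs:complexType name="PersonType">
  <xs:sequence>
    <xs:element name="Id" type="xs:string" minOccurs="0" />
    <xs:element name="Name" type="xs:string" minOccurs="0" />
    <xs:element name="Skills" type="SkillsType" minOccurs="0" />
  </xs:sequence>
</xs:complexType>

<!-- Cardinalities for one transaction, defined separately -->
<sch:pattern id="PersonForTransaction1"
             xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:rule context="Person">
    <sch:assert test="count(Id) = 1">
      Person must have exactly one Id in this transaction.
    </sch:assert>
  </sch:rule>
</sch:pattern>

Under this arrangement, adding a child element means changing the one
structural definition, plus only those rule sets that care about the new
child.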

It would be interesting to consider how Schema could evolve to accommodate
that separation of concerns. 

Mark

--------------------------------------------------------------------------------

Mark Feblowitz
XML Architect
Frictionless Commerce Incorporated
400 Technology Square, 9th Floor
Cambridge, MA 02139
[t] 617.715.7231
[f] 617.495.0188
[e] mfeblowitz@frictionless.com
www.frictionless.com
 

 -----Original Message-----
From: 	Jeni Tennison [mailto:jeni@jenitennison.com] 
Sent:	Tuesday, April 16, 2002 12:00 PM
To:	Paul Kiel
Cc:	xmlschema-dev@w3.org
Subject:	Re: Schema Design: Composition vs Subclassing

Hi Paul,

> I have a question about the effect of this best practice. It is
> regarding context-specific use of components. Let's take the element
> <Person> in a human resources context. (BTW - <Person> is too broad
> a concept to actually encode, this is only for discussion). We may
> want to use a <Person> in many contexts, where some of its
> components are required in one and not in another. Let's say we have
> two transactions of Person below (and that all children are stand
> alone components):
>
> Transaction 1:
> <Person>
>      <Name/><!-- required -->
>      <Skills/><!-- required -->
>      <Height/><!-- required -->
>      <Weight/><!-- required -->
> </Person>
> In this transaction, we need all the data about this person to do
> the transaction.
>
> Transaction 2:
> <Person>
>      <Name/><!-- required -->
>      <Skills/><!-- optional -->
> </Person>
> In this transaction, we only need a name of the person and the
> skills are optional. The Height and Weight have no meaning in this
> context and can't occur.

None of the methods that you suggested seem particularly good to me.
In the example above, you have one thing that stays the same (<Person>
always has a <Name> element child), and two things that change
(whether <Skills> is required or optional, and whether the <Person>
includes a <Height> and <Weight>).

If you can take advantage of treating all Person elements in the same
way when it comes to their Name (i.e. that you can get some code reuse
out of it), I'd make a general PersonType that included a <Name>
element:

<xs:complexType name="PersonType" abstract="true">
  <xs:sequence>
    <xs:element name="Name" type="xs:string" />
  </xs:sequence>
</xs:complexType>

I'd then create types that extend this base type. For Transaction 1:

<xs:complexType name="Transaction1PersonType">
  <xs:complexContent>
    <xs:extension base="PersonType">
      <xs:sequence>
        <xs:element name="Skills" type="SkillsType" />
        <xs:element name="Height" type="xs:decimal" />
        <xs:element name="Weight" type="xs:decimal" />
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:complexType name="Transaction2PersonType">
  <xs:complexContent>
    <xs:extension base="PersonType">
      <xs:sequence>
        <xs:element name="Skills" type="SkillsType" minOccurs="0" />
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
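
As a sketch of how this plays out in an instance: with derivation by
extension, the base type's content comes first and the extension's
content follows it. The Person element declaration and the sample values
below are assumptions made for illustration, and the content of Skills
depends on SkillsType, which isn't defined in this thread.

<xs:element name="Person" type="Transaction1PersonType" />

<Person>
  <Name>Jane Doe</Name>     <!-- inherited from PersonType -->
  <Skills/>                 <!-- added by the extension; content depends on SkillsType -->
  <Height>1.70</Height>     <!-- added by the extension -->
  <Weight>65</Weight>       <!-- added by the extension -->
</Person>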

If you can't take advantage of the fact that the Person elements in
Transaction1 and Transaction2 are similar (i.e. for some reason you
can't share code between them) then you could design through
composition instead. This time the shared components should go into
groups:

<xs:group name="NameGroup">
  <xs:sequence>
    <xs:element name="Name" type="xs:string" />
  </xs:sequence>
</xs:group>

<xs:group name="SkillsGroup">
  <xs:sequence>
    <xs:element name="Skills" type="SkillsType" />
  </xs:sequence>
</xs:group>

<xs:group name="HeightAndWeightGroup">
  <xs:sequence>
    <xs:element name="Height" type="xs:decimal" />
    <xs:element name="Weight" type="xs:decimal" />
  </xs:sequence>
</xs:group>

Then you could have two (possibly anonymous) types that bring those
groups together as required:

<xs:complexType name="Transaction1PersonType">
  <xs:sequence>
    <xs:group ref="NameGroup" />
    <xs:group ref="SkillsGroup" />
    <xs:group ref="HeightAndWeightGroup" />
  </xs:sequence>
</xs:complexType>

<xs:complexType name="Transaction2PersonType">
  <xs:sequence>
    <xs:group ref="NameGroup" />
    <xs:group ref="SkillsGroup" minOccurs="0" />
  </xs:sequence>
</xs:complexType>

A third possibility would be to have an abstract version of the
PersonType that includes a group with nothing in it as a placeholder:

<xs:complexType name="PersonType" abstract="true">
  <xs:sequence>
    <xs:element name="Name" type="xs:string" />
    <xs:group ref="PersonGroup" />
  </xs:sequence>
</xs:complexType>

<xs:group name="PersonGroup">
  <xs:sequence />
</xs:group>

Then, in the schema for Transaction 1, you can redefine the
PersonGroup group to add the required elements:

<xs:redefine schemaLocation="baseSchema">
  <xs:group name="PersonGroup">
    <xs:sequence>
      <xs:group ref="PersonGroup" />
      <xs:element name="Skills" type="SkillsType" />
      <xs:element name="Height" type="xs:decimal" />
      <xs:element name="Weight" type="xs:decimal" />
    </xs:sequence>
  </xs:group>
</xs:redefine>

and similarly in the schema for Transaction 2:

<xs:redefine schemaLocation="baseSchema">
  <xs:group name="PersonGroup">
    <xs:sequence>
      <xs:group ref="PersonGroup" />
      <xs:element name="Skills" type="SkillsType" minOccurs="0" />
    </xs:sequence>
  </xs:group>
</xs:redefine>

Personally, I think I'd favour 1, but I'm in the process of being
persuaded towards composition, so I reserve the right to change my
mind.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

Received on Tuesday, 16 April 2002 13:13:10 UTC