Re: Schema Design: Composition vs Subclassing from Jeni Tennison on 2002-04-17 (xmlschema-dev@w3.org from April 2002)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Wed, 17 Apr 2002 12:11:11 +0100
To: Mark Feblowitz <mfeblowitz@frictionless.com>
CC: Paul Kiel <paul@hr-xml.org>, xmlschema-dev@w3.org
Message-ID: <21309044776.20020417121111@jenitennison.com>

Hi Mark,

> What we recognized is that each use of a particular Noun shares a
> common structure (same child elements, same order), and that they
> only differ in the cardinalities of their child elements. That's why
> we factored out the cardinalities from the definition of structure:
> we define the structure of a thing once, and we define the possible
> combinations of the cardinalities of its parts separately.

This is extremely interesting. You're making a distinction between:

  - the order in which elements appear
  - the number of times those elements appear

Traditionally, content models have combined these two factors. When a
validator goes through an instance, it checks the number of occurrences
of each element at the same time as checking that the elements appear
in the correct order.

But you could imagine a content model that expressed the two factors
independently. For example, given a DTD content model of:

  (E1, E2?, E3+, E4)

you could have one part that said that E1, E2, E3 and E4 must appear
in order, followed by a second part that said that E2 was optional and
E3 had to occur one or more times. Making up a syntax:

  (E1, E2, E3, E4) & E2? & E3+

A validator could feasibly go through an element's content twice, once
checking order and once checking cardinality, or it could combine the
two parts to create a traditional content model.
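
As a rough illustration of the two-pass option, here is a minimal
Python sketch. It assumes an element's content has already been reduced
to a flat list of child element names, and that any element not given
an explicit cardinality must occur exactly once. The function names and
the (min, max) tuple encoding are invented for the example; they are
not part of any schema language.

```python
from collections import Counter

# Part 1 of the made-up syntax: the order (E1, E2, E3, E4).
ORDER = ["E1", "E2", "E3", "E4"]
# Part 2: E2? and E3+ as (min, max) bounds; None means unbounded.
# Assumption: names not listed here must occur exactly once.
CARDINALITY = {"E2": (0, 1), "E3": (1, None)}


def check_order(children, order):
    """Pass 1: every name is known and never precedes an earlier one."""
    rank = {name: i for i, name in enumerate(order)}
    last = -1
    for name in children:
        if name not in rank or rank[name] < last:
            return False
        last = rank[name]
    return True


def check_cardinality(children, order, cardinality):
    """Pass 2: each name occurs within its bounds, ignoring position."""
    counts = Counter(children)
    for name in order:
        lo, hi = cardinality.get(name, (1, 1))
        n = counts[name]
        if n < lo or (hi is not None and n > hi):
            return False
    return True


def is_valid(children, order=ORDER, cardinality=CARDINALITY):
    """Content is valid only if both independent passes succeed."""
    return (check_order(children, order)
            and check_cardinality(children, order, cardinality))
```

Note that the order pass deliberately says nothing about repetition:
['E1', 'E1', 'E3', 'E4'] passes it and is rejected only by the
cardinality pass.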

Separating out these two parts would enable you to vary one while
keeping the other the same. So for example you could say that all
Nouns shared the same order of elements and swap in different
cardinality constraints as and when required.
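
The other option, combining the two parts into a traditional content
model, might look something like the following sketch, which also
shows one shared order being paired with different cardinality sets.
It compiles the pair into an ordinary regular expression over
space-terminated element names. The child names (Header, Line, Note)
and the two uses (CREATE, CANCEL) are hypothetical, chosen only to
make the example concrete, and unlisted names are again assumed to
occur exactly once.

```python
import re

# Hypothetical shared structure for one Noun: the order is fixed once.
NOUN_ORDER = ["Header", "Line", "Note"]


def compile_model(order, cardinality):
    """Merge an order with (min, max) bounds into a single regex.

    Each child name is matched as the token 'Name ' (name plus a
    space), so the regex always advances one whole name at a time.
    A max of None means unbounded.
    """
    parts = []
    for name in order:
        lo, hi = cardinality.get(name, (1, 1))
        upper = "" if hi is None else str(hi)
        parts.append("(?:%s ){%d,%s}" % (re.escape(name), lo, upper))
    return re.compile("".join(parts))


def matches(model, children):
    """Check a list of child element names against a compiled model."""
    text = "".join(name + " " for name in children)
    return model.fullmatch(text) is not None


# Same order, different cardinalities swapped in per use.
CREATE = compile_model(NOUN_ORDER, {"Line": (1, None), "Note": (0, 1)})
CANCEL = compile_model(NOUN_ORDER, {"Line": (0, 0), "Note": (0, 1)})
```

Here ['Header', 'Line', 'Line'] satisfies CREATE but not CANCEL, while
['Header', 'Note'] satisfies CANCEL but not CREATE, even though both
models were built from the same NOUN_ORDER.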

As far as I know, the only schema technology that enables you to make
that kind of division explicitly is Schematron -- you can have one set
of rules that test ordering, and another set of rules that test
cardinality. When RELAX NG had 'concur' you could have used that
(essentially imposing two overlapping sets of constraints on the
content model); you could still use it with TREX I suppose.

But this is essentially what you're doing -- using XML Schema to
provide the ordering constraints (which means that you have to be very
flexible about the cardinality, essentially not testing it or testing
it within known limits) and another technology to provide concurrent
validation of the cardinality constraints.

This is interesting in terms of schema language development because it
implies that something like 'concur' would be a useful addition to the
language. You could imagine it solving some of the problems that
people have with "these elements, in any order, with these
cardinalities" as well.
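
To make the "any order, with these cardinalities" case concrete: once
cardinality is stated separately, dropping the ordering part leaves
nothing but a count check. A minimal Python sketch, assuming content
is given as a flat list of child names and bounds as (min, max) pairs
with None for unbounded (both conventions invented for the example):

```python
from collections import Counter


def check_bag(children, cardinality):
    """Accept children in any order if every count is within bounds."""
    counts = Counter(children)
    # Reject names that the model does not mention at all.
    if any(name not in cardinality for name in counts):
        return False
    return all(
        lo <= counts[name] and (hi is None or counts[name] <= hi)
        for name, (lo, hi) in cardinality.items()
    )


# E1 exactly once, E2 optional, E3 one or more, in any order.
BOUNDS = {"E1": (1, 1), "E2": (0, 1), "E3": (1, None)}
```

With these bounds, check_bag(['E3', 'E1', 'E3'], BOUNDS) succeeds
regardless of the order in which the names are listed.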

It's also interesting in terms of the relationship with OO
technologies. In OO technologies, ordering isn't an issue, only
cardinality, so the normal "best practice" for OO probably isn't going
to help here. Design patterns for OO technologies simply don't have to
deal with this kind of issue.

Anyway, after all that, to answer the non-rhetorical question in your
post:

> (which reminds me, are groups extensible? How does one do so?)

They're only extensible (while retaining the same name) when you
redefine a schema. When you use xs:redefine, you can change a model
group as long as either:

  - you reference the model group within the new model group
    (essentially allowing you to extend it) or

  - the redefined model group is a valid restriction of the original
    model group (essentially allowing you to restrict it)

Of course you can reference model groups with different names wherever
you like within a model group -- that's equivalent to extending
complex types within a single schema.
    
Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/
Received on Wednesday, 17 April 2002 07:11:13 UTC