Re: XML Schema Question

Hi Noah,

> I agree, for the most part. On the other hand, I think you have
> perhaps underemphasized fidelity of the markup to the underlying
> semantics of the information to be processed. Thus, independent of
> schema language, one can argue that:
>
> <list>
>   <item>val1</item>
>   <item>val2</item>
>   <item>val3</item>
> </list>
>
> is more robust than:
>
> <list>val1 val2 val3</list>
>
> Tools like XQuery/XSL will have an easier time picking up the
> explicit markup of the more verbose form.

Naturally, I agree that XML-based markup is generally to be favoured
over a text-based format (i.e. if there's no other reason -- human
readability, size etc. -- to pick one or the other, pick the XML-based
markup).
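
To make the comparison concrete, here's a minimal sketch in Python
using the standard library's ElementTree. The two sample documents are
the ones quoted above; the variable names, and the assumption that the
text form is whitespace-separated, are only for illustration.

```python
import xml.etree.ElementTree as ET

# The two forms from the quoted example.
explicit = ET.fromstring(
    "<list><item>val1</item><item>val2</item><item>val3</item></list>"
)
packed = ET.fromstring("<list>val1 val2 val3</list>")

# Explicit markup: the structure is visible to any XML tool.
explicit_values = [item.text for item in explicit.findall("item")]

# Text form: the consumer has to know the microsyntax (assumed here to
# be whitespace-separated tokens) and parse it itself.
packed_values = (packed.text or "").split()
```

Both give the same three values, but only the first does so without
the consumer knowing anything beyond the element names.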

> Though I can see it either way, one can make the case in that spirit
> for the following "improvement" to the example in question:
[snip]

I think this design choice (wrapper elements or not) is pretty
different from the XML-or-text design choice. Why do you regard the
wrapper-element version as better? There's no real difference in terms
of XQuery/XSLT accessibility...

I guess I don't really understand your "fidelity of the underlying
semantics" argument. Is it that you think it's important that the XML
represents the fact that fathers and mothers are parents, sons and
daughters are children and pets are... pets?

My favoured "improved design" would simply dictate the order in which
the elements must occur. I'm reminded, once again, of Tommie Usdin's
Extreme 2002 paper on when order matters [1].

> Though I'm sure you would get differing opinions from various
> members of the XML Schema WG, I think it's fair to say that among
> the reasons we did not invest more in the "unordered sequences of
> elements with count controls on each element by name"-model was a
> feeling that in the majority of cases such markup is in fact
> undesirable. We could have extended <xsd:all> to support this, but
> chose not to in release 1.0. We do occasionally get requests to add
> such function, but some of us remain unconvinced that it is a high
> priority. You'll note that XML Schema can easily model my "improved"
> form, either using <xsd:sequence> (order of the sections matters) or
> <xsd:all> (if it doesn't).

Sure. No schema language can cover all the angles. It's completely
reasonable for the designers of a schema language to choose to focus
on a particular (sub)set of requirements and to say "this schema
language is designed to support this subset of markup languages".
Deciding what to support and what not to support is an essential part
of design.

It seems like you're basing your requirements on the answers to
"should new markup languages do this?" or "should well-designed markup
languages do this?". Of course those are fascinating questions --
constructing theories of markup languages is much more interesting
than actually surveying practice -- but it does mean that XML Schema
is somewhat less useful, in practice, than it should be in theory.

Personally, I'd be asking "do existing markup languages do this?". You
don't have to look very far to find examples of existing markup
languages that have unordered elements with occurrence constraints:
XHTML's <head> element has a content model that (expressed in RELAX NG
Compact Syntax) looks like:

  title & base? & meta* & link*
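
As a rough illustration of what that pattern accepts, here's a small
Python sketch. It assumes each operand of the interleave is a single
element with an occurrence indicator (as in the pattern above), so the
check reduces to counting children by name and ignoring order. The
names HEAD_MODEL and matches_unordered are made up for this example;
this is not how a RELAX NG validator is implemented.

```python
from collections import Counter

# (min, max) occurrences per child element name; None means unbounded.
# Transcribed from the pattern: title & base? & meta* & link*
HEAD_MODEL = {
    "title": (1, 1),
    "base": (0, 1),
    "meta": (0, None),
    "link": (0, None),
}

def matches_unordered(child_names, model=HEAD_MODEL):
    """Return True if child_names meets the occurrence bounds, in any order."""
    counts = Counter(child_names)
    # Reject element names the model does not mention.
    if any(name not in model for name in counts):
        return False
    for name, (lo, hi) in model.items():
        n = counts[name]
        if n < lo or (hi is not None and n > hi):
            return False
    return True
```

For example, matches_unordered(["meta", "title", "link", "meta"]) is
true whatever the order of the children, while a second title or a
second base makes it false.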

I'd also be looking at "do users want to do this?". Given the number
of times we see questions on unordered content with occurrence
constraints, I'd judge the answer is yes.

(The same argument applies to co-occurrence constraints: it might not
be the best design choice to have the legal content or attributes of
an element, or values of another attribute, depend on the value of an
attribute, but it's a fairly prevalent design in existing markup
languages, including XML Schema!)
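
One instance in XML Schema itself, as I read the Structures spec: on
an attribute declaration, if both default and use are present, use
must have the value "optional". A minimal Python sketch of that one
check follows. The function name is hypothetical, and the sample
declarations leave out the XML Schema namespace to keep things short.

```python
import xml.etree.ElementTree as ET

def default_use_ok(attr_decl):
    """Check one co-occurrence rule on an attribute declaration:
    if both 'default' and 'use' are present, 'use' must be 'optional'."""
    default = attr_decl.get("default")
    use = attr_decl.get("use")
    if default is not None and use is not None:
        return use == "optional"
    # The rule only applies when both attributes are present.
    return True

# Hypothetical declarations, namespace omitted for brevity.
good = ET.fromstring('<attribute name="lang" default="en" use="optional"/>')
bad = ET.fromstring('<attribute name="lang" default="en" use="required"/>')
```

Here the legal values of one attribute depend on the presence of
another, which is exactly the kind of constraint under discussion.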

Cheers,

Jeni

[1] http://www.mulberrytech.com/Extreme/Proceedings/html/2002/Usdin01/EML2002Usdin01.html

---
Jeni Tennison
http://www.jenitennison.com/

Received on Thursday, 26 August 2004 13:36:00 UTC