RE: XML Schemas patterns (was: Re: Defining recursive elements?)

Michael Kay writes:

> But it was written before anyone had any awareness of the impact on
> schema-aware queries and stylesheets. This changes the rules, for
> example it becomes much more important to define global elements and
> types so that you can use their names in function signatures.

Yes, and at the risk of being controversial, I'll go further: I think 
local element declarations have been oversold in XML Schema. In part, 
this is because we made what is, in my opinion, a mistake: we made it 
syntactically convenient to define local elements, and somewhat 
clumsier to use global ones. Because the form:

  <element name="outer">
    <complexType>
      <sequence>
        <element name="inner1" type="t1"/>
        <element name="inner2" type="t2"/>
      </sequence>
    </complexType>
  </element>
 
is convenient, obvious, and isomorphic to the instances it validates, 
people use it. Local elements also appear in early examples in the 
Primer, so people think they're the obvious way to do things. I believe 
that all this is to some extent a mistake. Local element declarations 
are essential in the particular case where you need conflicting 
declarations for the same-named element according to context. When that's 
what you need, use locals. Otherwise, I think globals are more robust, 
for the following reasons, among others:

* It's a single uniform model: even when you want to use locals, you need 
a global to wrap them in.  That makes your schema asymmetric, with some 
globals and some locals.  Using all globals means all elements are defined 
using the same constructs.
* You can begin validation from any global element, which means that you 
can validate incrementally when editing pieces of documents.
* XML vocabularies are often designed to be reusable across documents. You 
can share references to globals, not locals.
* The whole business of the form attribute and elementFormDefault doesn't 
come up: global elements are always in the schema's target namespace.
* As Michael says, you have a first class construct for use in function 
signatures, etc.
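
As a rough illustration of the all-globals style, the content model from 
the example earlier might be written as follows. This is only a sketch: 
the target namespace URI is a made-up placeholder, t1 and t2 are given 
arbitrary stand-in definitions, and the xs: prefix is used so that 
unprefixed names such as t1 and inner1 resolve to the schema's own 
namespace.

  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns="http://example.org/demo"
             targetNamespace="http://example.org/demo">

    <!-- Stand-in definitions for the placeholder types t1 and t2 -->
    <xs:simpleType name="t1"><xs:restriction base="xs:string"/></xs:simpleType>
    <xs:simpleType name="t2"><xs:restriction base="xs:integer"/></xs:simpleType>

    <!-- Every element is declared once, at the top level -->
    <xs:element name="inner1" type="t1"/>
    <xs:element name="inner2" type="t2"/>

    <xs:element name="outer">
      <xs:complexType>
        <xs:sequence>
          <!-- The content model refers to globals rather than declaring locals -->
          <xs:element ref="inner1"/>
          <xs:element ref="inner2"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

  </xs:schema>

With this layout, inner1 and inner2 are in the target namespace 
regardless of any elementFormDefault setting. A validator can be asked 
to start at either of them, and another schema that imports this 
namespace can refer to them by name.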

Global element declarations are the ones that correspond most closely to 
what you get with DTDs, and thus are reasonably well understood in their 
implications. Historically, locals were added to the schema language 
primarily to allow for mappings of existing programming language 
structures that might themselves provide local scoping (for example, the 
same-named property x as an int or a float according to the struct in 
which it appears). For those cases, fine, use them.
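
A minimal sketch of that struct-mapping case, where locals are the right 
tool, might look like this. The names structA and structB are 
hypothetical, and the schema has no target namespace to keep the example 
short.

  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="structA">
      <xs:complexType>
        <xs:sequence>
          <!-- Inside structA, x is an int -->
          <xs:element name="x" type="xs:int"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

    <xs:element name="structB">
      <xs:complexType>
        <xs:sequence>
          <!-- Inside structB, the same name x is a float -->
          <xs:element name="x" type="xs:float"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

  </xs:schema>

A single global declaration of x could not carry both types, which is 
why this particular case needs local scoping.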

I think the schema language looks a lot simpler, conceptually if not 
syntactically, if you start by forgetting about locals. Don't learn 
them, don't use them. You'll find a language that's easier to teach and 
easier to learn. You can always learn about locals in the rare cases you 
need them. (Or, you can learn about locals so you can understand the 98% 
of schemas, or whatever the figure is, that use them unnecessarily.) In 
retrospect, I would have preferred that the syntax above create globals 
for inner1 and inner2, with some override needed to cause local scoping.

Noah

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Thursday, 17 May 2007 21:04:02 UTC