Re: Classroom critique of XML Schema 1.1

On Mar 19, 2011, at 9:44 AM, Costello, Roger L. wrote:
> 
>> This context-free nature of type validity was an important design
>> goal for some members of the WG, though not for all.
> 
> Hold on! XML Schema 1.1 elements are not context-free. 
> 
> Example: in my class I gave this example: What is the data type of this <Publisher> element:
> 
>    <element name="Publisher" type="string" />
> 
> Without considering context, you would answer "string."

And if one is using "What is the data type" as a paraphrase of the
question "what is the governing type definition?", one would be
right.

> However, if you look up the tree to the <BarnesAndNoble> ancestor element you see that it contains an <assert> element which constrains the string data type:
> 
>    <assert test="string-length(.//Publisher) le 140" />
> 
> "The length of the value of the <Publisher> element must be less than 140 characters."
> 
> Thus the element declaration for Publisher is not context-free. Its type is very much dependent on its context.

Well, here we hit a distinction that can make a difference, in some cases.

The type assigned to an element of a given name can certainly depend
on context; there is a well understood technique for handling limited
context dependency in XSD that has been documented for about ten years
now, and which has a close analogue in Relax NG.
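
The technique is not named above. One plausible reading, and it is only an
assumption, is the use of local element declarations, where the same element
name is bound to different types depending on the parent in which it is
declared. A minimal sketch under that assumption follows; the type name
ShortPublisher and the OtherBookseller element are hypothetical and do not
come from the class example:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical restricted type: a string of at most 140 characters -->
  <xs:simpleType name="ShortPublisher">
    <xs:restriction base="xs:string">
      <xs:maxLength value="140"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Inside BarnesAndNoble, Publisher is declared locally with the restricted type -->
  <xs:element name="BarnesAndNoble">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Publisher" type="ShortPublisher"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Inside a different (hypothetical) parent, the same name is a plain xs:string -->
  <xs:element name="OtherBookseller">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Publisher" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With declarations like these, the length limit is part of the governing type
definition of a Publisher inside BarnesAndNoble, so an over-long value would
make that Publisher element itself invalid.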

And as your example illustrates, the nature of assertions can mean that
types other than the type of the element being discussed may affect the
set of values actually seen in valid instances.

But there is a difference between saying that a Publisher element with
a 200-character value will never appear in a valid instance of the
BarnesAndNoble element, and saying that such a Publisher element
would be invalid against its governing element declaration or its governing
type definition.  If that difference doesn't make any difference in a 
particular operating environment, it will seem like a purely academic
distinction.  But XSD differs from some other schema languages in
being designed to produce more than one bit of information, more than
a yes/no answer for the document as a whole.  And so for XSD the
difference between a BarnesAndNoble element marked invalid because
one of its assertions is violated, and a Publisher element marked
invalid for some reason (let's say, because it has child elements or
undeclared attributes) is an important one.
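
To make the distinction concrete, here is a sketch of a schema in which the
length rule lives in an assertion on the ancestor rather than in the type of
Publisher. The Book wrapper element and the exact form of the test expression
are assumptions made for illustration; they are not taken from the class
example:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="BarnesAndNoble">
    <xs:complexType>
      <xs:sequence>
        <!-- Book is a hypothetical wrapper; Publisher is a plain xs:string -->
        <xs:element name="Book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Publisher" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- The rule belongs to the type of BarnesAndNoble, not to Publisher.
           The "every ... satisfies" form copes with more than one Publisher. -->
      <xs:assert test="every $p in .//Publisher satisfies string-length($p) le 140"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Given an instance in which one Publisher holds 200 characters, an XSD 1.1
processor would be expected to find that Publisher valid against xs:string
and to mark the BarnesAndNoble element invalid, since it is the assertion on
the BarnesAndNoble type that fails.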

In XSD, the term 'governing type definition' has a particular meaning; it is
not just a synonym for "the set of all constraints that may bear upon an
element."

> ...
> 2.	The XML document was redesigned to get around the limitations of XML Schema 1.1.

Or, phrased in a different way:  the XML vocabulary was designed to work 
well with a particular tool, rather than fighting it.  

That shouldn't be surprising:  Anyone using a tool is apt to find that the tool 
contributes to shaping their thinking.  Sometimes we think the influence is
beneficent, and sometimes not.  When we like the tool, we tend not to mind
its influence on our thinking.

> Rather than sticking to the role of just validating XML designs, XML Schema 1.1 moves into the role of dictating XML designs.

Any schema language makes some constructs easier to use and (by 
comparison if in no other way) other constructs less convenient to use.
XSD 1.1 is no exception.  I think the word "dictating" is a little strong
here:  XSD 1.1 supports a very wide range of design styles.  But 
certainly it supports some designs better than others.  And in some 
cases the question "should we be making this design pattern easy to
use or not?" was an explicit topic of discussion in the development of the
spec.  If you find a schema language in which questions of that form
were not an explicit topic of discussion, you'll find a schema language
whose designers were not thinking carefully about what they were doing,
or were ill informed about the variety of possibilities in XML applications.

> 
> Let's recap some key points:
> 
> A.	There is universal agreement that <assert> constraints should be positioned right with the elements they apply to.

Your earlier formulation was that assertions should go "where they are
needed."  Those really aren't the same thing.  And neither formulation
provides unambiguous assistance answering the question:  does an
additional constraint (not part of the declared type) on a descendant
belong with the descendant or with the ancestor?  Your formulation
here asks "does it apply to the descendant or to the ancestor?"  It seems
obvious to me that the BarnesAndNoble rule you formulate can be said
to apply both to the BarnesAndNoble element and to the Publisher
element.  So how does the question help me decide where to put the
assertion?

> ...
> In XPath, I can establish a context node anywhere in the tree and navigate along any axis, including the ancestor axis. Furthermore, any "data type" information that XPath, XSLT 2.0, or XQuery gets is from XML Schema, so it is XML Schema that is driving this data type train, not XPath, XSLT 2.0, or XQuery.

If by the last sentence you mean "so the XML Schema WG has no need to
pay attention to what the XSLT and XML Query WGs think; you determine
what a type is, and they should just live with it", I can only say that it's
more complicated than that, and rightly so.  The QT specs are not only 
prominent users of XSD and thus 'customers' the XSD spec should try
to keep happy, but the technologies need to work harmoniously 
together for the sake of users.  It's for that reason that the WGs are
chartered to work together and that the WG responsible for the XSD 
spec rightly took the situation of the QT specs into consideration in 
making our decisions about assertions.   Collaboration among WGs is
not always peaceful or easy, and this particular collaboration has seen its
share, or maybe more than its share, of acrid conflicts over the years.
It would not do users any favors if the schema, query, and XSLT WGs
were to get into a pissing match over who is driving the train and who is
just along for the ride.

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com 
* http://cmsmcq.com/mib                 
* http://balisage.net
****************************************************************

Received on Sunday, 20 March 2011 14:24:49 UTC