Re: Classroom critique of XML Schema 1.1

On Mar 18, 2011, at 1:10 PM, Costello, Roger L. wrote:

> Hi Folks,
> 
> This week I taught a 3-day intensive on XML Schema 1.1 to a group of top-notch software, security, and information engineers.

Thanks,  Roger!

> The consensus of the class was that XML Schema 1.1 has problems that should be fixed before it goes to Recommendation status.

It may be difficult to make major changes at this stage in the development of
the spec.  The WG has been laboring on this for quite a long time by W3C
measures, and there is a certain impatience both within the working group
and within the W3C membership to get 1.1 finished and out.

But comments are always welcome.

> 
> 1. XML Schema 1.1 has a restriction that the XPath expression in an <assert> element can only "look down." It cannot reference items higher up in the XML tree, nor can it reference items in other XML documents. However, an attribute that is declared higher up in the XML tree can be referenced by the <assert> element if it is declared inheritable (inheritable="true"). During discussions it became apparent that inheritable attributes were created as an end-run around the restriction that the XPath expression in <assert> elements can't "look up." 
> 
> Recommendation: Lift the restriction that the XPath expression in <assert> elements can only "look down." Permit the XPath expression to look anywhere, including to other XML documents. 

It's clear that such a change would be convenient for some cases; I
argued for that way of doing things, myself.  There are two reasons
the WG did not agree with me, and I expect those reasons will still
be persuasive enough to make it hard to generate consensus for this
change:

1 In XSD 1.0 and in XSD 1.1 apart from assertions, type validity has
the property that it depends upon an element and a type, and does
not depend at all upon the context of the element.  So it's always possible
to validate an element against a type by extracting the element from 
the document and handing it to a validator.

This context-free nature of type validity was an important design
goal for some members of the WG, though not for all.

It is also the case that pretty much every example the WG has
encountered where it felt natural to express a constraint by looking
up could also be expressed in a downward-looking assertion on an
ancestor element.  For example, the constraint that an XHTML
'input' element can only be used within a 'form' element can easily
be expressed as an upward-looking constraint on the 'input' element,
but it can also be expressed as a downward-looking constraint on the
'html' or the 'body' element.
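
A minimal sketch of the downward-looking form, for illustration only.
The declaration of 'body' below is a hypothetical, heavily simplified
stand-in (a real XHTML schema would look quite different), and the
prefix 'h' is just a local binding chosen for the example.  The point
is only that the assertion sits on 'body' and inspects its descendants:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:h="http://www.w3.org/1999/xhtml"
           targetNamespace="http://www.w3.org/1999/xhtml"
           elementFormDefault="qualified">

  <!-- Hypothetical, simplified declaration: any content is allowed. -->
  <xs:element name="body">
    <xs:complexType>
      <xs:sequence>
        <xs:any processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Evaluated with 'body' as the root of the tree seen by XPath.
           Fails if some descendant 'input' has no 'form' ancestor
           inside this subtree. -->
      <xs:assert test="not(.//h:input[not(ancestor::h:form)])"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

The ancestor axis is usable here because it is applied to a descendant
and never needs to reach above 'body' itself.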

So although the restriction may feel a bit inconvenient, it does not
in fact seem to have as big an impact on what's expressible as
one might expect (or to be concrete:  as big an impact as I expected).

2 The really decisive point is that the type system and formal semantics 
of XPath 2.0, XSLT 2.0, and XQuery 1.0 depend (I am told) on the
property that for any element E of type T, if you put element E
into some context, it will still have type T, regardless of what context
you put it into.   But if type validity were context-dependent, putting
E into an unsuitable context might easily render it invalid -- which
means that it would no longer be of type T.

I don't know of any programming language in which membership
in a type is context-dependent and needs to be re-checked every
time things are combined into new structures, so I can't say I think
that this particular aspect of the QT (XPath, XSLT, and XQuery)
specs is an arbitrary restriction.

Since the XSLT and XML Query working groups were meeting just
down the hall when we debated this, the threat of being invaded 
by a howling mob carrying pitchforks, torches, hot tar, and lots
of chicken feathers seemed like quite a real one. 

> 
> 
> 2. The attributes identified by defaultAttributes only applies to that schema file, not to imported or included schema files. That rule seems reasonable for imported schema files but not for included schema files. A schema file and its included schema files are really all part of the same schema.
> 
> Recommendation: Add an attribute on <include> that indicates that the included schema file has the attributes identified by defaultAttributes in the including schema file:
> 
> <include schemaLocation="..."
>         defaultAttributes="..." />

The meaning of an include element is given in the spec in terms
of the schemas corresponding to the including schema document
and to the included schema document.  If document D1 has an
include element pointing to schema document D2, the meaning
is, in essence, that schema(D1) includes schema(D2) as a subset.

That formulation, in turn, makes sense only if an expression like
schema(D2) is a determinate quantity that can be calculated
independently of other things.
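
To make that concrete, here is a small sketch with two hypothetical
schema documents, D1.xsd and D2.xsd.  The type and attribute-group
names are invented for the example.  D1 names a default attribute
group and includes D2:

```xml
<!-- D1.xsd: the including schema document (hypothetical) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           defaultAttributes="common">

  <xs:include schemaLocation="D2.xsd"/>

  <xs:attributeGroup name="common">
    <xs:attribute name="id" type="xs:ID"/>
  </xs:attributeGroup>

  <!-- Defined in D1, so the default attribute group applies:
       T1 allows an 'id' attribute. -->
  <xs:complexType name="T1">
    <xs:sequence/>
  </xs:complexType>

</xs:schema>
```

```xml
<!-- D2.xsd: the included schema document (hypothetical) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- schema(D2) is computed from D2 alone, which names no default
       attribute group, so T2 does not allow 'id'. -->
  <xs:complexType name="T2">
    <xs:sequence/>
  </xs:complexType>

</xs:schema>
```

Under the current formulation, schema(D1) contains T1 with 'id' and
T2 without it, because T2 arrives already built as part of schema(D2).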

Other ways of describing schema construction would be logically
possible, of course -- it's not the case that changing this 
formulation would lead immediately to a logical contradiction.
But it might easily lead to compatibility issues with XSD 1.0,
which would make it a very hard sell among implementors
and some users.

The topic of schema composition has been difficult for the WG.
After spending a lot of effort in an attempt to solve a number of
difficult bugs in the area and to provide firmer foundations for
the rules of schema construction, the WG reluctantly decided
some time ago that if we kept trying to do that, we would never
succeed in finishing XSD 1.1.  There just seemed to be no consensus
and no prospect of consensus.

My personal estimate is that if we tried to make the completion of
XSD 1.1 dependent on a revision of section 4 of the Structures
spec (which deals with schema construction and composition), and
the WG went into the effort with fairly strong consensus on how we
wanted to go about the revision, we could count on six to twelve 
months of hard technical work.  Since we don't have that consensus,
the actual time would be longer than that -- double?  treble?
We don't have six to twelve months, let alone two to three years; 
we have to finish very soon.

On the other hand, it might be feasible to make the xsd:override
construct handle this use case.  That may be worth proposing
to the WG by opening a Bugzilla entry for the suggestion.  (No
promises, mind you.)

> 
> 3. If an element has multiple inheritable attributes in its ancestors and they have the same name, only the closest one will be visible to the element. For example, suppose the <Comment> element has two xml:lang attributes (both inheritable) that are ancestors:
> 
> <Document xml:lang="FR">
>    <Chapter>
>        <Section xml:lang="EN">
>            <Comment>
> 
> An XPath expression in an <assert> element on <Comment> can only use the closest xml:lang attribute (xml:lang="EN"). It cannot use the xml:lang attribute that is higher up (xml:lang="FR").
> 
> 
> Recommendation: Allow XPath expressions in <assert> and <alternative> elements to use any inheritable attribute in its ancestors.
> 

The XML spec says that the value of the xml:lang attribute is 
inherited.  So the implied value of xml:lang on the Comment element 
is EN.  It's not FR -- that's not what inheritance means.  Having
an expression like @xml:lang='FR' evaluate to true would not
be convenient; it would just be wrong.
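
Here is the fragment from the question again, with the value in effect
at each level noted in comments.  The closing tags and the empty
Comment element are additions for the sake of well-formedness; they
are not in the original example.

```xml
<Document xml:lang="FR">       <!-- language in effect: FR -->
  <Chapter>                    <!-- inherited from Document: FR -->
    <Section xml:lang="EN">    <!-- overrides the inherited value: EN -->
      <Comment/>               <!-- inherited from Section: EN, not FR -->
    </Section>
  </Chapter>
</Document>
```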

Sorry, here I just think your class got the wrong end of the stick.

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com 
* http://cmsmcq.com/mib                 
* http://balisage.net
****************************************************************

Received on Saturday, 19 March 2011 01:14:25 UTC