RE: XSchema integration, responsiveness, and a good solution to the problem from Jonathan Robie on 2002-10-18 (public-qt-comments@w3.org from October 2002)

From: Jonathan Robie <jonathan.robie@datadirect-technologies.com>
Date: Fri, 18 Oct 2002 15:34:03 -0400
To: "Kay, Michael" <Michael.Kay@softwareag.com>, Tim Bray <tbray@textuality.com>, public-qt-comments@w3.org
Message-Id: <5.1.0.14.0.20021017100157.05656cb0@ncmail.datadirect-technologies.com>

Tim Bray wrote:

>TB 1. Maximalism
>
>The family of XML Query specification makes no visible effort to hit an
>80/20 point.  It is trying very hard to stake out COMPLETE solution in the
>XML query space, which is rather courageous given the profound lack of
>industry experience.

XQuery is no longer a small language, but it is not a large language by 
today's standards, and we have certainly rejected a number of proposed 
features, and removed features that we once had. The BNF for XQuery has 89 
non-terminals, and most of these productions are syntax of XPath or XML. 
XQuery as a whole, if you omit the support for complex types, is probably 
not much bigger than XSLT + XPath.

The scope of the current XQuery was determined by the use cases, which were 
first published in February 2001 [1]. Most of this functionality is implied 
by the requirements in our original requirements document, which was first 
published in January 2000 [2]. There was strong agreement in the Working 
Group to accept these use cases, and we have been proceeding on this basis 
ever since then. The only major functionality added to the use cases since 
then is support for W3C XML Schema - we now have a use case that 
illustrates this functionality [3], but our original requirements already 
said we needed to support the simple and complex types of XML Schema.

I don't think we're trying to be the complete solution in the query space - 
we are leaving obvious functionality out of Version 1.0, probably including 
full-text search and updates. We have removed functionality that we 
previously had, including the dereference operator and support for tree 
projection. We have also resisted adding explicit syntax for joins and 
grouping.

And I don't think we have experienced as much feature creep as I have seen 
in some other Working Groups. The functionality we have now is actually 
pretty similar to the functionality of the original Quilt proposal, except 
for support for typing.

In your latest message on the XML Query Comments list, you indicate that 
your preferred solution would be to ship Basic XQuery, which basically 
leaves out support for W3C XML Schema complex types. I am assuming that 
most of the complexity you are talking about here involves support for 
complex types and the static semantics associated with this. Is that true?

>The immense amount of work that has gone into this specification would have
>a much higher chance of a positive impact on the world if the features and
>functions provided in XQuery were reduced by a huge factor, cutting back at
>least to XPath 1.0's level of semantic richness.

XQuery and XPath 1.0 have very different use cases. You can see this by 
looking at our use case document and trying to solve the use cases with 
XPath 1.0. Currently, people often wind up combining XPath + SQL + XSLT + 
Java + DOM to handle these kinds of queries. XQuery simplifies their lives 
by letting them do this with one language, using one type system.
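
As a rough illustration of that mix (not taken from the use cases 
document; the sample data and the function name highest_bids are invented 
here), the sketch below uses Python's ElementTree. Path expressions select 
the nodes, but the join, the grouping, and the construction of the result 
all have to be written in the host language.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for illustration only.
ITEMS = ET.fromstring(
    "<items>"
    "<item id='1'><name>Bicycle</name></item>"
    "<item id='2'><name>Helmet</name></item>"
    "</items>"
)
BIDS = ET.fromstring(
    "<bids>"
    "<bid item='1'><amount>40</amount></bid>"
    "<bid item='1'><amount>55</amount></bid>"
    "<bid item='2'><amount>12</amount></bid>"
    "</bids>"
)


def highest_bids(items, bids):
    """Join items to bids by id and build a new result document."""
    # Path expressions pick out the bid nodes, but grouping them by
    # item and keeping the maximum is host-language code.
    best = {}
    for bid in bids.findall("bid"):
        key = bid.get("item")
        amount = float(bid.findtext("amount"))
        best[key] = max(amount, best.get(key, amount))

    # Constructing the result tree is host-language code as well.
    result = ET.Element("result")
    for item in items.findall("item"):
        row = ET.SubElement(result, "row")
        row.set("name", item.findtext("name"))
        row.set("highest", str(best.get(item.get("id"), 0.0)))
    return result


print(ET.tostring(highest_bids(ITEMS, BIDS), encoding="unicode"))
```

A query language with joins and element construction built in would let 
the selection, the join, and the result shape be stated in one expression 
instead of being split across a path language and a host language.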

We have been very careful to publish our requirements and use cases very 
early in the life of the Working Group. Changing our requirements and use 
cases, which have both been published for quite a long time, and have had 
the consensus of the Working Group for all this time, would be extremely 
disruptive. As an AC Rep, I would not like to have my company participate 
in Working Groups under the understanding that the requirements may be 
changed years after consensus is achieved and the requirements are posted.

Michael Kay wrote:

>It's not true that there's a lack of industry experience. Database
>technology is mature and well understood, and the requirements on query
>languages, both from a user perspective and an implementation perspective,
>are well established. The user community is sufficiently mature that a
>minimal solution without the features they expect in database languages
>would not be well received. Many of the companies participating in the
>exercise, including my own, have been in the database business for many
>years and cannot be accused of not understanding the user requirements or
>the technology. OK: applying these ideas to XML is relatively new, but my
>company has had an XML database in the field for 3 years and our users are
>not slow to tell us about missing features.

And our Working Group certainly has its share of well-known experts in the 
database and programming language communities.

>Furthermore, this specification's size and complexity make it inevitable
>that its arrival will be delayed by amounts of time that seem unreasonable
>to those on the outside looking in.  This will cause problems because
>vendors who need this functionality will release software based on unstable
>drafts, creating a combination of conversion and interoperability problems
>down the road.
>
>MK: yes, this is certainly a problem. Software AG is planning to release an
>implementation this year because we can't afford to wait any longer.
>Unfortunately, it's difficult to show that a radical change in approach is
>likely to lead to faster delivery.

I agree. People want XQuery to be released quickly. The spec is not 
finished. I think we are best off waiting until the spec is mature before 
we release it.

Much of the size of the XQuery document comes from the attention it gives 
to implementation details, such as the grammar and lexical analysis.


>TB:
>The size and complexity also ensure that when XQuery 1.0 finally arrives, it
>will be well-populated with bugs, some of which will be highly injurious to
>interoperability.
>
>MK: that's certainly a risk. But part of the reason it is taking so long is
>that we are being very thorough.

Yes - we have done a great deal to make implementation possible. For our 
grammar, we publish a working parser, with source code and an interactive 
web page that shows the parse tree for XQuery expressions. In our 
specification, I think we have put a great deal of effort into making our 
semantics explicit. The Formal Semantics are designed to make 
implementation easier for those who support strong static typing in their 
environments. We have also defined Basic XQuery, which is much less 
complex to implement than XQuery with schema import or static typing.

We are not done yet, but by the time we are done, I do predict that the 
spec will be nailed down much more precisely than other specs I have worked 
on in the W3C.

>TB 2. Spec Suite organization
>
>There needs to be an overview somewhere, a starting point, mostly tutorial
>in nature, that explains the relationships between XQuery, the data model,
>the use cases, the functions and operators, and XPath 2. Having read all of
>them at least in part, I remain fairly puzzled as to how they're supposed to
>fit together.
>
>MK: I agree with you entirely that the document set is poorly structured. It
>is designed for the convenience of the authors, not of the readers. This is
>a difficult problem to fix, because it isn't always possible to allocate
>work to editors in an optimum way, but I personally think we should try.

I also agree here. We have added this to the agenda of the Editors' list - 
a reorganization can't be done in the next publication, but there may well 
be a reorganization in the publication that follows.

>TB 3. Function of the "Data Model" and "Formal Semantics"
>TB 4. Overlapping material

I think this is largely the same point you made earlier. Reorganization and 
rethinking are very much in order.

The Formal Semantics document appeals to a particular audience - some 
implementors find it very useful, others do not.  The static typing of 
XQuery will not be well-specified without it, but the rest of the language 
should stand on its own, using the XQuery specification.

>TB 6. XML Schema Data Types and Duration
>
>The reliance on XML Schema basic types seems well-thought-through, although
>the comprehensibility and ease of implementation of XQuery would be greatly
>increased by dropping support for some number of XSD basic types, without,
>it seems, much serious loss of functionality.
>
>MK: I think we've got the balance roughly right on this. Our support for the
>lesser-used of the 19 primitive types is absolutely minimal, and withdrawing
>this support would remove about six paragraphs from the specs, which hardly
>seems worth the trouble.

I agree with Mike Kay that we have the balance right on this. We certainly 
spent a great deal of time on the issue, and have reduced our support from 
what we had previously in the Functions and Operators spec.

>TB 7. PIs and Comments
>
>The inclusion of Comment and PI in XQuery is further evidence of lack of
>attention to 80/20 thinking and cost/benefit trade-offs.
>
>MK: I disagree with you on this. As an XML database vendor...

In fact, it's pretty clear that some people think that support for 
comments, PIs, and CDATA sections is essential for their applications, and 
other people find them unnecessary. We reached consensus to support them. 
The DOM and XSLT both support them.

XPath 1.0 also supports comment() and processing-instruction() as node 
tests. That seems like pretty good precedent.
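
To show what that precedent looks like from the DOM side, here is a 
minimal sketch using Python's standard xml.dom.minidom. The sample 
document and the helper name special_nodes are invented for this 
illustration. The DOM exposes comments and processing instructions as 
ordinary nodes, so an application that needs them can find them.

```python
from xml.dom import Node, minidom

# Hypothetical sample document, invented for illustration only.
DOC = minidom.parseString(
    "<?render mode='draft'?>"
    "<!-- reviewed by editor -->"
    "<memo><!-- inline note -->text</memo>"
)


def special_nodes(node):
    """Collect every comment and processing instruction below node."""
    found = []
    for child in node.childNodes:
        if child.nodeType == Node.COMMENT_NODE:
            found.append(("comment", child.data))
        elif child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            found.append(("pi", child.target, child.data))
        # Recurse so that comments nested inside elements are found too.
        found.extend(special_nodes(child))
    return found


for entry in special_nodes(DOC):
    print(entry)
```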

>TB 8. Relation to Schema Languages
>
>At the moment, by conscious design choice traceable back to the requirements
>documents, XQuery is quite strongly linked to W3C XML Schemas in several
>ways.
>
>In retrospect, this choice was unfortunate.  Fortunately, the situation can
>be rectified at moderate cost and with considerable benefit.
>
>MK: I agree with you that XML Schema is horribly over-complex. I don't agree
>that we can manage without it. There is no easy solution to this problem.


W3C XML Schema is the only typed schema language that has been widely 
adopted in commercial products. We need a typed schema language for our 
type system and data model. We also support merely-well-formed documents 
and documents governed by a DTD.
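
A loose analogy for why type information matters, sketched in Python 
rather than XQuery: the sample data, the DECLARED_TYPES mapping (a stand-in 
for what a schema would declare), and the function name sorted_values are 
all assumptions made for this illustration, and the sketch does not 
reproduce XQuery's actual rules for untyped data.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical sample data, invented for illustration only.
DOC = ET.fromstring(
    "<prices><price>9</price><price>10</price><price>100</price></prices>"
)

# Stand-in for the type information a schema would supply (an assumption).
DECLARED_TYPES = {"price": Decimal}


def sorted_values(doc, typed):
    """Return the price values in sorted order, typed or as plain text."""
    out = []
    for el in doc.findall("price"):
        # Without type information, the content is just a string.
        convert = DECLARED_TYPES[el.tag] if typed else str
        out.append(convert(el.text))
    return sorted(out)


print(sorted_values(DOC, typed=False))  # compared as strings
print(sorted_values(DOC, typed=True))   # compared as decimals
```

With only the text available, "10" and "100" sort before "9". Once the 
values are known to be decimals, they sort numerically.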

Jonathan

[1] http://www.w3.org/TR/2001/WD-xmlquery-use-cases-20010215
[2] http://www.w3.org/TR/2000/WD-xmlquery-req-20000131
[3] http://www.w3.org/TR/xmlquery-use-cases/#strong
Received on Friday, 18 October 2002 15:34:55 UTC