XSchema integration, responsiveness, and a good solution to the problem

This is motivated by Ashok Malhotra's message entitled 'Response to Tim 
Bray's comment "PSVI Considered Dangerous"' at

http://lists.w3.org/Archives/Public/public-qt-comments/2002Sep/0019.html

which is the XQuery WG's official response to my submission at
  http://lists.w3.org/Archives/Public/public-qt-comments/2002Jul/0007.html

For brevity I'll refer to this as "Ashok's message" even though I 
understand it's a group production.

The problem: Ashok's message is non-responsive to the point that it 
would not be remotely acceptable as part of a CR-stage "resolution of 
comments".

The solution: In the (very good) August 16th draft of XQuery at
  http://www.w3.org/TR/2002/WD-xquery-20020816/
section 2.5.2.1 defines "Basic XQuery".  The XQuery WG should seriously 
consider shipping this as "XQuery 1.0", and taking their time developing 
a second specification, "XQuery, XML Schema, and Complex Typing".  This 
would allow XQuery 1.0 to achieve Recommendation status by year-end or 
early in the new year with a great increase in quality and interoperability.

===details======================================

One of the weirder anomalies about Ashok's message is its title, which 
claims that my comment was "PSVI Considered Dangerous".  This is deeply 
puzzling since the only mention of the PSVI in my message is an 
observation that XQuery gives a procedure for how to construct its data 
model using the PSVI as input, and that this isn't a problem.  This is 
only one of the things about the message that makes me wonder whether 
the comments were actually read or given any serious consideration by
the XQuery working group.

Another concern about XQuery's response:  My original message contained 
8 separate points; Ashok's message ignores 1 through 7 and focuses on
the remaining one alone.  This is troubling because some of the other points are
closely related, for example the troubling lack of use-cases for any of 
the schema/type facilities, and the immensely long time it is taking to 
release XQuery, which will have negative consequences for interoperability.

In the following, I use "XQuery" to mean the whole suite of XML 
Query-centric specifications, and "XSD" to refer to W3C XML Schema.

Ashok:
 >In his comment Tim voices concern that XQuery is dependent on XML
 >Schema, both because of the size and complexity of XML Schema, and
 >because of the potential that the use of other schema languages will
 >make interoperability of XQuery problematic.  We respond to these
 >comments below.

This summary silently bypasses several of the problems I pointed out, in 
particular XSD's unsuitability for particular application classes and 
the high cost of cross-specification dependencies.  Further evidence of
non-responsiveness.

Ashok:
 >XQuery made the decision to support XML Schema, in spite of the
 >complexity that implied.  Our choice was made for several reasons:
 >* there was strong W3C guidance in favor of re-use
 >  between the different working groups and recommendations,
 >* the belief that many (most) of the information that would
 >  be queried would be typed by XML Schema.
 >* the belief that we would better support interoperability if we
 >  avoided the temptation to do our own language,
 >* and the belief that a tight coupling with XML Schema would, in
 >  fact, benefit both standards.
 >
 >It is worth expanding on the last point: there are features of XML
 >Schema that may or may not become widely used; surely this will be
 >influenced, in part, by whether or not other tools can respect and
 >exploit them. While this is clearly a decision for the Architecture
 >team, we feel rather strongly that both XQuery and XML Schema are made
 >more effective by mutual support, and that lack of support for XML
 >Schema features in XQuery can only hurt XML Schema acceptance and use.

Taking the points out of order:

I grant the first point: that the XQuery WG perceived it had received 
strong W3C guidance to use XSchema.

The final point is also process/policy related and nontechnical: that 
XQuery's use of XSD would be good for XSD.  I think it's highly 
questionable that the technical design of one W3C spec should be 
compromised in order to help the acceptance of another, but as Ashok 
notes, this (and the previous point) is an architecture issue and I have 
raised it with W3C TAG.

The third point (the WG not inventing their own language) is good common 
sense, but entirely unrelated to what I suggested.

That only leaves one *technical* argument in favor of wiring XQuery to 
XML Schema: the belief outlined in the second point that "many (most) of
the information that would be queried would be typed by XML Schema."
This seems like a highly risky and speculative two-step assumption. 
First, that a majority of the XML instances in the world will be 
schema-typed at all, and second, that of those, a majority will use XML 
Schema.  Is there any data behind this assumption or is it simply an 
intuition on the part of the WG?

I am highly unimpressed by this style of argument-by-assertion.  In
another part of my lengthy multi-part XQuery comment, I noted that 
XQuery was generously provided with use-cases for almost all features of 
the language that did not involve the type system, and had almost no 
use-cases for any part of the language that did involve the type system.
I note that now the XQuery WG is working, ex post facto, on some use
cases for the type query facilities, after having designed those 
facilities.  Does this not raise questions in anyone's mind?

A couple of other points are worth visiting:

Ashok:
 >* We have designed the language so that a schema is not required,
 >  by carefully defining the contents of a data model and type
 >  annotations so that they could be constructed without the use
 >  of a schema, and by providing a conformance level that makes
 >  no use of XML Schema other than the primitive types.

This is true and I salute the XQuery WG's fine work here.  I suggest 
that they should publish this conformance level as XQuery 1.0, which 
would enable them to produce a much higher quality recommendation with 
less complexity, many fewer bugs, and many fewer inter-specification 
dependencies, and to do so much sooner.

Ashok:
 >The final question is whether or not XQuery would be better served with
 >a much simpler type system generally.  This is really a philosophical
 >discussion all to its own, but, suffice it to say here that the working
 >group has always felt that a strong, robust and powerful type system was
 >fundamental to achieving the goals we set out to achieve, both in terms
 >of enabling performance optimization and providing features to the user.
 >The main question for us was whether to align this with XML Schema or
 >not, and we have done so for the reasons alluded above.

First, the existence of "Basic XQuery" in the most recent draft 
demonstrates clearly that a highly usable facility can be built without 
excessive linkage to complex types or XSchema.  Second, assuming you
are correct, it is also clear that the *cost* of linkage to complex 
types and XSchema, expressed in terms of years of WG time, is very high, 
and that the cost/benefit ratio needs revisiting.  Finally, I suggest 
that there is no basis of industry experience from which to infer that
XML Schema provides a "strong, robust, and powerful type system".

Ashok:
 >Finally we cannot, realistically, adopt a "configurable" type system
 >that would support a variety of different type systems.  This requires a
 >great deal of research in a difficult area.  Again, we do not think this is an
 >appropriate activity for the XML Query WG and undertaking it would lead
 >to significant delay.

I can only describe the above as outrageously non-responsive.  I would 
ask those concerned to look at the last few paragraphs of my message 
where I provide detailed, specific suggestions for how XQuery could be 
made schema-language-agnostic.  These go entirely unaddressed beyond a 
suggestion that I'm calling for a "configurable" type system and that 
this would require research.  I make two claims:

1. Given a query system which has callouts to an XSD processor to 
provide the functions described in XQuery, it ought to be 
straightforward to replace this with any schema processor that deals in 
terms of types identified by qnames.
2. If I am wrong and it is impossible to define named-type-based 
querying without tight integration with a particular schema language, 
this is a powerful argument for removing such facilities from V1.0 of 
XQuery.
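
To make claim 1 concrete, here is a minimal Python sketch of what such a
callout boundary might look like.  Nothing in it comes from the XQuery
drafts: SchemaProcessor, TableSchemaProcessor, select_by_type and the
example.com names are all hypothetical, and the two operations shown
(look up an element's type, test type derivation) are only my assumption
about what a query engine would need to ask.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class QName:
    """A namespace-qualified type or element name."""
    namespace: str
    local: str


class SchemaProcessor(Protocol):
    """Hypothetical callout interface: all the query engine asks of a schema language."""

    def type_of(self, element_name: QName) -> QName: ...

    def derives_from(self, sub: QName, base: QName) -> bool: ...


class TableSchemaProcessor:
    """Toy processor backed by lookup tables; stands in for any real schema engine."""

    def __init__(self, element_types, base_types):
        self._element_types = element_types  # element QName -> type QName
        self._base_types = base_types        # type QName -> base type QName

    def type_of(self, element_name):
        return self._element_types[element_name]

    def derives_from(self, sub, base):
        # Walk up the (assumed acyclic) derivation chain until we hit `base` or run out.
        current = sub
        while current is not None:
            if current == base:
                return True
            current = self._base_types.get(current)
        return False


def select_by_type(elements, wanted, schema):
    """Keep the elements whose declared type is `wanted` or derives from it.

    The engine sees only QNames and the SchemaProcessor interface.
    """
    return [e for e in elements if schema.derives_from(schema.type_of(e), wanted)]


# Illustrative data only; the namespace and names are made up.
NS = "http://example.com/po"
address, us_address, string = QName(NS, "Address"), QName(NS, "USAddress"), QName(NS, "string")
ship_to, comment = QName(NS, "shipTo"), QName(NS, "comment")
schema = TableSchemaProcessor(
    {ship_to: us_address, comment: string},
    {us_address: address},
)
matches = select_by_type([ship_to, comment], address, schema)
```

Any object offering the same two methods could be passed as the schema
argument, whichever schema language produced its tables; select_by_type
would not change.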

Conclusion

Dear XQuery WG: I am one of your fans.  I think that you have done 
terrific work that has a high chance of being rapidly implemented in the 
field and changing the world.  The complex, weighty type apparatus has, 
however, cost you years in delivery time (and will likely cost more), 
has immensely increased the difficulty of the implementers' task, and 
amounts to a bet on a radical set of new and unproven technologies.

Basic XQuery, however, is based on a technology that has already proven 
itself in the field: XPath.  The extension of this from addressing to 
querying semantics is something that the XQuery drafts have already done 
a good job on; you should be proud of yourselves, and you should ship 
XQuery 1.0 before the end of the year (entirely possible if you back off 
the typing complexity) and then sit back to see which way the confused,
unstable world of schemas and typing for XML evolves.  -Tim

Received on Monday, 14 October 2002 04:44:29 UTC