Re: XSchema integration, responsiveness, and a good solution to the problem from Jonathan Robie on 2002-10-24 (public-qt-comments@w3.org from October 2002)

From: Jonathan Robie <jonathan.robie@datadirect-technologies.com>
Date: Thu, 24 Oct 2002 14:36:23 -0400
To: Tim Bray <tbray@textuality.com>
Cc: "Kay, Michael" <Michael.Kay@softwareag.com>, public-qt-comments@w3.org
Message-Id: <5.1.0.14.0.20021024135648.04feb850@ncmail.datadirect-technologies.com>

At 08:35 PM 10/23/2002 -0400, Tim Bray wrote:
>Jonathan Robie wrote:
>
>>Most of this functionality is
>>implied by the requirements in our original requirements document, which
>>was first published in January 2000 [2]. There was strong agreement in
>>the Working Group to accept these use cases, and we have been proceeding
>>on this basis ever since then. The only major functionality added to the
>>use cases since then is support for W3C XML Schema - we now have a use
>>case that illustrates this functionality [3],
>
>... ah, I see we are in agreement then.

Yes, we had formal requirements to support XML Schema in our requirements 
document all along, but we did not have use cases to illustrate this. I 
agree with you that requirements should be illustrated by use cases, but in 
this imperfect world, we did not have a use case for strongly typed data 
until the current draft.

>>In your latest message on the XML Query Comments list, you indicate that
>>your preferred solution would be to ship Basic XQuery, which basically
>>leaves out support for W3C XML Schema complex types. I am assuming that
>>most of the complexity you are talking about here involves support for
>>complex types and the static semantics associated with this. Is that true?
>
>Indeed.  It would be an unalloyed blessing for the community, for users, 
>for vendors, and for the XQuery WG to move XQuery Basic to last call 
>forthwith and release it as XQuery 1.0 - with a bit of work you should be 
>able to get to Recommendation approximately a year earlier than if you 
>proceed with support for all the complex-type rocket science.

That was probably true at the time we got started, but I doubt that it's 
true now. I suspect that we are at the point that we will be done faster by 
moving ahead in our current direction than by trying to rip out complex 
types. Many of us think we can now see the light at the end of the tunnel.

In XQuery, complex types play the role of objects in object oriented 
languages. For instance, in the following function, it is important that we 
can ensure that an element is of the correct type in order to know that it 
will have a city, state, and zip code:

import schema "ipo.xsd"
import schema "zips.xsd"
declare namespace ipo="http://www.example.com/IPO"
declare namespace zips="http://www.example.com/zips"

define function zip-ok(element of type ipo:USAddress $a)
   returns xs:boolean
{
   some $i in document("zips.xml")/zips:zips/zips:row
   satisfies $i/zips:city = $a/ipo:city
         and $i/zips:state = $a/ipo:state
         and $i/zips:zip = $a/ipo:zip
}

XQuery without any complex types would be like Java without any classes 
more specific than Object. Or rather, it would be an odd language in which 
classes could be declared and objects instantiated, but the language itself 
could not refer to the classes of the objects it constructs. For instance, 
without complex types the function above would have to be rewritten with an 
untyped parameter, as follows:

import schema "ipo.xsd"
import schema "zips.xsd"
declare namespace ipo="http://www.example.com/IPO"
declare namespace zips="http://www.example.com/zips"

define function zip-ok(element $a)
   returns xs:boolean
{
   some $i in document("zips.xml")/zips:zips/zips:row
   satisfies $i/zips:city = $a/ipo:city
         and $i/zips:state = $a/ipo:state
         and $i/zips:zip = $a/ipo:zip
}

In this strange language, it is OK to use path expressions to search for 
city, state, or zip, but it is not OK to write a function signature that 
makes it clear that this function is applied to an element whose type 
guarantees that a city, state, and zip will be present. That severely 
restricts the amount of reasoning the system can do about types, both for 
type safety and for optimization.
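
To make the Java analogy concrete, here is a minimal sketch of the same 
contrast. The names ZipRow, USAddress, ZipCheck, zipOk, and zipOkUntyped are 
hypothetical stand-ins invented for this illustration; they are not taken 
from ipo.xsd or zips.xsd, and the correspondence to XQuery is only 
approximate.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one zips:row entry.
final class ZipRow {
    final String city, state, zip;
    ZipRow(String city, String state, String zip) {
        this.city = city; this.state = state; this.zip = zip;
    }
}

// Hypothetical stand-in for the ipo:USAddress complex type.
final class USAddress {
    final String city, state, zip;
    USAddress(String city, String state, String zip) {
        this.city = city; this.state = state; this.zip = zip;
    }
}

final class ZipCheck {
    // Typed signature: the compiler guarantees that city, state,
    // and zip exist on the argument before the code ever runs.
    static boolean zipOk(USAddress a, List<ZipRow> rows) {
        for (ZipRow r : rows) {
            if (r.city.equals(a.city) && r.state.equals(a.state)
                    && r.zip.equals(a.zip)) {
                return true;
            }
        }
        return false;
    }

    // "Nothing more specific than Object": any value is accepted,
    // so a wrong argument is only noticed (or silently yields
    // false) while the check is running.
    static boolean zipOkUntyped(Object a, List<ZipRow> rows) {
        if (!(a instanceof Map)) {
            return false;
        }
        Map<?, ?> m = (Map<?, ?>) a;
        for (ZipRow r : rows) {
            if (r.city.equals(m.get("city")) && r.state.equals(m.get("state"))
                    && r.zip.equals(m.get("zip"))) {
                return true;
            }
        }
        return false;
    }
}
```

A call such as ZipCheck.zipOk("not an address", rows) is rejected at compile 
time, whereas ZipCheck.zipOkUntyped("not an address", rows) compiles and 
simply returns false.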

The "rocket science" to which you refer is probably the Formal Semantics. 
The main reason for this work is to describe the static typing of queries. 
This is not for end-users, and perhaps not for many implementors. Some 
implementors consider the Formal Semantics crucial, others disagree. That 
is one reason that the static typing feature is optional in XQuery. There 
are some parts of the Formal Semantics that seem very important, such as 
Section 8, which shows how to map from a W3C XML Schema to the much simpler 
and more orthogonal representations we use internally in XQuery.

Incidentally, I do agree with you that we may need to support more schema 
languages in the future, though it does seem that DTDs, merely-well-formed 
XML, and W3C XML Schema are the main sources of data in the current world. 
For DTDs and merely-well-formed XML, we are defining mappings into our 
typed data model. RELAX NG does not support type annotations, which makes 
it hard for us to support it as anything more than merely-well-formed data, 
but the RELAX NG folks are working on type annotation even as we speak. I 
have exchanged email with several people in that group and offered opinions 
on the requirements for mapping RELAX NG into the XPath/XQuery data model. 
But I think that support for RELAX NG is something that can be done after 
XQuery 1.0 is released.

>>We have been very careful to publish our requirements and use cases very
>>early in the life of the Working Group. Changing our requirements and
>>use cases, which have both been published for quite a long time, and
>>have had the consensus of the Working Group for all this time, would be
>>extremely disruptive. As an AC Rep, I would not like to have my company
>>participate in Working Groups under the understanding that the
>>requirements may be changed years after consensus is achieved and the
>>requirements are posted.
>
>I agree, which is why I found the massive amount of specification-ware to 
>support functionality unsupported by use-cases to be so very highly 
>questionable, and why I'm concerned about the ex post facto insertion of 
>use cases after the design is already done.  -Tim

The use cases for strong typing were not added in order to introduce a new 
set of requirements or new work. The requirement to support both simple and 
complex types was there in our very first requirements document, and the 
Formal Semantics work has also been a visible part of our work from day 
one. I agree that we should have had the use cases for strong typing much 
earlier in the process, and we could probably have moved faster if we had 
done so.

Received on Thursday, 24 October 2002 14:37:01 UTC