comments on XML-QUERY Requirements Document from Jim Davis on 2000-02-11 (www-xml-query-comments@w3.org from February 2000)

From: Jim Davis <jrd3@alum.mit.edu>
Date: Fri, 11 Feb 2000 16:14:49 +0100
To: Massimo Marchiori <massimo@w3.org>, www-xml-query-comments@w3.org
Cc: www-webdav-dasl@w3.org
Message-Id: <4.1.20000211153514.009fa130(null)>
Thank you for the opportunity to comment on the requirements document.
These comments are my own opinions.  I speak as one with extensive
experience in interoperable digital libraries.  I am not representing my
employer or the DASL design team.

The most important comment is that I strongly urge you to design the query
language in such a way as to allow for well-defined subsets of the
functionality.  I believe it would take substantial effort to implement a
query processor that meets all the requirements you have outlined, and
hence I would not expect to find high quality, widely available
implementations soon.  By contrast, I think that many applications would be
enabled by processors that supported only some, but not all, the
functionality you outline.  It would be a pity if such applications were
ruled out because of not implementing the full set of functions.

As for the specific requirements you outline, there are some whose
justification seems less than fully obvious to me.

3.2.1 query language syntax:

Why MUST the query language syntax be convenient for humans to read and
write?  Do you expect humans to type raw queries on keyboards?  Of the two
widely used query languages now in existence (SQL and Z39.50) neither one
is typed directly by humans.  (I concede that one *can* type SQL
expressions, but contend that in the great majority of cases a program
acts as intermediary.)  Similarly, although one *could* enter an HTTP
request by hand (and sometimes one does this, while debugging), in the
overwhelming majority of cases it is done via a program.

Constraining a syntax to be "human friendly" is bound to introduce all
kinds of unneeded complexity into the language and its processors.
Consider the lesson of SGML for example.  The designers of SGML put in
features to make SGML easier to type, but these features made processing
SGML complex.  By contrast, XML syntax is easy to process.

In addition, even if the query syntax is human readable, the results will
surely be XML data.  If you consider raw XML to be "human friendly", then
why not make the query also in XML?  If you think it is not friendly, then
what good is it to make the query human friendly?

3.3.3 Collections

If collections are not part of the current XML Infoset, how can you
possibly design a query language to support them?  

3.3.4 References

Does support for references also mean that an XML query processor MUST
chase references to documents that are outside the "domain" of that
processor?   See also 3.4.11

3.4.5 Combination

Does this refer to combination when formulating a reply, or when
determining which documents (or subtrees thereof) match the query?  Is
this part of structural transformation?  It would be helpful to have a use
case motivating this requirement.

3.4.6 Aggregation

While I appreciate that aggregation is often valuable, is it truly the case
that if XML query did not support it, it would be utterly devoid of value?
I don't deny the value of aggregation, but why is this essential?   Or put
another way, if you are going to require aggregation, what is the set of
mandatory summary operators?  mean? std deviation?  max?  min? 
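
To make the question concrete, here is a minimal sketch in Python of the
four operators I list above.  The sample values are invented, and the
choice of the sample (rather than population) standard deviation is my own
assumption; it is exactly the kind of detail a mandatory operator set would
have to pin down.

```python
import statistics

# Invented sample values standing in for a numeric field taken from
# the elements that matched a query.
sizes = [120, 259, 300, 512]

summary = {
    "mean": statistics.mean(sizes),
    # Assumption: sample standard deviation. statistics.pstdev() would
    # give the population figure instead, and the two differ.
    "std_deviation": statistics.stdev(sizes),
    "max": max(sizes),
    "min": min(sizes),
}

for name, value in summary.items():
    print(name, value)
```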

3.4.10 Structural Transformation

As with aggregation, I do not understand why this is mandatory.  If this
feature were omitted, a client could still do the transformation client
side, and the only price would be that some extra data would have been
transmitted.  All things being equal, the cost of doing the transformation
is better paid by the clients since there are many of them, and not by the
query processor.

Does making this mandatory enable some non-trivial economy I am simply not
aware of?
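
As a rough illustration of the client-side alternative, the sketch below
(Python, standard library only) takes an untransformed reply and
restructures it on the client.  The reply format, the element names, and
the function to_title_list are all hypothetical; nothing here is drawn
from the requirements document.

```python
import xml.etree.ElementTree as ET

# Hypothetical reply from a query processor that returns matching
# elements unchanged; element and attribute names are invented.
REPLY = """
<results>
  <book year="1999"><title>XML Basics</title><author>A. Writer</author></book>
  <book year="2000"><title>Query Notes</title><author>B. Reader</author></book>
</results>
"""

def to_title_list(reply_xml):
    """Restructure the reply on the client: keep only the titles,
    carrying each book's year over as an attribute."""
    source = ET.fromstring(reply_xml)
    target = ET.Element("titles")
    for book in source.findall("book"):
        item = ET.SubElement(target, "title", year=book.get("year"))
        item.text = book.findtext("title")
    return target

transformed = to_title_list(REPLY)
print(ET.tostring(transformed, encoding="unicode"))
```

The author elements are transmitted and then discarded, which is the
"extra data" cost mentioned above.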

3.4.13 Literals

I don't understand why this is only a SHOULD and not a MUST.  Perhaps I
don't understand your language, but it seems to me that the very least a
query language must provide is the ability to pass literal values in for
comparison, e.g. (equals name "T Lee") or (Greater size 259).
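
For illustration only, here is a minimal Python sketch of what evaluating
such comparisons against literal values might look like.  The operator
names simply mirror my informal examples above; the triple representation,
the OPERATORS table, and the matches function are assumptions of mine, not
anything proposed in the requirements.

```python
import operator

# Hypothetical operator table. The names mirror the informal examples
# above and are not taken from any specification.
OPERATORS = {
    "equals": operator.eq,
    "greater": operator.gt,
}

def matches(expr, record):
    """Evaluate an (operator, field, literal) triple against a dict.

    Operator names are matched case-insensitively; a record that lacks
    the field never matches.
    """
    op_name, field, literal = expr
    compare = OPERATORS[op_name.lower()]
    if field not in record:
        return False
    return compare(record[field], literal)

record = {"name": "T Lee", "size": 300}
print(matches(("equals", "name", "T Lee"), record))
print(matches(("Greater", "size", 259), record))
```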

3.4.14 operations on names

Likewise.  I would assume that it would be very common to want to test
an XML element's name or attribute value, etc.

I notice that you have no requirement for sorting the results, unless that
falls under transformations.  Did I miss it?

Finally, do you have any requirements for internationalization?  For
example, one might want to ensure that when two natural language strings
are compared, the appropriate national language rules are used.
There may or may not be similar issues for quantities such as dates; I am
too ignorant to know.
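
To show why the comparison rules matter, here is a small Python sketch in
which the same three words sort differently under two conventions.  The
two key functions are hand-written simplifications of my own (Swedish
alphabetical order places å, ä, ö after z; German dictionary order treats
ö as o); real collation rules are far more detailed.

```python
# Hand-written alphabet for illustration only.
SWEDISH = "abcdefghijklmnopqrstuvwxyzåäö"

def swedish_key(word):
    """Sort key placing å, ä, ö after z.
    Raises ValueError for characters outside the toy alphabet."""
    return [SWEDISH.index(ch) for ch in word.lower()]

def german_key(word):
    """Sort key folding ä, ö, ü to a, o, u before comparing."""
    return word.lower().translate(str.maketrans("äöü", "aou"))

words = ["zebra", "öl", "ost"]
print(sorted(words, key=swedish_key))  # ['ost', 'zebra', 'öl']
print(sorted(words, key=german_key))   # ['öl', 'ost', 'zebra']
```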

thank you again for the chance to comment.

best regards

Jim
Received on Friday, 11 February 2000 12:05:58 UTC