Response of the XML Query WG to comments by Jim Davis/DASL from Peter Fankhauser on 2000-03-30 (www-xml-query-comments@w3.org from March 2000)

From: Peter Fankhauser <fankhaus@darmstadt.gmd.de>
Date: Thu, 30 Mar 2000 15:25:45 +0200
To: www-xml-query-comments@w3.org, jrd3@alum.mit.edu, w3c-xml-query-wg@w3.org
Message-ID: <38E355D9.20402CB8@darmstadt.gmd.de>
Dear Jim Davis,

thank you for your comments on the XML Query Requirements
(http://www.w3.org/TR/2000/WD-xmlquery-req-20000131).
The working group has discussed them and has come to the
following conclusions.

Jim Davis wrote:

> Thank you for the opportunity to comment on the requirements document.
> These comments are my own opinions.  I speak as one with extensive
> experience in interoperable digital libraries.  I am not representing my
> employer or the DASL design team.
>
> The most important comment is that I strongly urge you to design the query
> language in such a  way as to allow for well-defined subsets of the
> functionality.  I believe it would take substantial effort to implement a
> query processor that meets all the requirements you have outlined, and
> hence I would not expect to find high quality, widely available
> implementations soon.  By contrast, I think that many applications would be
> enabled by processors that supported only some, but not all, the
> functionality you outline.  It would be a pity if such applications were
> ruled out because of not implementing the full set of functions.
>

Some systems need a fairly simple query language that does not go beyond
XPath, others will need a more powerful query language like the one being
developed by the XML Query Working Group. It is not yet clear to us whether
there are well-defined subsets of the query language that should be
defined, and this is not likely to be clear until the query language is
developed and we are better able to assess the cost/benefit ratio of
various subsets. We will consider this suggestion at that time.

> As for the specific requirements you outline, there are some whose
> justification seems less than fully obvious to me.
>
> 3.2.1 query language syntax:
>
> Why MUST the query language syntax be convenient for humans to read and
> write?  Do you expect humans to type raw queries on keyboards?  Of the two
> widely used query languages now in existence (SQL and Z39.50) neither one
> is typed directly by humans.  (I concede that one *can* type SQL
> expressions, but contend that in the great majority of cases, a program
> acts as intermediary.  Thus for example, although one *could* enter an HTTP
> request by hand (and sometimes one does this, while debugging) in the
> overwhelming majority of cases, it is done via a program.
>
> Constraining a syntax to be "human friendly" is bound to introduce all
> kinds of unneeded complexity into the language and its processors.
> Consider the lesson of SGML for example.  The designers of SGML put in
> features to make SGML easier to type, but these features made processing
> SGML complex.  By contrast, XML syntax is easy to process.
>
> In addition, even if the query syntax is human readable, the results will
> surely be XML data.  If you consider raw XML to be "human friendly", then
> why not make the query also in XML.  If you think it is not friendly, then
> what good is it to make the query human friendly?
>

Requirement 3.2.1 is twofold: "One query language
syntax MUST be XML in a way that reflects the underlying
structure of the query language." and "One query language
syntax MUST be convenient for humans to read and write." While
XML is reasonably "human friendly" for data structures and
document structures, it can get a bit burdensome for syntactic
expressions. Therefore, the group decided to keep open the
possibility of having more than one syntax binding. One possible
approach is to introduce (optional) syntactic shorthands
for certain modules of the language (such as path expressions).


> 3.3.3 Collections
>
> If collections are not part of the current XML Infoset, how can you
> possibly design a query language to support them?
>

Collections of documents and collections of information items
are regarded as important by the XML Query Working Group.
Query languages such as SQL and XPath derive much of their
declarative power from working with collections.


> 3.3.4 References
>
> Does support for references also mean that an XML query processor MUST
> chase references to documents that are outside the "domain" of that
> processor?   See also 3.4.11
>

Yes, the XML query processor MUST in principle be
able to chase references to documents outside its domain,
expressed, for example, by URIs. When they cannot be resolved,
the query processor MUST raise an error. In addition, XML Query
may provide mechanisms to constrain reference chasing.
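
As a rough illustration of this behavior, the following Python sketch
chases references and raises an error when one cannot be resolved. The
in-memory STORE, the <ref href="..."/> element, the UnresolvedReference
exception, and the max_depth limit are all assumptions made for this
example; they stand in for real retrieval by URI and for whatever
constraining mechanism XML Query may eventually provide.

```python
import xml.etree.ElementTree as ET

class UnresolvedReference(Exception):
    """Raised when a referenced document cannot be retrieved."""

# Hypothetical document store standing in for retrieval by URI.
STORE = {
    "http://example.org/a.xml":
        "<doc><ref href='http://example.org/b.xml'/></doc>",
    "http://example.org/b.xml":
        "<doc><ref href='http://example.org/missing.xml'/></doc>",
}

def chase(uri, max_depth=1, _depth=0):
    """Return the URIs reached from `uri`, following at most `max_depth` hops."""
    if uri not in STORE:
        # An unresolvable reference is an error, not a silent omission.
        raise UnresolvedReference(uri)
    reached = [uri]
    if _depth < max_depth:
        root = ET.fromstring(STORE[uri])
        for ref in root.iter("ref"):
            reached.extend(chase(ref.get("href"), max_depth, _depth + 1))
    return reached

print(chase("http://example.org/a.xml", max_depth=1))
```

With max_depth=1 only a.xml and b.xml are visited. With max_depth=2 the
dangling reference in b.xml is followed and UnresolvedReference is raised.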


>
> 3.4.5 Combination
>
> Does this refer to combination when formulating a reply, or when
> determining which documents (or subtrees there of) match the query?  Is
> this part of structural transformation?  It would be helpful to have a use
> case motivating this requirement.
>

Combination has been introduced as a more neutral
term for (relational) joins, which are restricted to combining
flat tables only. Combination is a specific kind of structural
transformation, but is regarded as important enough to be
covered by a separate requirement. For a use case, see, for
example, the combination query in David Maier's "Database
Desiderata for an XML Query Language"
(http://www.w3.org/TandS/QL/QL98/pp/maier.html).
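
To make the difference from a flat relational join concrete, here is a
minimal Python sketch of a combination that produces nested output. The
sample documents, the element and attribute names, and the combine
function are assumptions made for this example; they are not taken from
Maier's paper or from the requirements document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element and attribute names are assumptions.
AUTHORS = ET.fromstring(
    "<authors>"
    "<author id='a1'><name>Ann</name></author>"
    "<author id='a2'><name>Bob</name></author>"
    "</authors>"
)
BOOKS = ET.fromstring(
    "<books>"
    "<book author='a1'><title>Trees</title></book>"
    "<book author='a1'><title>Graphs</title></book>"
    "<book author='a2'><title>Sets</title></book>"
    "</books>"
)

def combine(authors, books):
    """Nest each author's book titles under that author."""
    # Index books by author key so each author needs only one lookup.
    by_author = {}
    for book in books.findall("book"):
        by_author.setdefault(book.get("author"), []).append(book)

    result = ET.Element("result")
    for author in authors.findall("author"):
        entry = ET.SubElement(result, "author")
        ET.SubElement(entry, "name").text = author.findtext("name")
        for book in by_author.get(author.get("id"), []):
            ET.SubElement(entry, "title").text = book.findtext("title")
    return result

combined = combine(AUTHORS, BOOKS)
print(ET.tostring(combined, encoding="unicode"))
```

A relational join would return one flat row per (author, book) pair. Here
each author appears once, with the matching titles nested below it.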


> 3.4.6 Aggregation
>
> While I appreciate that aggregation is often valuable, is it truly the case
> that if XML query did not support it, it would be utterly devoid of value?
> I don't deny the value of aggregation, but why is this essential?   Or put
> another way, if you are going to require aggregation, what is the set of
> mandatory summary operators?  mean? std deviation?  max?  min?
>
>

The aggregation facilities of SQL are regarded as
important by the XML Query Working Group.

> 3.4.10 Structural Transformation
>
> As for aggregation, I do not understand why this is mandatory.  If this
> feature were omitted, a client could still do the transformation client
> side, and the only price would be  that some extra data would have been
> transmitted.  All things being equal, the cost of doing the transformation
> is better paid by the clients since there are many of them, and not by the
> query processor.
>
> Does making this mandatory enable some non-trivial economy I am simply not
> aware of?
>

Some structural transformations, such as various
kinds of combination (joins), require specific index
structures and/or query optimization techniques, which cannot
be easily delegated to the client. Additionally, the transformations may
greatly reduce the volume of results to be returned to
the client.

>
> 3.4.13 Literals
>
> I don't understand why this is only a SHOULD and not a MUST.  Perhaps I
> don't understand your language, but it seems to me that the very least a
> query language must provide is the ability to pass literal values in for
> comparison, e.g. (equals name "T Lee") or (Greater size 259).
>

"Literal Data" in the context of the XML Query
Requirements document denotes XML fragments such as

<name><first>Jim</first><last>Davis</last></name>

Such literals are regarded as convenient but not essential,
because such fragments may be expressed in different
concrete syntaxes. The requirements document tries not to
commit to a particular concrete syntax.

Literals in the sense of constants (of a simple type) are
covered by requirement 3.4.1.

In the next version of the requirements document we will add a glossary term
for literal data:

"Literal Data": literal fragments of an XML document such as
<name><first>Jim</first><last>Doe</last></name>, which may
be used for comparison.
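
As an illustration of comparing against literal data in this sense, the
following Python sketch tests elements for structural equality with a
literal fragment. The same_fragment function, its notion of equality
(same tag, attributes, trimmed text, and children in order), and the
sample document are assumptions made for this example; the requirements
document does not prescribe them.

```python
import xml.etree.ElementTree as ET

def same_fragment(a, b):
    """Structural equality: same tag, attributes, text, and children in order."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "").strip() == (b.text or "").strip()
        and len(a) == len(b)
        and all(same_fragment(x, y) for x, y in zip(a, b))
    )

# The literal fragment used in the glossary entry.
LITERAL = ET.fromstring("<name><first>Jim</first><last>Doe</last></name>")

# Hypothetical document to search; its structure is an assumption.
DOC = ET.fromstring(
    "<people>"
    "<person><name><first>Jim</first><last>Doe</last></name></person>"
    "<person><name><first>Ann</first><last>Doe</last></name></person>"
    "</people>"
)

matching = [
    person for person in DOC.findall("person")
    if same_fragment(person.find("name"), LITERAL)
]
print(len(matching))
```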

>
> 3.4.14 operations on names
>
> likewise.  I would assume that that it would be very common to want to test
>  an XML elements name or attribute value, etc.

The XML Query WG agrees that simple operations on names, such as tests
for equality of element names, attribute names, and processing instruction
targets, and tests on the combination of names and data, are very common
and should be a MUST. We will change the requirement accordingly.
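
For illustration, the Python sketch below performs each of these tests
with the standard xml.dom.minidom module. The sample document and the
walk helper are assumptions made for this example, which shows the kind
of test meant rather than any XML Query syntax.

```python
from xml.dom import minidom, Node

# Hypothetical sample document; names and values are assumptions.
doc = minidom.parseString(
    "<?xml-stylesheet href='s.css'?>"
    "<doc lang='en'><name>T Lee</name><size>300</size></doc>"
)

def walk(node):
    """Yield a node and all of its descendants in document order."""
    yield node
    for child in node.childNodes:
        yield from walk(child)

nodes = list(walk(doc))
elements = [n for n in nodes if n.nodeType == Node.ELEMENT_NODE]

# Test on element names.
element_names = [e.tagName for e in elements]

# Test on attribute names.
with_lang = [e.tagName for e in elements if e.hasAttribute("lang")]

# Test on processing instruction targets.
pi_targets = [
    n.target for n in nodes
    if n.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]

# Test on the combination of a name and data.
named_t_lee = [
    e for e in elements
    if e.tagName == "name"
    and e.firstChild is not None
    and e.firstChild.nodeType == Node.TEXT_NODE
    and e.firstChild.data == "T Lee"
]

print(element_names, with_lang, pi_targets, len(named_t_lee))
```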

> I notice that you have no requirement for sorting the results, unless that
> falls under transformations.  Did I miss it?
>

The WG agrees that sorting deserves a separate requirement. We will add it
to the next version of the requirements document.

>
> finally, do you have any requirements for internationalization?  for
> example, one might want to ensure that when two natural language strings
> are compared, that the appropriate national language rules are used.
> There may or may not be similar issues for quantities such as dates, I am
> too ignorant to know.
>

The XML Query group is well aware of the ramifications of internationalization.
As indicated in Section 4 "Related Activities->Internationalization"
the XML Query group will work with the I18N WG and other WGs to make sure
that internationalization issues are covered adequately.

>
> thank you again for the chance to comment.
>
> best regards
>
> Jim
>

Best regards,

Peter Fankhauser
Received on Thursday, 30 March 2000 08:18:09 UTC