RE: [XQuery] IBM-XQ-007: Last step in a path expression from Michael Kay on 2004-02-12 (public-qt-comments@w3.org from February 2004)

From: Michael Kay <mhk@mhk.me.uk>
Date: Thu, 12 Feb 2004 10:24:07 -0000
To: "'Daniela Florescu'" <danielaf@bea.com>, "'Don Chamberlin'" <chamberl@almaden.ibm.com>
Cc: <public-qt-comments@w3.org>
Message-ID: <000001c3f152$664e0fb0$6401a8c0@pcukmka>
I do understand the proposal as written, and am ambivalent about it. We
do need a mapping operator, but I greatly prefer to use a new operator
("!"), because the semantics are sufficiently different from "/" as to
cause confusion if "/" is overloaded, and a general mapping operator
would also allow atomic values on the left.
 
It seems odd to me to allow people to do a/name() but not
a/name()/string-length().
 
When people call user-defined functions (or even system-defined
functions) they aren't always very knowledgeable about whether the
function is returning a set of strings or a set of text nodes, even if
they wrote the function themselves. It will be confusing if the two
cases behave differently.
 
Michael Kay

-----Original Message-----
From: public-qt-comments-request@w3.org
[mailto:public-qt-comments-request@w3.org] On Behalf Of Daniela Florescu
Sent: 12 February 2004 01:32
To: Don Chamberlin
Cc: public-qt-comments@w3.org
Subject: Re: [XQuery] IBM-XQ-007: Last step in a path expression


Don,

but in this new proposal, under which conditions would you apply sorting
by doc order and duplicate elimination?

What if the dynamic answer contains a mixture of nodes and values?

And what if the statically inferred type contains both nodes and values?
Don't you want to know at compile time if you have to do a sort or not?

I am not sure I understand the proposal as written.

Best regards
Dana


On Feb 11, 2004, at 3:50 PM, Don Chamberlin wrote:




(IBM-XQ-007) Section 3.2 (Path Expressions): The definition of a path
expression should be revised to remove the restriction that the
expression on the right side of "/" must return a sequence of nodes. The
restriction should be retained for the expression on the left side of
"/". In effect, this would permit the last step in a path to return one
or more atomic values. This feature has recently been requested by Sarah
Wilkin
(http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0100.html)
who proposes the following rule: When evaluating E1/E2, if each
evaluation of E2 returns a sequence of nodes, they are combined in
document order, removing duplicates; if each evaluation of E2 returns a
sequence of atomic values, the sequences are concatenated in the order
generated; otherwise a type error is raised. Like all type errors, this
error can be raised either statically or dynamically, depending on the
implementation. This rule provides well-defined static and dynamic
semantics for path expressions. 
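
The rule above can be modelled with a rough sketch in Python. This is only an illustration of the rule as stated, not an implementation from any XQuery processor. The `Node` class, its `pos` field (standing in for document order and node identity), and the function name `path_step` are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Union


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for an XML node; `pos` models document order."""
    pos: int
    name: str


Item = Union[Node, int, float, str]


def path_step(left: Sequence[Node], step: Callable[[Node], List[Item]]) -> List[Item]:
    """Model E1/E2: evaluate `step` (E2) once per node of `left` (E1)."""
    # Concatenate the results of every evaluation in the order generated.
    flat = [item for node in left for item in step(node)]
    if all(isinstance(item, Node) for item in flat):
        # Every evaluation returned nodes: remove duplicates, sort in document order.
        unique = {item.pos: item for item in flat}
        return [unique[p] for p in sorted(unique)]
    if not any(isinstance(item, Node) for item in flat):
        # Every evaluation returned atomic values: keep the concatenation as is.
        return flat
    # A mixture of nodes and atomic values is a type error under the rule.
    raise TypeError("E2 returned a mixture of nodes and atomic values")
```

In this model the choice between sorting and plain concatenation is made after all evaluations of E2 have been collected. An implementation with static typing could make the same decision at compile time whenever the inferred type of E2 is purely nodes or purely atomic values.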

To illustrate the usability advantages of this proposal, consider a
document containing "employee" elements, each of which has child
elements "dept", "salary", and "bonus". To find the largest total pay
(salary + bonus) of all the employees in the Toy department, here is
what I think many users will write: 

max( //employee[dept = "Toy"]/(salary + bonus) ) 

Unfortunately in our current language this is an error because the final
step in the path does not return a sequence of nodes. The user is forced
to write the following: 

max( for $e in //employee[dept = "Toy"] return ($e/salary + $e/bonus) ) 

This expression is complex and error-prone (users will forget the
parentheses or will forget to use the bound variables inside the return
clause). There is no reason why this query cannot be expressed in a more
straightforward way. Users will try to write it as a path expression and
will not understand why it fails. 
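
As a loose analogy only, the two formulations can be compared in Python. The sample data, the dictionary layout, and all function names below are hypothetical; the point is just that mapping the last step over each employee computes the same values as the explicit for-expression.

```python
# Hypothetical sample data standing in for the "employee" elements.
employees = [
    {"dept": "Toy", "salary": 30000, "bonus": 2000},
    {"dept": "Toy", "salary": 28000, "bonus": 5000},
    {"dept": "Shoe", "salary": 50000, "bonus": 1000},
]


def toy_employees():
    """Analogue of //employee[dept = "Toy"]."""
    return [e for e in employees if e["dept"] == "Toy"]


def max_pay_for_style():
    """Analogue of the for-expression: bind each employee, then add."""
    totals = []
    for e in toy_employees():
        totals.append(e["salary"] + e["bonus"])
    return max(totals)


def max_pay_path_style():
    """Analogue of the proposed path form: apply the last step to each employee."""
    def last_step(e):
        # The last step yields a sequence of atomic values, here a single sum.
        return [e["salary"] + e["bonus"]]

    return max(v for e in toy_employees() for v in last_step(e))
```

Both functions agree on this data, which is the equivalence the proposal relies on when the last step yields only atomic values.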

Another very common example is the use of data() to extract the typed
value from the last step in a path, as in this case:
//book[isbn="1234567"]/price/data().  This very reasonable expression is
also an error and the user is forced to write
data(//book[isbn="1234567"]/price). 

Note that I am NOT asking for a general-purpose mapping operator, which
I think is not in general needed since we already have a for-expression.
Instead, I think we should simply relax the unnatural and unnecessary
restriction that is currently placed on path expressions. This will
remove a frequent source of errors and will improve the usefulness of
path expressions, without precluding us from introducing a
general-purpose mapping operator later if a consensus emerges to do so. 

--Don Chamberlin
Received on Thursday, 12 February 2004 05:24:01 UTC