Re: [XQuery] IBM-XQ-007: Last step in a path expression from Daniela Florescu on 2004-02-12 (public-qt-comments@w3.org from February 2004)

From: Daniela Florescu <danielaf@bea.com>
Date: Wed, 11 Feb 2004 17:31:43 -0800
To: Don Chamberlin <chamberl@almaden.ibm.com>
Cc: public-qt-comments@w3.org
Message-Id: <36E74C54-5CFB-11D8-9286-0003937198F4@bea.com>

Don,

But in this new proposal, under which conditions would you apply sorting
by document order and duplicate elimination?

What if the dynamic answer contains a mixture of nodes and values?

And what if the statically inferred type contains both nodes and values?
Don't you want to know at compile time whether you have to do a sort or not?

I am not sure I understand the proposal as written.

Best regards
Dana


On Feb 11, 2004, at 3:50 PM, Don Chamberlin wrote:

>
> (IBM-XQ-007) Section 3.2 (Path Expressions): The definition of a path  
> expression should be revised to remove the restriction that the  
> expression on the right side of "/" must return a sequence of nodes.  
> The restriction should be retained for the expression on the left side  
> of "/". In effect, this would permit the last step in a path to return  
> one or more atomic values. This feature has recently been requested by  
> Sarah Wilkin
> (http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0100.html)
> who proposes the following rule: When evaluating E1/E2, if
> each evaluation of E2 returns a sequence of nodes, they are combined  
> in document order, removing duplicates; if each evaluation of E2  
> returns a sequence of atomic values, the sequences are concatenated in  
> the order generated; otherwise a type error is raised. Like all type  
> errors, this error can be raised either statically or dynamically,  
> depending on the implementation. This rule provides well-defined  
> static and dynamic semantics for path expressions.
>
> To illustrate the usability advantages of this proposal, consider a  
> document containing "employee" elements, each of which has child  
> elements "dept", "salary", and "bonus". To find the largest total pay  
> (salary + bonus) of all the employees in the Toy department, here is  
> what I think many users will write:
>
> max( //employee[dept = "Toy"]/(salary + bonus) )
>
> Unfortunately in our current language this is an error because the  
> final step in the path does not return a sequence of nodes. The user  
> is forced to write the following:
>
> max( for $e in //employee[dept = "Toy"] return ($e/salary + $e/bonus) )
>
> This expression is complex and error-prone (users will forget the  
> parentheses or will forget to use the bound variables inside the  
> return clause). There is no reason why this query cannot be expressed  
> in a more straightforward way. Users will try to write it as a path  
> expression and will not understand why it fails.
>
> Another very common example is the use of data() to extract the typed  
> value from the last step in a path, as in this case:  
>  //book[isbn="1234567"]/price/data().  This very reasonable expression  
> is also an error and the user is forced to write  
> data(//book[isbn="1234567"]/price).
>
> Note that I am NOT asking for a general-purpose mapping operator,  
> which I think is not in general needed since we already have a  
> for-expression. Instead, I think we should simply relax the unnatural  
> and unnecessary restriction that is currently placed on path  
> expressions. This will remove a frequent source of errors and will  
> improve the usefulness of path expressions, without precluding us from  
> introducing a general-purpose mapping operator later if a consensus  
> emerges to do so.
>
> --Don Chamberlin
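
Illustrative sketch (not part of either message): the combination rule quoted above for E1/E2 can be modelled roughly as follows. This is a minimal Python model, not XQuery and not any implementation's actual behavior. The names `Node`, `eval_path`, and the integer `order` field standing in for document order are all assumptions made for the illustration. Under the rule as quoted, a mixture of nodes and atomic values falls into the "otherwise" branch and raises a type error; the sketch covers only the dynamic case and says nothing about static inference.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Union


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for an XML node; `order` models document order."""
    order: int
    name: str


Atomic = Union[int, float, str]
Item = Union[Node, Atomic]


def eval_path(left: Sequence[Node], step: Callable[[Node], List[Item]]) -> List[Item]:
    """Model of E1/E2: evaluate `step` (E2) once for each node produced by E1."""
    # Flatten the per-node results, keeping the order in which they were generated.
    items = [item for node in left for item in step(node)]

    if all(isinstance(item, Node) for item in items):
        # Only nodes: combine in document order, removing duplicates.
        unique = {node.order: node for node in items}
        return [unique[key] for key in sorted(unique)]

    if not any(isinstance(item, Node) for item in items):
        # Only atomic values: concatenate in the order generated, no sort.
        return items

    # Mixture of nodes and atomic values: the "otherwise" case of the rule.
    raise TypeError("E2 returned a mixture of nodes and atomic values")
```

In this model the decision whether to sort depends on the items actually returned, so it is made at run time. An implementation that wanted to make the decision at compile time would need the static type of E2 to be either all nodes or all atomic values.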

Received on Wednesday, 11 February 2004 20:31:18 UTC