RE: New "order by" clause

At 10:13 AM 11/19/2002 +0100, Bas de Bakker wrote:

>You seem to assume that the "sort by" is applied to the whole FLWR
>expression.

Well, I assume that the sort by *may* be applied to the whole FLWR 
expression, and often is in queries I see written by a variety of people.

>While this is allowed, I usually use it on the "for" input
>as in my examples in my previous message (or even without a FLWR
>expression at all):
>
>for $x in ... sort by (...)
>return ...
>
>I realize this may be because I'm aware that it is easier to optimize.

Yes, this is a good way of dealing with the way most implementations work, 
but it does amount to writing your queries in a particular way because you 
know that sort by is, in the general case, hard to optimize when element 
constructors intervene between input and output.

>Of course, I can do this with "order by", too.  But "sort by" is easier
>to use in other contexts.  Instead of
>
>Expr1 sort by (Expr2)
>
>I now have to write
>
>for $x in Expr1
>order by $x/Expr2
>return $x

True - and we discussed whether to keep sort by in addition to the order by 
clause. Many people felt that two different ways to do sorting was too much 
in a 1.0 release.

>And you will have to spend more time,
>because there is no formal semantics for "order by" yet, which will
>require doing something with tuples and prohibits normalizing FLWR
>expressions to one "for" clause each.  (And, in my experience, if the
>formal semantics are more difficult, rewriting for optimization purposes
>is usually more difficult, too.)

I'll leave this for one of the formal semantics people to answer. To me, 
though, optimization should be based on a model that allows interesting 
orders to be selected from the input.

> > I'm still not at all sure what a 'group' is or should be in
> > the XQuery data model,
>
>I never said I liked the data model.  On the contrary, I think it
>severely limits the expressiveness of the language.  The difficulty of
>defining "group by" is the most obvious example of this.



> > or what class of problems your users want solved under
> > the concept
> > of 'group by'. Could you fill me in on how you see this?
>
>I don't know whether I can add much to other public comments on this
>topic.  The problem is that a query like
>
>for $author in distinct-values(/items/books/book/author)
>let $books := /items/books/book[author = $author]
>return <author name="{$author}">
>{ for $b in $books
>   return <title>{$b/name}</title>
>}</author>
>
>is awkward (though admittedly possible) to write, because you need to
>repeat information, in this case the "/items/books/book/author" path.
>This would be easier to write and optimize with a grouping construct
>like
>
>group $books in /items/books/book
>by value $author := ./author
>return ...
>
>where the return clause is evaluated once per distinct author, with
>$books set to a sequence of all books with that author.

Your solution works nicely for this one example, but I'm not sure if it 
generalizes. Suppose I want to group by the author's last name, then first 
name. How would I do this?

Jonathan

Received on Tuesday, 19 November 2002 08:04:16 UTC