RE: New "order by" clause

I wrote: 

I don't see it stated explicitly, but I suspect that the motivation for
going to "order by" is to allow easy mapping of XQuery expressions onto
an underlying relational database.  Would you say that this is the case?

Jonathan responds:

Not only onto relational databases, but also indexes in native XML
databases. FWIW, I have worked for three different companies that do
native XML databases, and I care about implementability on both native
XML and relational stores.

My response:

I, too, am currently working on a native XML database.  In our
implementation, the sorting facility doesn't rely on indexing, so I
guess I don't see why there should be a connection.  Won't any XQuery
implementation need to have a general-purpose sorting facility (for
either "sort by" or "order by")?  And if that is the case, aren't we
just arguing about syntax?

Consider this degenerate case:

	(<word>this</word>, <word>that</word>) sort by(.)

If you express this as

	for $x in (<word>this</word>, <word>that</word>) 
	order by ($x)
	return $x

then an implementation won't be able to use indexing to perform the
sort anyway.  And if you don't support this particular usage, are you
supporting the entire XQuery language?

I wrote: 

And is that a good rationale for designing a language feature?

Jonathan responds:

Efficient implementability in the range of environments where we expect
our language to be deployed? Yes, I would say that's a good rationale.

My response:

See the above example.  Not all expressions can be ordered by using
indexes.

I wrote:

I found "sortby" (or "sort by") to be intuitively pretty simple, and
more general than "order by".  The latter complicates simple queries;
I can't say
 
        document("mystuff.xml")//name sortby(.)

Jonathan responds:

Right, you would say:
 
	for $n in document("mystuff.xml")//name
	order by $n/name
	return $n

My response:

I agree that it's possible.  What I don't agree with is that this is a
"concise and easily understood" way to express this.  It is more
familiar to a SQL programmer, perhaps, if that's a design goal.

Jonathan writes:

This was discussed at length, and we examined quite a few queries along
these lines. One proposal was to retain both sort by () and the order by
clause, defining the one in terms of the other. When the vote came,
people seemed to prefer having only one way to sort.
 
Let me make sure I understand your position - would you prefer having
two, or would you prefer that we remove 'order by' altogether?

My response:

I would prefer the removal of "order by".  I think it's less general
than "sort by", more verbose and harder to understand in simple cases,
and doesn't add any functionality (unless I've overlooked something, you
can always apply a separate "sort by" to each "for" expression in nested
FLWR statements to get the same result as a multi-level "order by"; or
you can sort the output of the FLWR, which you can't with "order by").

It is possible that my opinion is colored by the fact that my company
has a released product that implements "sortby" (not "sort by", that
wasn't in the August draft), and that I don't want to have to re-work it
(or, especially, to rewrite the manuals :-).

But I'm not convinced that making XQuery more SQL-like really serves any
higher design purpose; this language is fundamentally based around
sequences, not tuples created by joins, and sorting a sequence seems
very natural.  Forcing the programmer to iterate over the sequence, when
all he wants to do is to sort it, does not.

I appreciate your comments,

-- Jim Davies (jdavies@neocore.com)

Received on Tuesday, 19 November 2002 17:06:34 UTC