- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Fri, 18 Mar 2005 16:45:49 +0000
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
An update on a possible ORDER BY clause for SPARQL.
1/ Process model.
First, we have a processing model of three stages:
A - A query has a pattern that generates a sequence of solutions.
B - There are some modifiers to this sequence, applied in this order:
Projection, ORDER BY, DISTINCT, LIMIT, OFFSET.
C - Process the modified sequence of solutions by the result forms.
It does make some sense for CONSTRUCT and DESCRIBE to have ORDER BY because of
slicing the results with LIMIT/OFFSET.
DISTINCT is only for SELECT; it is a no-op otherwise, and the grammar currently
does not allow it elsewhere. Its position next to SELECT is merely for
familiarity from SQL; the word could instead go after the pattern in the query,
like the other modifiers.
"OFFSET 0" is the no-op. With OFFSET n, results start after the first n
solutions have been skipped.
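The three stages above can be sketched in a few lines of Python. This is only an illustration of stage B under some assumptions of mine: solutions are plain dicts, the ordering is passed in as a single key function, and LIMIT/OFFSET are treated as one slice (skip OFFSET solutions, then keep at most LIMIT). The name apply_modifiers and its parameters are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional, Sequence

Solution = Dict[str, object]  # variable name -> bound value


def apply_modifiers(
    solutions: Iterable[Solution],
    project: Optional[Sequence[str]] = None,
    order_key: Optional[Callable[[Solution], object]] = None,
    distinct: bool = False,
    limit: Optional[int] = None,
    offset: int = 0,
) -> List[Solution]:
    """Stage B: apply the sequence modifiers in the order listed above."""
    seq = list(solutions)
    # Projection: keep only the named variables (None means SELECT *).
    if project is not None:
        seq = [{v: s[v] for v in project if v in s} for s in seq]
    # ORDER BY: sorted() is stable, so ties keep their incoming order.
    if order_key is not None:
        seq = sorted(seq, key=order_key)
    # DISTINCT: drop repeated solutions, keeping the first occurrence.
    if distinct:
        seen = set()
        unique = []
        for s in seq:
            fingerprint = tuple(sorted(s.items()))
            if fingerprint not in seen:
                seen.add(fingerprint)
                unique.append(s)
        seq = unique
    # LIMIT/OFFSET: skip `offset` solutions, then keep at most `limit`.
    end = None if limit is None else offset + limit
    return seq[offset:end]
```

With no modifiers set (OFFSET 0, no LIMIT) the function returns the pattern's solutions unchanged, which matches the no-op reading above.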
2/ Ordering
Ordering is by a list of criteria, applied in the order given in the query. A
criterion is a function (and a simple case is just a variable) together with a
modifier for ascending or descending ordering.
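A minimal sketch of ordering by a list of criteria, assuming each criterion is a (key function, descending flag) pair and solutions are dicts. The names order_by and Criterion are mine, not proposed syntax. It relies on Python's sort being stable.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Solution = Dict[str, object]
# A criterion pairs a key function with a flag: True means descending.
Criterion = Tuple[Callable[[Solution], object], bool]


def order_by(
    solutions: Sequence[Solution], criteria: Sequence[Criterion]
) -> List[Solution]:
    """Sort by each criterion in turn; earlier criteria take precedence."""
    ordered = list(solutions)
    # Apply the least significant criterion first. Because list.sort is
    # stable, each later pass preserves the order set by the earlier ones
    # among solutions that tie on the more significant key.
    for key, descending in reversed(criteria):
        ordered.sort(key=key, reverse=descending)
    return ordered
```

Solutions that tie on every criterion keep their incoming order, so the result is only as deterministic as the input sequence.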
The ordering criteria may not completely order the solution sequence.
e.g.
SELECT ?a ?b
...
ORDER BY ?a
or even
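SELECT ?a
ORDER BY ?a
because "03"^^xsd:integer and "3"^^xsd:integer have the same value but are
different terms, so ordering by value leaves their relative order open.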
There is a requirement that there is always a consistent order applied (not
different each time) so that LIMIT/OFFSET work as slices.
One way is a default set of ordering rules that can always be applied to any
solutions. This would be based on further arbitrary ordering rules. See rq23
for some notes.
The other is just to leave it at arbitrary-but-consistent.
The only case I can see for the arbitrary/consistent approach would be
significant implementation gains but they would have to be proven first.
Therefore, I suggest going with the completely specified order and asking for LC
(or WG) feedback.
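To illustrate what a completely specified order could look like for the "03"/"3" case, here is a rough sketch. It assumes literals are (lexical form, datatype) pairs and handles only xsd:integer; the tie-breaking rule (lexical form, then datatype IRI) is an arbitrary choice of mine, not something taken from rq23.

```python
from typing import Tuple

# A typed literal as (lexical form, datatype); a hypothetical representation.
Literal = Tuple[str, str]

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def total_order_key(lit: Literal) -> Tuple[int, str, str]:
    """Order xsd:integer literals by value, then break ties deterministically."""
    lexical, datatype = lit
    # Primary key: the integer value, so "03" and "3" compare equal here.
    # Tie-breakers: lexical form, then datatype IRI (arbitrary but fixed).
    return (int(lexical), lexical, datatype)
```

Sorting with this key gives the same sequence whatever order the solutions arrive in, which is what LIMIT/OFFSET slicing needs.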
3/ Syntax
SQL has an "ORDER BY" clause; its modifiers are "ASC" and "DESC".
XQuery also has an "order by" clause, which spells the modifiers out in full:
"ascending" and "descending". Each system can take an expression or a column
name (in SQL's case also a column number).
Proposed syntax (examples):
SELECT *
WHERE { :x :p ?v . :x :q ?w }
ORDER BY ?v ?w
LIMIT 10
OFFSET 10
Confusion point: SPARQL does not use commas, so I omitted them here too, but then
ORDER BY ?v DESCENDING ?w
is confusing (it means descending in ?v, ascending in ?w, but is very easy to
misread).
Alternative syntax: break from SQL and XQuery and have
DESC(?v) ?w
or some other clear association of modifier with expression.
I prefer the (non-SQL) DESC(?x), ASC(xsd:integer(?x)) style for clarity.
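To show that the DESC(...)/ASC(...) style is unambiguous without commas, here is a rough sketch of splitting an ORDER BY body into conditions. It is not a real SPARQL parser: the function name is hypothetical, and it assumes conditions are separated by whitespace outside parentheses and that a bare expression means ascending.

```python
from typing import List, Tuple


def parse_order_conditions(text: str) -> List[Tuple[str, bool]]:
    """Split an ORDER BY body into (expression, descending) pairs."""
    # Split on whitespace, but only outside parentheses, so that a call
    # such as app:f(?x, ?y) stays in one piece.
    parts: List[str] = []
    depth = 0
    current = ""
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch.isspace() and depth == 0:
            if current:
                parts.append(current)
            current = ""
        else:
            current += ch
    if current:
        parts.append(current)

    conditions = []
    for part in parts:
        # DESC(...) and ASC(...) wrap an expression; anything else is ascending.
        if part.startswith("DESC(") and part.endswith(")"):
            conditions.append((part[5:-1], True))
        elif part.startswith("ASC(") and part.endswith(")"):
            conditions.append((part[4:-1], False))
        else:
            conditions.append((part, False))
    return conditions
```

Each modifier is attached to exactly one expression by its parentheses, so "DESC(?v) ?w" cannot be read as applying to ?w.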
4/ Ordering Expressions
We have a requirement (3.3) for extensible value testing. Therefore, I put in
expressions for ordering (as in XQuery and SQL), which allow types unknown to the
core language to be ordered. This also allows casting (useful for dates in
non-xsd:dateTime format or older RDF without datatypes).
ORDER BY xsd:integer(?v)
ORDER BY app:cordOrder(?x, ?y)
Such an ordering function must not cause an evaluation failure. If it does, the
outcome is undetermined: the query may return no results, some results, or all
the results in some junk order.
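A small sketch of why casting in the ordering expression matters, using Python's int() as a stand-in for xsd:integer(?v) on untyped literals. The function name is hypothetical.

```python
from typing import List


def order_as_integers(lexical_forms: List[str]) -> List[str]:
    """Order plain literals by integer value, mimicking ORDER BY xsd:integer(?v)."""
    # int() raises ValueError for a form such as "abc"; that corresponds to
    # the evaluation failure that an ordering function must not cause.
    return sorted(lexical_forms, key=int)
```

Sorting "10", "9", "100" as strings puts "9" last; with the cast they come out in numeric order. What a query processor does when the cast fails is left open above; this sketch simply lets the exception propagate.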
5/ Misc
XQuery also has "empty greatest" and "empty least" and collation "name".
For us, there is more than just the empty case (no value, bNodes, URIs and
strings), so I propose picking a fixed relationship between them.
Collation is covered by what we do elsewhere.
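As a sketch of what a fixed relationship between the kinds might look like: the ranking below (no value, then bNodes, then URIs, then strings) is only an assumption for illustration, as is the (kind, text) representation of a term.

```python
from typing import Optional, Tuple

# Hypothetical term representation: (kind, text), or None for no value.
Term = Optional[Tuple[str, str]]

# An assumed ranking, chosen only for illustration.
KIND_RANK = {"bnode": 1, "uri": 2, "string": 3}


def kind_order_key(term: Term) -> Tuple[int, str]:
    """Rank no-value first, then blank nodes, URIs and strings."""
    if term is None:
        return (0, "")
    kind, text = term
    # Within one kind, fall back to comparing the text.
    return (KIND_RANK[kind], text)
```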
Andy
Received on Friday, 18 March 2005 16:46:19 UTC