- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Fri, 18 Mar 2005 16:45:49 +0000
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
An update on a possible ORDER BY clause for SPARQL.

1/ Process model

First, we have a processing model of three stages:

A - A query has a pattern that generates a sequence of solutions.
B - There are some modifiers to this sequence (in this order):
    Projection, ORDER BY, DISTINCT, LIMIT, OFFSET.
C - Process the modified sequence of solutions by the result forms.

It does make some sense for CONSTRUCT and DESCRIBE to have ORDER BY because of slicing the results with LIMIT/OFFSET.

DISTINCT is only for SELECT; it's a no-op otherwise and is currently not allowed by the grammar. This is merely for familiarity from SQL - the word could go after the pattern in the query like the other modifiers.

"OFFSET 0" is the no-op. OFFSET applies after that many solutions have been skipped.

2/ Ordering

Ordering is by a list of criteria, applied in the order given in the query. A criterion is a function (a simple case is just a variable) together with a modifier for ascending or descending ordering.

The ordering criteria may not completely order the solution sequence, e.g.

    SELECT ?a ?b ... ORDER BY ?a

or even

    SELECT ?a ORDER BY ?a

because of "03"^^xsd:integer and "3"^^xsd:integer.

There is a requirement that a consistent order is always applied (not a different one each time) so that LIMIT/OFFSET work as slices.

One way is a default set of ordering rules that can always be applied to any solutions. This would be based on further arbitrary ordering rules. See rq23 for some notes.

The other is just to leave it at arbitrary-but-consistent.

The only case I can see for the arbitrary/consistent approach would be significant implementation gains, but they would have to be proven first. Therefore, I suggest going with the completely specified order and asking for LC (or WG) feedback.

3/ Syntax

SQL has an "ORDER BY" clause. XQuery also has an "order by" clause; it specifies the modifiers in full: "ascending" and "descending".

Each system can take an expression or a column name (in SQL's case also a number).

Proposed syntax (examples):

    SELECT *
    WHERE { :x :p ?v . :x :q ?w }
    ORDER BY ?v ?w
    LIMIT 10
    OFFSET 10

Confusion point: SPARQL does not have commas so I omitted them here too, but then

    ORDER BY ?v DESCENDING ?w

is confusing (it means descending-in-?v, ascending-in-?w, but is very easy to misread).

Alternative syntax: break from SQL and XQuery and have

    DESC(?v) ?w

or some other clear association of modifier with expression.

I prefer the (non-SQL) DESC(?x), ASC(xsd:integer(?x)) style for clarity.

4/ Ordering Expressions

We have a requirement (3.3) for extensible value testing. Therefore, I put in expressions for ordering (like XQuery and SQL), which allows types unknown to the core language to be ordered. This also allows casting (useful for dates in non-xsd:dateTime format or older RDF without datatypes).

    ORDER BY xsd:integer(?v)
    ORDER BY app:cordOrder(?x, ?y)

Such an ordering function must not cause an evaluation failure. If it does, it is not determined whether no results, some results, or all the results in some junk order are returned.

5/ Misc

XQuery also has "empty greatest" and "empty least" and collation "name". For us there are more cases than just the empty one (no value, bNodes, URIs and strings), so I propose picking a fixed relationship. Collation is covered by what we do elsewhere.

Andy
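
To illustrate sections 2 and 4, here is a rough sketch in Python (not SPARQL, and not taken from any implementation). It orders solutions by a list of criteria, each a function plus an ascending/descending flag, and falls back to an arbitrary-but-fixed tie-break so that slices are repeatable. Everything in it is an assumption made for illustration: the names order_solutions and slice_solutions, the representation of a solution as a dict of lexical forms, the use of repr() as the fallback rule, and the choice that OFFSET skips first and LIMIT then caps the count.

```python
from functools import cmp_to_key


def compare_values(a, b):
    """Three-way comparison; None (no value) sorts before any value."""
    if a is None or b is None:
        return (a is not None) - (b is not None)
    return (a > b) - (a < b)


def order_solutions(solutions, criteria):
    """Sort solutions by a list of (key_function, descending) criteria.

    Criteria are tried in the order given. If all of them tie, a final
    comparison on a canonical text form of the whole solution is used,
    so the result does not depend on the input order.
    """
    def compare(s1, s2):
        for key, descending in criteria:
            c = compare_values(key(s1), key(s2))
            if c != 0:
                return -c if descending else c
        # Hypothetical fallback rule: arbitrary, but always the same.
        r1 = repr(sorted(s1.items()))
        r2 = repr(sorted(s2.items()))
        return (r1 > r2) - (r1 < r2)

    return sorted(solutions, key=cmp_to_key(compare))


def slice_solutions(solutions, limit=None, offset=0):
    """Assumed slicing: skip `offset` solutions, then keep at most `limit`."""
    rest = solutions[offset:]
    return rest if limit is None else rest[:limit]


# "03" and "3" are equal as integers, so the criterion alone cannot order them.
solutions = [{"a": "3"}, {"a": "03"}, {"a": "10"}, {"a": "2"}]
as_integer = lambda s: int(s["a"])  # plays the role of xsd:integer(?a)

ascending = order_solutions(solutions, [(as_integer, False)])
descending = order_solutions(solutions, [(as_integer, True)])
page = slice_solutions(ascending, limit=2, offset=1)
```

With the fallback in place, feeding the same solutions in a different order gives the same sequence, which is the property LIMIT/OFFSET need to act as slices.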
Received on Friday, 18 March 2005 16:46:19 UTC