Re: COMMENT? Re: Pagination in SPARQL OFFSET and LIMIT needs ORDER BY

Hi Jerven,

Thanks for your patience. This response addresses your comments:
  http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2011May/0016.html
  http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2011May/0017.html
 
>  Dear workgroup,
>  
>  I was recently made aware that there is no easy way to get a guaranteed working pagination.
>   
>  i.e. QUERY OFFSET 0 LIMIT 5 page 1
>       QUERY OFFSET 5 LIMIT 5 page 2
>       QUERY OFFSET 10 LIMIT 5 page 3
>  
>  Without adding an ORDER BY clause. Adding any kind of ORDER BY clause would be enough to 
>  ensure pagination worked. I would therefore like to see an  ORDER BY * or ORDER BY ANY 
>  option. To ensure that the results come  in some implementation specific order and that 
>  this can be used to show all possible results.
>  
>  Trying a few public current SPARQL implementations. With ORDER BY * showed that this is 
>  currently not implemented. Although pagination with OFFSET and LIMIT without an ORDER BY 
>  clause  seems to work as a naive user (e.g. me) would expect. Meaning that for current 
>  SPARQL implementers it is no work at all other than dealing with a slightly different 
>  SPARQL grammar.
>  
>  Pagination guaranteed to succeed would then be 
>  
>  i.e. QUERY OFFSET 0 LIMIT 5 ORDER BY ANY page 1
>       QUERY OFFSET 5 LIMIT 5 ORDER BY ANY page 2
>       QUERY OFFSET 10 LIMIT 5 ORDER BY ANY page 3
>  
>  The other option is to expand the description of the OFFSET clause. For example the use 
>  of the OFFSET clause should guarantee that query results come back in a consistent order.
>  
>  I hope this concern makes sense.
>  
>  Regards,
>  Jerven

You are right that the spec does not require pagination via OFFSET and 
LIMIT to be reproducible unless it is combined with a total order over 
the results.

First of all, note that a shortcut like "ORDER BY *" as you suggest would not 
guarantee a predictable total order of results, because 
a) ORDER BY does not guarantee a total order, 
   cf. http://www.w3.org/TR/sparql11-query/#modOrderBy, and 
b) when blank nodes are returned, for instance, two separate calls are not 
   guaranteed to return the same blank node identifiers.
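
To illustrate point a), here is a minimal sketch in Python rather than 
SPARQL. The data and the helper name `page` are hypothetical and only mimic 
OFFSET/LIMIT over an ordered sequence. Both runs respect the same sort key 
but break ties differently, so pages taken from separate executions can 
overlap or skip rows:

```python
def page(rows, offset, limit):
    """Mimic OFFSET/LIMIT applied to an already ordered result sequence."""
    return rows[offset:offset + limit]

# Hypothetical solutions: (name, age). Two rows tie on the sort key `age`.
solutions = [("alice", 30), ("bob", 25), ("carol", 30), ("dave", 40)]

# Both orderings are valid answers to "ORDER BY ?age": tied rows may come
# back in either order, depending on the implementation or the execution.
run_1 = sorted(solutions, key=lambda r: r[1])
run_2 = sorted(reversed(solutions), key=lambda r: r[1])

# Page 1 is taken from the first execution, page 2 from the second.
page_1 = page(run_1, 0, 2)
page_2 = page(run_2, 2, 2)
combined = page_1 + page_2
```

In this sketch "alice" appears on both pages and "carol" appears on neither, 
although each execution on its own is correctly ordered by age.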

Further, the proposed feature/behavior is beyond the current scope of our 
charter [1]. When we discussed the selection of features to be addressed 
in this round of SPARQL, a related proposal was on the table [2] but didn't 
find a majority within the group. (Our selection of additional features/extensions 
to be addressed in SPARQL1.1 was based on support and available resources within 
the group.)

Note that another reason why we do not support this as a query language feature 
is that it should rather be considered a protocol issue: existing HTTP mechanisms 
apply, such as ETags for consistency and ranges for slicing. Client-side 
paging off a stream of results is also a candidate mechanism.
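
Client-side paging could look roughly like the following Python sketch. It 
assumes the client reads one result stream from a single query execution; 
the function name `pages` and the sample stream are hypothetical stand-ins:

```python
from itertools import islice

def pages(stream, size):
    """Yield successive lists of at most `size` items from one result stream."""
    it = iter(stream)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# Stand-in for the rows produced by a single query execution.
result_stream = (f"row{i}" for i in range(7))
all_pages = list(pages(result_stream, 3))
```

Because every page is cut from the same execution, the pages cannot overlap 
or skip rows, whatever order the server happened to produce.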

Cursors, paging, (transactions) etc. are about controlling the flow of results 
and about results over multiple requests, not about defining results.

This clear-cut separation is also important in the context of SPARQL1.1 Update: 
given potential interactions with updates, system restarts, and anything else 
that makes the server lose state, simply re-executing the query may produce 
different answers even in a deterministic query processor.

So, we are afraid that at this stage, and with the remaining resources of the 
working group, we won't be able to address this suggestion.

We would be grateful if you would acknowledge that your comment has been 
answered by sending a reply to this mailing list.

with best regards,
Axel, on behalf of the SPARQL WG

1. http://www.w3.org/2009/05/sparql-phase-II-charter.html
2. http://www.w3.org/2009/sparql/wiki/Feature:Cursors


On 6 Oct 2011, at 22:09, Jerven Bolleman wrote:

> Dear Workgroup,
> 
> I have not seen any discussion of my suggestion the OFFSET works deterministically. Has this been discussed in the workgroup and been disregarded as a bad idea or breaking compatibility? Or has my comment/question slipped through the cracks? Or is it now to late as July 29th has passed?
> 
> Regards,
> Jerven
> 

Received on Sunday, 23 October 2011 09:49:52 UTC