- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Fri, 11 Mar 2005 13:46:54 +0000
- To: Dan Connolly <connolly@w3.org>
- Cc: DAWG Mailing List <public-rdf-dawg@w3.org>
Dan Connolly wrote: > > Recent comments traffic suggests a SORT feature/requirement... > I asked for a use case, and got... > > [[[ > Use case: obtain a given number of most-recent items from a > triplestore-based RSS aggregator. (Something along the lines of > http://pubsub.com which aggregates data from several million feeds - > they use ASN.1 internally btw, though expose XML interfaces). > ]]] > -- > http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Mar/ > 0002.html > > Anybody like the idea of addressing that use case in SPARQL? > Any design ideas or implementation experience with sorting? > > Hmm... hard to argue that it's a bad idea; I think the discussion we've > had is more of the form "it can wait"; i.e. perhaps it belongs on the > issues list as "postponed". > > Thoughts? > I think we need to do due diligence on this one. This is not the first time it has been requested and we need to give a full response. Putting aside whether we do address the matter in this working group or postpone, I thought I'd take a pass over what it might involve. 1/ Syntax Only for SELECT queries. SQL ORDER BY would be a starting inspiration: ORDER BY ?x ?y or ORDER BY ?x ASC ?y DESC SQL has comma'ed lists so maybe something a liitle clerer would be: ORDER BY ASC(?x) DESC(?y) ORDER BY ?x(ASC) ?y(DESC) 2/ Defining order RDF is semistructured and unconstrained as the types of values that can appear - unlike SQL, we can't guarantee a column in the result table is always the same type so we need to define what happens. The simplest solution is to make sorting unlike things (that is, no "<" operation between them) are an error, causing query execution abort with no results . That is a consistent but rather savage given that not all RDF out there is well written. Because not every pair needs to be compared in a sort, it might give rise to strange corner cases. A variation (more a reframing than a different design) is to only define what happens in the case of comparable items and leave the rest to implementations. We could impose an arbitrary ordering amongst unlike things: something like (off the top of my head): + undef (NULL) < bNodes < URIs < literals + Literals of unknown type compare on lexical form unless the processor has additional knowledge. + numbers and plain strings compare on lexical form. Are there other design dimensions? Andy
Received on Friday, 11 March 2005 13:47:08 UTC