Re: Sorting

From: Danny Ayers <danny.ayers@gmail.com>
Date: Wed, 9 Mar 2005 10:09:50 +0100
Message-ID: <1f2ed5cd05030901096f639fc8@mail.gmail.com>
To: Dan Connolly <connolly@w3.org>
Cc: public-rdf-dawg-comments@w3.org, Leigh Dodds <leigh@ldodds.com>

On Tue, 8 Mar 2005 17:21:40 -0600, Dan Connolly <connolly@w3.org> wrote:
> On Mar 8, 2005, at 3:49 PM, Danny Ayers wrote:
> > On Tue, 8 Mar 2005 15:08:16 -0600, Dan Connolly <connolly@w3.org>
> > wrote:
> >>
> >>> Is it the case then that SPARQL will not include an ORDER BY or
> >>> similar clause? If not, then would it be possible to elaborate
> >>> why?
> >>
> >> We adopted a LIMIT requirement over an objection (details below), but
> >> sorting has never had a critical mass of support. It competes with
> >> streaming results
> >> (http://www.w3.org/2001/sw/DataAccess/UseCases#r3.12). We haven't
> >> discussed any designs for sorting.
> >
> > I don't understand why sorting should compete with streaming - isn't
> > the transport at a different layer than order?
> 
> Er... no. The bytes coming over the wire (in the variable binding
> results) are ordered. In order to sort the results, you can't send
> one along as soon as you establish that it matches the query; you
> have to wait until you have all the results, since the last one you
> find might be the first one by the sort order.

Ah, ok.
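
To check my understanding, here is a minimal Python sketch of that
point. It is not anything from the SPARQL drafts; the function names
and the in-memory list standing in for a store are my own assumptions.
A streaming evaluator can hand over each match as it finds it, while a
sorting one has to scan everything before it can emit the first result.

```python
from typing import Callable, Iterable, Iterator, List

scanned: List[int] = []  # records which candidates have been examined so far


def scan(candidates: Iterable[int]) -> Iterator[int]:
    """Hypothetical store scan that logs each candidate as it is read."""
    for candidate in candidates:
        scanned.append(candidate)
        yield candidate


def stream_matches(candidates: Iterable[int],
                   matches: Callable[[int], bool]) -> Iterator[int]:
    """Emit each result as soon as it is known to match the query."""
    for candidate in candidates:
        if matches(candidate):
            yield candidate


def sorted_matches(candidates: Iterable[int],
                   matches: Callable[[int], bool]) -> Iterator[int]:
    """Emit results in sort order; nothing can be sent until all are found."""
    buffered = [c for c in candidates if matches(c)]  # consumes the whole scan
    buffered.sort()
    yield from buffered


def accept_all(_: int) -> bool:
    return True


# Streaming: the first result arrives after examining one candidate.
first_streamed = next(stream_matches(scan([3, 1, 2]), accept_all))
scanned_for_stream = list(scanned)

# Sorting: the first result arrives only after examining every candidate,
# because the last one found might sort first.
scanned.clear()
first_sorted = next(sorted_matches(scan([3, 1, 2]), accept_all))
scanned_for_sort = list(scanned)
```

In the streaming case only one candidate has been read when the first
result comes back; in the sorted case all three have.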

> >> You're welcome to elaborate on why you think it's important/required.
> >> Use cases are particularly welcome, especially use cases that
> >> argue for handling sorting in SPARQL rather than in a downstream
> >> component or client or XSLT engine or the like.
> >
> > Use case: obtain a given number of most-recent items from a
> > triplestore-based RSS aggregator. (Something along the lines of
> > http://pubsub.com which aggregates data from several million feeds -
> > they use ASN.1 internally btw, though expose XML interfaces).
> >
> > I may be missing something, but the most natural way I can think of
> > doing this is using a combination of ORDER BY and LIMIT.
> 
> True, to get the "most recent N" you need both ORDER BY
> and LIMIT. I think we discussed that case and there wasn't
> a sense that it was critical for SPARQL 1.0, but I don't know
> where that discussion is recorded, so I guess I'll ask again.

There is huge deployment of the RSS vocab (and its near-RDF
variants), and the most-recent-n operation is the first thing anyone
using the stuff is likely to want, which seems quite a strong argument
for ORDER BY & LIMIT (or a functionally equivalent alternative).
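
For what it's worth, here is a rough Python sketch of what "most
recent N" amounts to when done client-side, i.e. the combined effect of
ordering by date descending and then limiting to N. The Item type, its
fields, and the sample data are hypothetical, just for illustration;
nothing here is proposed SPARQL syntax.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Item(NamedTuple):
    """Hypothetical feed item; dates are ISO 8601 strings, which
    compare chronologically when compared as plain text."""
    title: str
    date: str


def most_recent(items: Iterable[Item], n: int) -> List[Item]:
    """Return the n newest items, newest first.

    Equivalent in effect to sorting by date descending and keeping the
    first n, but only holds n items in memory at a time.
    """
    return heapq.nlargest(n, items, key=lambda item: item.date)


# Made-up sample data, in arbitrary arrival order.
feed = [
    Item("older post", "2005-03-01T08:00:00Z"),
    Item("newest post", "2005-03-09T10:00:00Z"),
    Item("middle post", "2005-03-05T12:00:00Z"),
]

latest_two = most_recent(feed, 2)
```

Note that the client still has to receive every item to do this, which
is the cost of pushing the job downstream when the store holds several
million feeds' worth of data.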

Personally I'd favour their inclusion simply on grounds of easing
skill migration for SQL-savvy folks.

Cheers,
Danny.


-- 

http://dannyayers.com
Received on Wednesday, 9 March 2005 09:09:51 GMT