Re: GROUP BY [ok?] from Richard Newman on 2005-08-30 (public-rdf-dawg-comments@w3.org from August 2005)

From: Richard Newman <rich@holygoat.co.uk>
Date: Tue, 30 Aug 2005 11:55:08 -0700
To: "Bruce D'Arcus" <bdarcus@gmail.com>
Cc: public-rdf-dawg-comments@w3.org
Message-Id: <C881A492-857B-4FA7-B4AE-B01D80BE72D6@holygoat.co.uk>

On 30 Aug 2005, at 11:31, Bruce D'Arcus wrote:

>
> Well, OK, yes, I (an RDF novice, I will add) think it'd be nice if
> SPARQL had GROUP BY support a la SQL. I had simply assumed it'd
> been discussed and was possibly forthcoming, and was asking for an
> update. Since that seems not to be the case, I think it should be.
>
> Anyone else agree?
>

(Note: stripping a couple of addresses I know to be on the list.)

Quick summary to make sure I'm on the right track.

From what I can gather, GROUP BY places results into arbitrary
(named?) groups based on some criteria, presumably effectively a  
subquery. So, in your author/year example, you do the following:

- Group by author
- Sort groups by author
- Sort within groups by year.

In some SPARQL syntax we might say:

SELECT ?author ?year WHERE {
   ?x foaf:maker ?author ;
      dc:date ?year .
}
ORDER BY ?year
GROUP BY ?author ORDERED BY ?author

i.e., order all of these results by year, then pick them out into  
ordered groups by author (thus keeping the year ordering within those  
groups). I don't know how ORDER BY would work on groups, or whether  
this supports all the operations that would be necessary (such as  
nesting).

Other than the output being explicitly clustered, which eases processing,
I'm not sure what GROUP BY achieves that an ORDER BY on multiple sort
keys doesn't... it seems straightforward to me to do the manipulation on
the client when given a table, but then I don't use XSLT.
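
To make the client-side manipulation concrete, here is a minimal Python sketch. It assumes the query results have already been fetched as a flat table of (author, year) rows; the sample names and years are invented for illustration and are not from Bruce's data.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical result table: one (author, year) row per query solution.
rows = [
    ("Smith", 2003),
    ("Jones", 2005),
    ("Smith", 2001),
    ("Jones", 2002),
]

def group_by_author(rows):
    """Sort by (author, year), then cluster consecutive rows by author."""
    ordered = sorted(rows, key=itemgetter(0, 1))
    return [
        (author, [year for _, year in group])
        for author, group in groupby(ordered, key=itemgetter(0))
    ]

groups = group_by_author(rows)
```

Because groupby only clusters adjacent rows, the sort has to come first; a table that the query engine has already ordered by author and then year could skip the sorted() call.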

On one hand, some XMLey use cases (e.g., Bruce's) would seem to benefit
from explicit grouping. On the other hand:

• I don't think I'd ever use it
• it would complicate implementation of engines (including mine)
• it would complicate the results format (and thus all client tools)
• SPARQL is in Last Call.

Bruce, if you really want grouping, I'd suggest multiple queries:  
select your authors, then select their papers and years for each.  
Then you don't end up with n groups; you end up with n result sets
which can be treated individually.
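
A rough sketch of that multiple-query approach in Python follows. Everything here is an assumption for illustration: run_query stands in for whatever client call sends a query string to an engine and returns rows as dicts, authors are assumed to be identified by IRIs, and the foaf: and dc: prefix declarations are omitted, as in the query earlier in this message.

```python
# First query: the distinct authors, in the order the groups should appear.
AUTHORS_QUERY = """\
SELECT DISTINCT ?author WHERE {
   ?x foaf:maker ?author .
}
ORDER BY ?author
"""

def papers_query(author_iri):
    """Build the per-author query; assumes the author is an IRI."""
    return (
        "SELECT ?x ?year WHERE {\n"
        f"   ?x foaf:maker <{author_iri}> ;\n"
        "      dc:date ?year .\n"
        "}\n"
        "ORDER BY ?year\n"
    )

def result_sets_by_author(run_query):
    """Return [(author, rows), ...]: one independent result set per author.

    run_query is a hypothetical callable: query string in, list of
    dicts (variable name -> value) out.
    """
    authors = [row["author"] for row in run_query(AUTHORS_QUERY)]
    return [(author, run_query(papers_query(author))) for author in authors]
```

The cost is one round trip per author, which is the trade-off for keeping the query language and results format unchanged.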

I'd have to vote against a grouping operation at the moment.

-R

Received on Tuesday, 30 August 2005 18:55:53 UTC