Re: grouping by expressions

On 2010-11-03, at 10:05, Andy Seaborne wrote:
> On 03/11/10 07:11, Steve Harris wrote:
>> On 3 Nov 2010, at 02:42, Gregory Williams <greg@evilfunhouse.com> wrote:
>> 
>>> On Nov 2, 2010, at 5:06 PM, Lee Feigenbaum wrote:
>>> 
>>>> I believe there are likely three options:
>>>> 
>>>> 1) To project grouping expressions, use BIND to alias the expression to a variable and then GROUP BY and project that variable (as above).
>>>> 
>>>> 2) Include an AS aliasing mechanism in GROUP BY, and allow that alias to be projected in the SELECT clause.
>>>> 
>>>> 3) Allow SELECT list aliases to be used in the GROUP BY expression
>>>> 
>>>> Can people please indicate on the mailing list which direction they'd like us to go on this, and we will then wrap this up on next Tuesday's telecon?
>>> 
>>> 3 seems backwards to me -- not really sure how it would work. I currently implement 2 and am happy with it, but 1 would seem to be reasonable also.
>> 
>> Agreed that 3 seems odd.
>> 
>> Preference for 1 as we're going to have that mechanism anyway. Allowing AS in GROUP BY as well seems excessive, and will further complicate the algebra in that area.
>> 
>> - Steve
> 
> My preference is for 2.
> 
> It reduces the query author burden and is consistent with the style of explicit naming of used expressions we have in SELECT expressions.
> 
> This isn't about the algebra - it would be handled during translation from syntax to algebra.
> 
> GROUP BY (expr AS ?var)
> ==>
>  group (?var) .. aggregation pairs
>    extend (expr ?var)
>       ....
> 
> ARQ implements (2).  For SPARQL 1.1, which is the default, it enforces naming with AS; optionally, it will generate variables if needed (extended syntax, no AS).

I don't think we should /require/ AS if we add this syntax. There are situations where you want to group by an expression but don't need to assign it to a variable, e.g.:

SELECT (AVG(?time) AS ?centre) (COUNT(*) AS ?magnitude)
WHERE {
   ?x a <Impulse> ;
      <timestamp> ?time .
}
GROUP BY round(?time * 1000)

Would seem a bit strange to have to write GROUP BY (round(?time * 1000) AS ?notneeded).
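
For comparison, the aliased form of the same query would look something like the following. This is only a sketch, assuming the option 2 syntax discussed above; ?notneeded is a placeholder variable name that is bound by the GROUP BY but never projected or otherwise used:

SELECT (AVG(?time) AS ?centre) (COUNT(*) AS ?magnitude)
WHERE {
   ?x a <Impulse> ;
      <timestamp> ?time .
}
GROUP BY (round(?time * 1000) AS ?notneeded)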

- Steve

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD

Received on Wednesday, 3 November 2010 10:59:51 UTC