
Re: Aggregates

From: Steve Harris <steve.harris@garlik.com>
Date: Wed, 7 Dec 2011 13:57:41 +0000
Cc: public-rdf-dawg@w3.org
Message-Id: <CB20C57E-56DE-4B70-BFF1-242F8B3FE9B3@garlik.com>
To: Andy Seaborne <andy.seaborne@epimorphics.com>

On 2011-12-07, at 13:00, Andy Seaborne wrote:

> 
> 
> On 06/12/11 22:40, Steve Harris wrote:
>> Hi all,
>> 
>> I've now got the aggregates in a state where I think all the information is carried through from one end of the query to the other… but I've thought that before :)
>> 
>> I also think ORDER BY is covered.
>> 
>> Here's a sketch of what I think should be happening:
>> 
>> Data
>> 
>> <a>  <p>  1 .
>> <a>  <p>  2 .
>> <b>  <p>  3 .
>> 
>> Query
>> 
>> SELECT (MAX(?o) AS ?max) (MIN(?o) AS ?min)
>> WHERE { ?s ?p ?o }
>> GROUP BY ?s
>> ORDER BY AVG(?o)
>> 
>> 
>> Ω = Sol  ?s   ?p   ?o
>>     μ1   <a>  <p>  1
>>     μ2   <a>  <p>  2
>>     μ3   <b>  <p>  3
>> 
>> G = Group((?s), Ω)
>>   = { ((<a>), { μ1, μ2 }), ((<b>), { μ3 }) }
>> 
>> Q = SELECT agg1 agg2
>>     WHERE { ?s ?p ?o }
>>     GROUP BY ?s
>>     ORDER BY agg3
>> 
>> E = { (?max, agg1), (?min, agg2) }
>> 
>> A1 = Aggregation((?o), Max, {}, G)
>> A2 = Aggregation((?o), Min, {}, G)
>> A3 = Aggregation((?o), Avg, {}, G)
>> 
>> J = AggregateJoin(A) =
>>   { { (agg1, 2), (agg2, 1), (agg3, 1.5) },
>>     { (agg1, 3), (agg2, 3), (agg3, 3) } }
> 
> 
> This is the evaluation of AggregateJoin at execution time.
> 
> I don't understand this step: how does it know the variables are agg1, agg2, and agg3? There could be other agg_i from other query levels. And why this order, and not agg3, agg2, agg1?

From A: A has members 1, 2, and 3 in this case, so A1 pairs with agg1, for example.

At a lower query level it might have members 4, 5, and 6, for example.

Perhaps there's some clearer notation or text I should use to describe the kind of thing A is? It's a sequence, I believe.

I looked into doing what you suggested, passing a quoted variable into Aggregation, but it makes the definition significantly harder to read, and it looks like repetition to me.

I don't think the order matters: the result is a multiset. I just wrote it in that order for legibility.

> eval(D(G), AggregateJoin(A)) =
>   { (agg1, v1), ..., (aggn, vn) | vi such that ( k, vi ) in eval(D(G), Ai) for some k and each 1 <= i <= n }

Right.
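
For illustration, here is a minimal Python sketch of that definition applied to the example data. It is not the spec text: the function names (group, aggregation, aggregate_join), the dict-based representation of solutions, and the use of list position in A to pick the agg_i name are all assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean

# Ω from the example: one dict per solution mapping (representation assumed).
omega = [
    {"s": "a", "p": "p", "o": 1},
    {"s": "a", "p": "p", "o": 2},
    {"s": "b", "p": "p", "o": 3},
]

def group(keys, solutions):
    """Group(exprlist, Ω): map each key tuple to the solutions that produce it."""
    groups = defaultdict(list)
    for mu in solutions:
        groups[tuple(mu[k] for k in keys)].append(mu)
    return dict(groups)

def aggregation(var, func, groups):
    """Aggregation((var), func, {}, G): one (key, value) pair per group."""
    return {key: func([mu[var] for mu in mus]) for key, mus in groups.items()}

def aggregate_join(aggs):
    """AggregateJoin(A): for each key k, bind agg_i to the value A_i gives for k.

    The name agg_i is taken from the 1-based position of A_i in the sequence A.
    """
    keys = list(aggs[0].keys()) if aggs else []
    return [
        {f"agg{i}": a[k] for i, a in enumerate(aggs, start=1)}
        for k in keys
    ]

G = group(("s",), omega)
A = [
    aggregation("o", max, G),   # A1
    aggregation("o", min, G),   # A2
    aggregation("o", mean, G),  # A3
]
J = aggregate_join(A)
```

Because the agg_i names come from the position in A, reordering A would rename the variables but would not change the multiset of solutions in any way that matters once Extend maps them to ?max and ?min.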

- Steve

>> Result = OrderBy(Extend(Extend(J, ?min, agg2), ?max, agg1), agg3)
>> 
>> 
>> I'm now going to look at the @@s from earlier today.
>> 
>> - Steve
>> 
> 
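
The quoted Result expression can be sketched in Python as follows, starting from the J given in the quoted message. The helper names extend and order_by are hypothetical, and both are restricted to the case where the expression is a single agg_i lookup, which is all the example needs.

```python
# J as given in the quoted sketch: one solution per group, keyed by agg_i.
J = [
    {"agg1": 2, "agg2": 1, "agg3": 1.5},
    {"agg1": 3, "agg2": 3, "agg3": 3},
]

def extend(solutions, var, source):
    """Extend(Ω, var, expr), restricted to expr being a single agg_i lookup."""
    return [{**mu, var: mu[source]} for mu in solutions]

def order_by(solutions, source):
    """OrderBy(Ω, expr), ascending, restricted to a single agg_i lookup."""
    return sorted(solutions, key=lambda mu: mu[source])

# Result = OrderBy(Extend(Extend(J, ?min, agg2), ?max, agg1), agg3)
result = order_by(extend(extend(J, "?min", "agg2"), "?max", "agg1"), "agg3")

# Projection to the SELECT variables is not part of the quoted expression;
# it is added here only to make the final rows easy to read.
projected = [{"?max": mu["?max"], "?min": mu["?min"]} for mu in result]
```

The agg3 binding survives through both Extend steps, which is what allows ORDER BY to use an aggregate that is not in the SELECT list.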

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Wednesday, 7 December 2011 13:58:10 GMT
