
Re: Aggregates

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Thu, 08 Dec 2011 09:02:56 +0000
Message-ID: <4EE07D40.2080208@epimorphics.com>
To: public-rdf-dawg@w3.org


On 07/12/11 13:57, Steve Harris wrote:
> On 2011-12-07, at 13:00, Andy Seaborne wrote:
>
>>
>>
>> On 06/12/11 22:40, Steve Harris wrote:
>>> Hi all,
>>>
>>> I've now got the aggregates in a state where I think all the information is carried through from one end of the query to the other… but I've thought that before :)
>>>
>>> I also think ORDER BY is covered.
>>>
>>> Here's a sketch of what I think should be happening:
>>>
>>> Data
>>>
>>> <a>   <p>   1 .
>>> <a>   <p>   2 .
>>> <b>   <p>   3 .
>>>
>>> Query
>>>
>>> SELECT (MAX(?o) AS ?max) (MIN(?o) AS ?min)
>>> WHERE { ?s ?p ?o }
>>> GROUP BY ?s
>>> ORDER BY AVG(?o)
>>>
>>>
>>> Ω = Sol  ?s    ?p    ?o
>>>      μ1   <a>   <p>   1
>>>      μ2   <a>   <p>   2
>>>      μ3   <b>   <p>   3
>>>
>>> G = Group((?s), Ω)
>>>    = { ((<a>), { μ1, μ2 }), ((<b>), { μ3 }) }
>>>
>>> Q = SELECT agg1 agg2
>>>      WHERE { ?s ?p ?o }
>>>      GROUP BY ?s
>>>      ORDER BY agg3
>>>
>>> E = { (?max, agg1), (?min, agg2) }
>>>
>>> A1 = Aggregation((?o), Max, {}, G)
>>> A2 = Aggregation((?o), Min, {}, G)
>>> A3 = Aggregation((?o), Avg, {}, G)
>>>
>>> J = AggregateJoin(A) =
>>>    { { (agg1, 2), (agg2, 1), (agg3, 1.5) },
>>>      { (agg1, 3), (agg2, 3), (agg3, 3) } }
>>
>>
>> This is the evaluation of AggregateJoin at execution time.
>>
>> I don't understand this step: how does it know the variables are agg1, agg2, and agg3? There could be other agg_i from other query levels. And why this order not agg3, agg2, agg1?
>
>  From the A, A has members 1, 2, and 3 in this case. A1 pairs with agg1 for e.g.
>
> If it were a lower query level it might have members 4, 5, and 6 for e.g.

Not quite: i is reset on every SELECT processed
"""
   # Note, i is global for the query, defaults to 1
   Let i := 1
"""

The comment might have that intent but, to me, "Let" introduces a new 
variable each time.

It is workable as a definition but rather unclear to me.  It only 
works because of the scoping of variables, which stops the use of agg_1 
twice from falling apart.
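
To make the scoping point concrete, here is a minimal Python sketch. It 
is not the algorithm in the doc: the translate function and the 
dict-shaped query are hypothetical, used only to contrast a counter that 
is re-initialised by each SELECT with one shared across the whole query.

```python
from itertools import count

def translate(select, counter=None):
    """Name the aggregates of one SELECT level and of its sub-SELECTs.

    `select` is a hypothetical dict: {"aggs": [...], "subs": [...]}.
    If `counter` is None, every level starts again at 1 ("Let i := 1").
    """
    local = counter if counter is not None else count(1)
    names = [f"agg{next(local)}" for _ in select["aggs"]]
    for sub in select.get("subs", []):
        names += translate(sub, counter)
    return names

# Outer SELECT with three aggregates, one sub-SELECT with one aggregate.
query = {"aggs": ["MAX", "MIN", "AVG"],
         "subs": [{"aggs": ["SUM"], "subs": []}]}

per_select = translate(query)        # counter reset at each level
shared = translate(query, count(1))  # one counter for the whole query
```

With the reset, the inner level reuses agg1, so correctness depends 
entirely on variable scoping; with the shared counter the inner 
aggregate becomes agg4.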

Also, the variable names are regenerated.  How does AggregateJoin know 
the variable is called "agg1" and not, say, "__gen1", chosen because the 
query really does use ?agg1 in the user-written part?
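
One way the translation could avoid such a clash, sketched in Python 
under my own assumptions (fresh_var and the set of used names are 
hypothetical, not anything in the doc), is to skip any generated name 
the query already mentions:

```python
def fresh_var(used, prefix="agg"):
    """Return a name with the given prefix that is not in `used`,
    and record it so later calls do not hand it out again."""
    i = 1
    while f"{prefix}{i}" in used:
        i += 1
    name = f"{prefix}{i}"
    used.add(name)
    return name

# The user-written part of the query really does mention ?agg1.
user_vars = {"s", "p", "o", "agg1"}
generated = [fresh_var(user_vars) for _ in range(3)]
```

The point is that whatever names come out are only known to the 
translation step, so anything evaluated later has to be told them 
explicitly.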


I think it would have been easier to do it all in the translation and 
not have AggregateJoin, which is really just a form of "extend" that 
assigns Ai to agg_i.

Just rewriting to

extend( ?agg1 := Aggregation((?o), Max, {}, G),
         ?agg2 := Aggregation((?o), Min, {}, G),
         ?agg3 := Aggregation((?o), Avg, {}, G) )

would keep the aggregation and the variable together.
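
As a rough illustration of that form on Steve's example data, here is a 
Python sketch. The function names and the dict-based representation are 
assumptions of mine, not spec text; each variable is simply paired with 
its aggregate function and both are applied per group.

```python
from collections import defaultdict
from statistics import mean

# Solutions from the example data, as (?s, ?p, ?o) tuples.
solutions = [("a", "p", 1), ("a", "p", 2), ("b", "p", 3)]

def group_by_s(sols):
    """Group the ?o values by ?s, like Group((?s), Ω)."""
    groups = defaultdict(list)
    for s, _p, o in sols:
        groups[s].append(o)
    return groups

def extend_with_aggregates(groups, bindings):
    """For each group, bind each variable to its aggregate's value."""
    return {key: {var: fn(values) for var, fn in bindings.items()}
            for key, values in groups.items()}

# The variable and its aggregate travel together, as in the extend form.
bindings = {"agg1": max, "agg2": min, "agg3": mean}
rows = extend_with_aggregates(group_by_s(solutions), bindings)
```

Here rows holds one solution per group, with the same values as J in 
Steve's sketch, and no step has to infer which name goes with which 
aggregate.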

If the reviewers are comfortable with the form in the doc, then I can 
live with it, but I think it works only because a human reader 
associates the separate pieces of text.

	Andy
Received on Thursday, 8 December 2011 09:03:31 GMT
