Re: Aggregates

I fully support Andy's point; I already raised this for LC1. One can
live with it, but there is a lot of vagueness and we rely on
implementors to fix the issues. For example, as Andy also points out,
you cannot simply use ?agg1 as the generated variable if the user's
query already uses ?agg1. It is fairly obvious that we expect
implementors to take care of this, but it is not nice.
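
To make the clash concrete, here is a minimal sketch in Python of what an
implementor ends up doing. It is only an illustration: fresh_vars is a
hypothetical helper of mine, not something the document defines, and the
"agg" prefix is simply taken from the example in the thread.

```python
import itertools


def fresh_vars(user_vars, count, prefix="agg"):
    """Return `count` generated names that do not clash with user variables.

    Hypothetical helper for illustration only; the spec text does not
    define any such function.
    """
    used = set(user_vars)
    fresh = []
    for i in itertools.count(1):
        if len(fresh) == count:
            break
        name = f"{prefix}{i}"
        if name not in used:  # skip names the user already wrote
            fresh.append(name)
            used.add(name)
    return fresh


# The user's query already mentions ?agg1, so the generator must skip it.
print(fresh_vars({"s", "p", "o", "agg1"}, 3))
```

With a query that does not use any ?agg variable, the helper simply yields
agg1, agg2, agg3, which is the case the current definitions silently assume.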

Similarly, I believe that starting the counter from 1 for each
(sub-)query works only because of the scoping of variables and the
projection rules. The definitions depend on these rules to be correct,
however, which makes them quite fragile.
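
A rough Python sketch of why the restart is harmless, under my own
assumptions: a sub-query computing MAX and an outer query computing SUM
over its result (neither is from the thread), with project as a
hypothetical stand-in for the projection step. The inner agg1 is dropped
by projection before the outer level reuses the name.

```python
def project(solutions, expressions):
    """Keep only the user-visible variables and drop the internal names.

    `expressions` maps a user variable to the internal name it was
    computed under, like E in the quoted example. Hypothetical helper.
    """
    return [
        {var: mu[internal] for var, internal in expressions.items()}
        for mu in solutions
    ]


# Inner SELECT: the counter starts at 1, so its aggregate is bound to agg1.
inner = project([{"agg1": 2}, {"agg1": 3}], {"max": "agg1"})

# Outer SELECT: the counter starts at 1 again and reuses the name agg1.
# This is safe only because the inner agg1 did not survive projection.
outer_internal = [{"agg1": sum(mu["max"] for mu in inner)}]
outer = project(outer_internal, {"total": "agg1"})

print(inner)
print(outer)
```

If the projection step ever let an internal name leak out of the
sub-query, the two uses of agg1 would collide, which is the fragility I
mean.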

Birte

On 8 December 2011 10:02, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:
>
>
> On 07/12/11 13:57, Steve Harris wrote:
>>
>> On 2011-12-07, at 13:00, Andy Seaborne wrote:
>>
>>>
>>>
>>> On 06/12/11 22:40, Steve Harris wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I've now got the aggregates in a state where I think all the information
>>>> is carried through from one end of the query to the other… but I've thought
>>>> that before :)
>>>>
>>>> I also think ORDER BY is covered.
>>>>
>>>> Here's a sketch of what I think should be happening:
>>>>
>>>> Data
>>>>
>>>> <a>   <p>   1 .
>>>> <a>   <p>   2 .
>>>> <b>   <p>   3 .
>>>>
>>>> Query
>>>>
>>>> SELECT (MAX(?o) AS ?max) (MIN(?o) AS ?min)
>>>> WHERE { ?s ?p ?o }
>>>> GROUP BY ?s
>>>> ORDER BY AVG(?o)
>>>>
>>>>
>>>> Ω = Sol  ?s    ?p    ?o
>>>>     μ1   <a>   <p>   1
>>>>     μ2   <a>   <p>   2
>>>>     μ3   <b>   <p>   3
>>>>
>>>> G = Group((?s), Ω)
>>>>   = { ((<a>), { μ1, μ2 }), ((<b>), { μ3 }) }
>>>>
>>>> Q = SELECT agg1 agg2
>>>>     WHERE { ?s ?p ?o }
>>>>     GROUP BY ?s
>>>>     ORDER BY agg3
>>>>
>>>> E = { (?max, agg1), (?min, agg2) }
>>>>
>>>> A1 = Aggregation((?o), Max, {}, G)
>>>> A2 = Aggregation((?o), Min, {}, G)
>>>> A3 = Aggregation((?o), Avg, {}, G)
>>>>
>>>> J = AggregateJoin(A) =
>>>>   { { (agg1, 2), (agg2, 1), (agg3, 1.5) },
>>>>     { (agg1, 3), (agg2, 3), (agg3, 3) } }
>>>
>>>
>>>
>>> This is the evaluation of AggregateJoin at execution time.
>>>
>>> I don't understand this step: how does it know the variables are agg1,
>>> agg2, and agg3? There could be other agg_i from other query levels. And why
>>> this order not agg3, agg2, agg1?
>>
>>
>>  From the A, A has members 1, 2, and 3 in this case. A1 pairs with agg1
>> for e.g.
>>
>> If it were a lower query level it might have members 4, 5, and 6 for e.g.
>
>
> Not quite: i is reset on every SELECT processed
> """
>  # Note, i is global for the query, defaults to 1
>  Let i := 1
> """
>
> The comment might have that intent, but, to me, "Let" introduces a variable
> each time.
>
> It is workable as a definition but rather unclear to me.  The fact it works
> relies on scoping features of variables so that the use of agg_1 twice does
> not fall apart.
>
> Also, the variable names are regenerated.  How does AggregateJoin know the
> variable is called "agg1" not "__gen1" because the query really does use
> ?agg1 in the user-written part?
>
>
> I think it would have been easier to do it all in the translation and not
> have AggregateJoin which is really just a form of "extend" assigning Ai to
> agg_i.
>
> Just rewriting to
>
> extend( ?agg1 := Aggregation((?o), Max, {}, G),
>        ?agg2 := Aggregation((?o), Min, {}, G),
>        ?agg3 := Aggregation((?o), Avg, {}, G) )
>
> would keep the aggregation and the variable together.
>
> If the reviewers are comfortable with the form in the doc, then I can live
> with it but I think it works by relying on human reading and associating
> text.
>
>        Andy
>
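
For reference, the quoted example can be replayed in a few lines of
Python. This is only a sketch under simplifying assumptions: solutions
are plain dicts, the empty scalar-values argument of Aggregation is
omitted, and group, aggregation and extend are illustrative names that
loosely mirror the operators in the thread, not the definitions in the
document. It follows Andy's suggestion of keeping each generated
variable next to its aggregation.

```python
from collections import defaultdict

# Solutions of { ?s ?p ?o } over the three example triples.
omega = [
    {"s": "<a>", "p": "<p>", "o": 1},
    {"s": "<a>", "p": "<p>", "o": 2},
    {"s": "<b>", "p": "<p>", "o": 3},
]


def group(keys, solutions):
    """Partition the solutions by the values of the grouping variables."""
    groups = defaultdict(list)
    for mu in solutions:
        groups[tuple(mu[k] for k in keys)].append(mu)
    return dict(groups)


def aggregation(var, func, groups):
    """Apply a set function to the values of `var` within each group."""
    return {
        key: func([mu[var] for mu in members])
        for key, members in groups.items()
    }


def avg(values):
    return sum(values) / len(values)


def extend(groups, **assignments):
    """Bind each name to its own aggregation result, one solution per group."""
    return [
        {name: agg[key] for name, agg in assignments.items()}
        for key in groups
    ]


G = group(("s",), omega)
J = extend(
    G,
    agg1=aggregation("o", max, G),
    agg2=aggregation("o", min, G),
    agg3=aggregation("o", avg, G),
)
print(J)
```

Because the names are passed together with the aggregations, nothing has
to guess afterwards which A_i belongs to which agg_i, or in which order.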



-- 
Jun. Prof. Dr. Birte Glimm            Tel.:    +49 731 50 24125
Inst. of Artificial Intelligence         Secr:  +49 731 50 24258
University of Ulm                         Fax:   +49 731 50 24188
D-89069 Ulm                               birte.glimm@uni-ulm.de
Germany

Received on Thursday, 8 December 2011 10:41:31 UTC