Re: Order of evaluation for aggregates

Just a quick note: I'm (technically) on holiday this week and will, hopefully, get back to this next week.

- Steve

On 26 Nov 2011, at 22:59, Andy Seaborne wrote:

> Steve,
> 
> I'm working through the definitions as they are in rq25 at the moment (Nov 26).
> 
> I see no problem extending to ORDER BY : it works on ?agg_i and they are in-scope.
> 
> ## are comments
> ** are suggestions
> 
> 
> Q = SELECT ?x (1+count(*) as ?y) WHERE { ?x :p ?v } GROUP BY ?x
> P = BGP({ ?x :p ?v })
> 
> > If Q contains GROUP BY exprlist
> >    Let G := Group(exprlist, P)
> > Else
> >    Let G := Group((1), P)
> >    End
> 
> ## What about the case of no GROUP BY and no aggregate?
>   This catchall always groups a query
> ** ---------
> If Q contains GROUP BY exprlist
>   Let G := Group(exprlist, P)
> Else If Q contains an aggregate in SELECT, HAVING, ORDER BY
>   Let G := Group((1), P)
> Else
>   skip the rest of the aggregate step
>   End
> ** ---------
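
A minimal sketch of the suggested three-way rule, for illustration only. The `Query` class, its fields, and the tuple encoding of Group(...) are assumptions made up for this example; they are not part of rq25.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    # Hypothetical, simplified stand-in for a parsed query.
    group_by: Optional[list] = None  # GROUP BY expression list, or None if absent
    has_aggregate: bool = False      # aggregate present in SELECT, HAVING or ORDER BY


def group_step(q, pattern):
    """Return a Group(...) tuple, or None when the aggregate step is skipped."""
    if q.group_by is not None:
        # Explicit GROUP BY: group on the given expression list.
        return ("Group", tuple(q.group_by), pattern)
    if q.has_aggregate:
        # No GROUP BY but an aggregate is used: one implicit group.
        return ("Group", (1,), pattern)
    # Neither: the query is not grouped at all.
    return None
```

With this reading, a plain SELECT with no GROUP BY and no aggregate yields `None`, so it is never wrapped in Group((1), P).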
> 
> 
> G := Group(?x, BGP({ ?x :p ?v }))
> i:=1
> 
> > For each (X AS Var) in SELECT and each HAVING(X) in Q
> so
>  X=1+count(*) Var = ?y
> 
> > If X contains an unaggregated variable V
> 
> ** s/Var/V/ in the For loop above.
> 
> > For each aggregate R(args ; scalarvals) now in X
> aggregate R = count(*)
> A1 := Aggregation(*, count, {}, Group(?x, BGP({ ?x :p ?v })))
> > Replace R(...) with agg_1 in Q
> 
> Q = SELECT ?x (1+?agg_1 as ?y) WHERE { ?x :p ?v } GROUP BY ?x
> ## Did you mean Q?
> ** Replace R(...) with agg_1 in X
> ## but X never gets mentioned again.
> ## Text seems to have lost an "extend" or assignment to E
> ## This does not do anything with (?y, ?agg_1)
> ** Add E := E append (?y, X)
> 
> ## Otherwise the connection between A1 and ?agg_1 is lost.
> 
> E := (?y, 1 + ?agg_1)
> 
> i = 2
> 
> > For each variable V appearing outside of an aggregate
> V = ?x
> A2 := Aggregation(?x, Sample, {}, G)
> E := (?y, 1 + ?agg_1) (?x, ?agg_2)
> i := 3
> 
> P := AggregateJoin(A1, A2)
> 
> ## Note -- we can do ORDER BY as well because the ?agg variables never go out of scope, and so remain available until projection happens.  ORDER BY is applied before projection.
> 
> "E is then used in 18.2.4.4"
> ** Link needed.
> 
> ----
> At this point:
> P = AggregateJoin(A1, A2)
> A1 = Aggregation(*, count, {}, Group(?x, BGP({ ?x :p ?v })))
> A2 = Aggregation(?x, Sample, {}, G)
> E = (?y, 1 + ?agg_1) (?x, ?agg_2)  # My correction
> 
> ## Problem: how does A1 get associated with ?agg_1?
> 
> ## The connection of A1 to ?agg_1 didn't get recorded anywhere.
> ** add ?agg_i to "Aggregation"
> 
> e.g. Aggregation(?agg_1, *, count, {}, Group(?x, BGP({ ?x :p ?v })))
> 
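
A rough sketch of the walkthrough above with the ?agg_i name recorded inside each Aggregation, as suggested. Everything here is an assumption for illustration: expressions are nested tuples, an aggregate call is written `("agg", func, arg)`, and only bare projected variables (not variables nested inside larger expressions) are turned into Sample.

```python
from itertools import count as counter


def translate(select_items, group):
    """select_items: list of (var, expr). Returns (aggregations, E)."""
    aggs, E = [], []
    ids = counter(1)  # supplies i = 1, 2, 3, ...

    def replace(expr):
        # Swap each aggregate call for a fresh ?agg_i and record the link.
        if isinstance(expr, tuple) and expr and expr[0] == "agg":
            _, func, arg = expr
            name = f"?agg_{next(ids)}"
            aggs.append(("Aggregation", name, arg, func, {}, group))
            return name
        if isinstance(expr, tuple):
            return tuple(replace(e) for e in expr)
        return expr

    # First the (X AS Var) items: E := E append (Var, X with aggregates replaced).
    for var, expr in select_items:
        if not isinstance(expr, str):
            E.append((var, replace(expr)))

    # Then bare variables appearing outside an aggregate: wrap them in Sample.
    for var, expr in select_items:
        if isinstance(expr, str):
            name = f"?agg_{next(ids)}"
            aggs.append(("Aggregation", name, expr, "Sample", {}, group))
            E.append((var, name))

    return aggs, E


G = ("Group", ("?x",), "BGP({ ?x :p ?v })")
select = [("?x", "?x"), ("?y", ("+", 1, ("agg", "count", "*")))]
aggs, E = translate(select, G)
# E is [("?y", ("+", 1, "?agg_1")), ("?x", "?agg_2")]
```

Because each Aggregation tuple carries its own ?agg_i, the association between A1 and ?agg_1 no longer depends on list position.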
> Example:
> Does not mention agg_i
> There is no E; I think it should be
>   E = [(?sum, ?agg_1) (?count, ?agg_2)]
> 
> ------------------------------------------------------------------------
> 
> 18.2.4.4 SELECT Expressions
> 
> ** Needs a note that E may already have been set earlier (in the aggregate step).
> Let E := [], a list of pairs of the form (variable, expression)
> 
> ------------------------------------------------------------------------
> 
> I'll try to look at the evaluation next and also try to come up with definitions for Group, Aggregation, AggregateJoin to go with the eval defns. Also, I think "Aggregation" needs to know the name of the ?agg_i variable it is going to set (see above).
> 
> One comment for now:
> 
> "Note that if eval(D(G), Ai) is an error, it is ignored."
> 
> Does "it" mean the error?
> And does "ignore" mean the (agg_i, v_i) pair is not included in the AggregateJoin?
> 
> I think it needs to say that explicitly.
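
One possible reading of "ignored", sketched under assumptions: if evaluating an aggregate raises an error, its (agg_i, v_i) pair is simply left out of the resulting solution. The function name, the row representation, and the use of Python exceptions to stand in for evaluation errors are all hypothetical.

```python
def aggregate_join_row(aggregations, group_rows):
    """aggregations: list of (agg_var, func). Returns one solution mapping."""
    row = {}
    for agg_var, func in aggregations:
        try:
            row[agg_var] = func(group_rows)
        except Exception:
            # Error during evaluation: omit the pair, leaving agg_var unbound.
            continue
    return row


rows = [{"?v": 1}, {"?v": "a"}]
result = aggregate_join_row(
    [
        ("?agg_1", len),                                   # count(*)
        ("?agg_2", lambda rs: sum(r["?v"] for r in rs)),   # sum(?v), fails on "a"
    ],
    rows,
)
# result == {"?agg_1": 2}; ?agg_2 is unbound because its evaluation failed
```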
> 
> 
> Sorry about the layout - not sure how best to write comments and suggestions while showing the working.
> 
> 	Andy
> 

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD

Received on Wednesday, 30 November 2011 11:31:25 UTC