# Re: Order of evaluation for aggregates

From: Steve Harris <steve.harris@garlik.com>
Date: Wed, 30 Nov 2011 11:30:44 +0000
Cc: sparql Working Group <public-rdf-dawg@w3.org>
Message-Id: <BFF642A9-E391-4A4A-B53E-73F3D646A5FF@garlik.com>
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Just a quick note, I'm (technically) on holiday this week, and will get back into this next week, hopefully.

- Steve

On 26 Nov 2011, at 22:59, Andy Seaborne wrote:

> Steve,
>
> I'm working through the definitions as they are in rq25 at the moment (Nov 26).
>
> I see no problem extending to ORDER BY : it works on ?agg_i and they are in-scope.
>
> ** are suggestions
>
>
> Q = SELECT ?x (1+count(*) as ?y) WHERE { ?x :p ?v } GROUP BY ?x
> P = BGP({ ?x :p ?v })
>
> > If Q contains GROUP BY exprlist
> >    Let G := Group(exprlist, P)
> > Else
> >    Let G := Group((1), P)
> >    End
>
> ## What about the case of no GROUP BY and no aggregate?
>   This catchall always groups a query
> ** ---------
> If Q contains GROUP BY exprlist
>   Let G := Group(exprlist, P)
> Else If Q contains an aggregate in SELECT, HAVING, ORDER BY
>   Let G := Group((1), P)
> Else
>   skip the rest of the aggregate step
>   End
> ** ---------
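
The three-way choice suggested above can be sketched in Python. This is only an illustration under stated assumptions: `Group` and `choose_group` are hypothetical names, the pattern is kept as plain text, and the caller is assumed to already know whether an aggregate appears in SELECT, HAVING or ORDER BY.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Group:
    """Hypothetical stand-in for the algebra's Group(exprlist, P)."""
    exprs: Tuple[str, ...]  # grouping expressions
    pattern: str            # the pattern P, kept as text in this sketch


def choose_group(pattern: str,
                 group_by: Optional[Sequence[str]] = None,
                 has_aggregate: bool = False) -> Optional[Group]:
    """Return the Group to use, or None when the aggregate step is skipped."""
    if group_by:
        # Explicit GROUP BY exprlist
        return Group(tuple(group_by), pattern)
    if has_aggregate:
        # Aggregate in SELECT, HAVING or ORDER BY but no GROUP BY:
        # a single implicit group keyed on the constant 1
        return Group(("1",), pattern)
    # No GROUP BY and no aggregate: nothing to group
    return None
```

With this shape, a query that has neither GROUP BY nor an aggregate yields `None`, so it is never wrapped in a Group.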
>
>
> G := Group(?x, BGP({ ?x :p ?v }))
> i:=1
>
> > For each (X AS Var) in SELECT and each HAVING(X) in Q
> so
>  X=1+count(*) Var = ?y
>
> > If X contains an unaggregated variable V
>
> ** s/Var/V/ in the For loop above.
>
> > For each aggregate R(args ; scalarvals) now in X
> aggregate R = count(*)
> A1 := Aggregation(*, count, {}, Group(?x, BGP({ ?x :p ?v })))
> > Replace R(...) with agg_1 in Q
>
> Q = SELECT ?x (1+?agg_1 as ?y) WHERE { ?x :p ?v } GROUP BY ?x
> ## Did you mean Q?
> ** Replace R(...) with agg_1 in X
> ## but X never gets mentioned again.
> ## Text seems to have lost an "extend" or assignment to E
> ## This does not do anything with (?y, ?agg_1)
> ** Add E := E append (?y, X)
>
> ## Otherwise the connection between A1 and ?agg_1 is lost.
>
> E := (?y, 1 + ?agg_1)
>
> i := 2
>
> > For each variable V appearing outside of an aggregate
> V = ?x
> A2 := Aggregation(?x, Sample, {}, G)
> E := (?y, 1 + ?agg_1) (?x, ?agg_2)
> i := 3
>
> P := AggregateJoin(A1, A2)
>
> ## Note -- we can do ORDER BY as well because ?agg variables never go out of scope and so continue until projection happens.  ORDER is before projection.
>
> "E is then used in 18.2.4.4"
>
> ----
> At this point:
> P = AggregateJoin(A1, A2)
> A1 = Aggregation(*, count, {}, Group(?x, BGP({ ?x :p ?v })))
> A2 = Aggregation(?x, Sample, {}, G)
> E = (?y, 1 + ?agg_1) (?x, ?agg_2)  # My correction
>
> ## Problem: how does A1 get associated with ?agg_1
>
> ## The connection of A1 to ?agg_1 didn't get recorded anywhere.
> ** add ?agg_i to "Aggregation"
>
> e.g. Aggregation(?agg_1, *, count, {}, Group(?x, BGP({ ?x :p ?v })))
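
A small Python sketch of the replacement step with the variable name recorded in the Aggregation, as suggested above. Everything here is an assumption made for illustration: the `Aggregation` dataclass, the `replace_aggregates` helper, the regex (which only handles non-nested calls to a few aggregate names), and the use of an empty tuple for the empty scalarvals.

```python
import re
from dataclasses import dataclass

# Matches simple, non-nested aggregate calls such as count(*) or sum(?v).
AGG_CALL = re.compile(r"\b(count|sum|min|max|avg|sample)\(([^()]*)\)",
                      re.IGNORECASE)


@dataclass(frozen=True)
class Aggregation:
    var: str           # the ?agg_i variable this aggregation sets
    args: str          # argument text, e.g. "*"
    func: str          # aggregate name, lower-cased
    scalarvals: tuple  # empty tuple stands in for {}
    group: str         # the Group(...) it runs over, kept as text


def replace_aggregates(expr, group, start=1):
    """Swap each aggregate call in expr for a fresh ?agg_i and record it."""
    aggs = []

    def swap(match):
        var = f"?agg_{start + len(aggs)}"
        aggs.append(Aggregation(var, match.group(2).strip(),
                                match.group(1).lower(), (), group))
        return var

    return AGG_CALL.sub(swap, expr), aggs


G = "Group(?x, BGP({ ?x :p ?v }))"
X, aggs = replace_aggregates("1+count(*)", G)
E = [("?y", X)]  # the (variable, expression) pair for the later extend
```

Because each `Aggregation` carries its own `var`, the link between A1 and `?agg_1` no longer depends on the order in which the aggregations were created.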
>
> Example:
> Does not mention agg_i
> There is no E; I think it should be
>   E = [(?sum, ?agg_1) (?count, ?agg_2)]
>
> ------------------------------------------------------------------------
>
> 18.2.4.4 SELECT Expressions
>
> ** Needs noting it can be set earlier.
> Let E := [], a list of pairs of the form (variable, expression)
>
> ------------------------------------------------------------------------
>
> I'll try to look at the evaluation next and also try to come up with definitions for Group, Aggregation, AggregateJoin to go with the eval defns. Also, I think "Aggregation" needs to know the name of the ?agg_i variable it is going to set (see above).
>
> One comment for now:
>
> "Note that if eval(D(G), Ai) is an error, it is ignored."
>
> Does "it" mean the error?
> And does "ignore" mean the (agg_i, v_i) pair is not included in the AggregateJoin?
>
> I think it needs to say that explicitly.
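
One possible reading of "ignored", sketched in Python: when evaluating an aggregation over a group raises an error, the (agg_i, v_i) pair is simply left out of that group's row. This is an assumption about the intended meaning, not the text of the draft; `aggregate_join` and the toy data are hypothetical, and Python's `len` and `sum` stand in for real aggregate evaluation.

```python
def aggregate_join(groups, aggregations):
    """groups: key -> list of solutions; aggregations: list of (var, fn)."""
    rows = []
    for key, solutions in groups.items():
        row = {}
        for var, fn in aggregations:
            try:
                row[var] = fn(solutions)
            except Exception:
                # The error is ignored: this (agg_i, v_i) pair is omitted,
                # but the rest of the row for the group is still produced.
                continue
        rows.append(row)
    return rows


groups = {
    "a": [{"?v": 1}, {"?v": 2}],
    "b": [{"?v": "oops"}, {"?v": 3}],  # summing this group raises TypeError
}
aggregations = [
    ("?agg_1", len),
    ("?agg_2", lambda sols: sum(s["?v"] for s in sols)),
]
rows = aggregate_join(groups, aggregations)
```

Under this reading, group "b" still contributes a row with `?agg_1` bound and `?agg_2` unbound, rather than the whole group being dropped.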
>
>
> Sorry about the layout - not sure how best to write comments and suggestions while showing the working.
>
> 	Andy
>

--
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11