# Re: Order of evaluation for aggregates

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Sat, 26 Nov 2011 22:59:59 +0000
Message-ID: <4ED16F6F.4010607@epimorphics.com>
To: sparql Working Group <public-rdf-dawg@w3.org>
```
Steve,

I'm working through the definitions as they are in rq25 at the moment
(Nov 26).

I see no problem extending to ORDER BY : it works on ?agg_i and they are
in-scope.

** are suggestions

Q = SELECT ?x (1+count(*) as ?y) WHERE { ?x :p ?v } GROUP BY ?x
P = BGP({ ?x :p ?v })

> If Q contains GROUP BY exprlist
>    Let G := Group(exprlist, P)
> Else
>    Let G := Group((1), P)
>    End

## What about the case of no GROUP BY and no aggregate?
This catchall always groups a query
** ---------
If Q contains GROUP BY exprlist
Let G := Group(exprlist, P)
Else If Q contains an aggregate in SELECT, HAVING, ORDER BY
Let G := Group((1), P)
Else
skip the rest of the aggregate step
End
** ---------

G := Group(?x, BGP({ ?x :p ?v }))
i:=1

> For each (X AS Var) in SELECT and each HAVING(X) in Q
so
X=1+count(*) Var = ?y

> If X contains an unaggregated variable V

** s/Var/V/ in the For loop above.

> For each aggregate R(args ; scalarvals) now in X
aggregate R = count(*)
A1 := Aggregation(*, count, {}, Group(?x, BGP({ ?x :p ?v })))
> Replace R(...) with agg_1 in Q

Q = SELECT ?x (1+?agg_1 as ?y) WHERE { ?x :p ?v } GROUP BY ?x
## Did you mean Q?
** Replace R(...) with agg_1 in X
## but X never gets mentioned again.
## Text seems to have lost an "extend" or assignment to E
## This does not do anything with (?y, ?agg_1)
** Add E := E append (?y, X)

## Otherwise the connection between A1 and ?agg_1 is lost.

E := (?y, 1 + ?agg_1)

i = 2

> For each variable V appearing outside of an aggregate
V = ?x
A2 := Aggregation(?x, Sample, {}, G)
E := (?y, 1 + ?agg_1) (?x, ?agg_2)
i := 3

P := AggregateJoin(A1, A2)

## Note -- we can do ORDER BY as well because ?agg variables never go
out of scope and so continue until projection happens.  ORDER is before
projection.

"E is then used in 18.2.4.4"

----
At this point:
P = AggregateJoin(A1, A2)
A1 = Aggregation(*, count, {}, Group(?x, BGP({ ?x :p ?v })))
A2 = Aggregation(?x, Sample, {}, G)
E = (?y, 1 + ?agg_1) (?x, ?agg_2)  # My correction

## Problem: how does A1 get associated with ?agg_1

## The connection of A1 to ?agg_1 didn't get recorded anywhere.

e.g. Aggregation(?agg_1, *, count, {}, Group(?x, BGP({ ?x :p ?v })))

Example:
Does not mention agg_i
There is no E; I think it should be
E = [(?sum, ?agg_1) (?count, ?agg_2)]

------------------------------------------------------------------------

18.2.4.4 SELECT Expressions

** Needs a note that E can be set earlier.
Let E := [], a list of pairs of the form (variable, expression)

------------------------------------------------------------------------

I'll try to look at the evaluation next and also try to come up with
definitions for Group, Aggregation, AggregateJoin to go with the eval
defns. Also, I think "Aggregation" needs to know the name of the ?agg_i
variable it is going to set (see above).

One comment for now:

"Note that if eval(D(G), Ai) is an error, it is ignored."

Does "it" mean the error?
And does "ignore" mean that the (agg_i, v_i) pair is not included in the AggregateJoin?

I think it needs to say that explicitly.

Sorry about the layout - not sure how best to write comments and
suggestions while showing the working.

Andy
```
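The walk-through in the message can be sketched in code. This is a minimal illustration, not the rq25 text: the tuple encoding of expressions and the names `choose_group` and `translate` are hypothetical, and `Aggregation` carries the `?agg_i` variable as its first field, following the suggestion in the message rather than the draft definition.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Group:
    exprs: tuple      # grouping expressions, e.g. ("?x",) or (1,)
    pattern: str      # the algebra expression P, kept as an opaque string


@dataclass(frozen=True)
class Aggregation:
    var: str          # the ?agg_i this aggregation sets (suggested addition)
    args: str
    func: str
    scalarvals: tuple
    group: Group


# Hypothetical expression encoding:
#   ("agg", func, args) | ("var", name) | ("const", value) | ("+", left, right)

def choose_group(group_by, has_aggregate, pattern):
    """Suggested step: group only if there is a GROUP BY or an aggregate."""
    if group_by:
        return Group(tuple(group_by), pattern)
    if has_aggregate:
        return Group((1,), pattern)
    return None  # skip the rest of the aggregate step


def translate(select_items, group):
    """select_items is a list of (expr, var_or_None); returns (aggregations, E)."""
    aggs, E = [], []
    fresh = count(1)

    def new_agg(func, args):
        name = f"?agg_{next(fresh)}"
        aggs.append(Aggregation(name, args, func, (), group))
        return ("var", name)

    def rewrite(expr):
        # Replace each aggregate R(...) in X with a fresh ?agg_i.
        if expr[0] == "agg":
            return new_agg(expr[1], expr[2])
        if expr[0] == "+":
            return ("+", rewrite(expr[1]), rewrite(expr[2]))
        return expr

    # For each (X AS Var): rewrite X and record E := E append (Var, X).
    for expr, var in select_items:
        if var is not None:
            E.append((var, rewrite(expr)))
    # For each variable V outside an aggregate: Sample it.
    for expr, var in select_items:
        if var is None and expr[0] == "var":
            E.append((expr[1], new_agg("Sample", expr[1])))
    return aggs, E


# Q = SELECT ?x (1+count(*) as ?y) WHERE { ?x :p ?v } GROUP BY ?x
G = choose_group(["?x"], True, "BGP({ ?x :p ?v })")
aggs, E = translate(
    [(("var", "?x"), None),
     (("+", ("const", 1), ("agg", "count", "*")), "?y")],
    G,
)
```

With this encoding each `Aggregation` records which `?agg_i` it sets, so the link between A1 and `?agg_1` survives into `E`.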
Received on Saturday, 26 November 2011 23:00:33 UTC
