- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Mon, 28 Nov 2011 12:20:34 +0000
- To: birte.glimm@uni-ulm.de
- CC: Steve Harris <steve.harris@garlik.com>, sparql Working Group <public-rdf-dawg@w3.org>
On 28/11/11 10:28, Birte Glimm wrote:
> On 26 November 2011 23:59, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:
>> Steve,
>>
>> I'm working through the definitions as they are in rq25 at the moment (Nov
>> 26).
>>
>> I see no problem extending to ORDER BY : it works on ?agg_i and they are
>> in-scope.
>
> Agreed.
>
>> ## are comments
>> ** are suggestions
>>
>> Q = SELECT ?x (1+count(*) as ?y) WHERE { ?x :p ?v } GROUP BY ?x
>> P = BGP({ ?x :p ?v })
>>
>>> If Q contains GROUP BY exprlist
>>>   Let G := Group(exprlist, P)
>>> Else
>>>   Let G := Group((1), P)
>>> End
>>
>> ## What about the case of no GROUP BY and no aggregate?
>> This catchall always groups a query
>> ** ---------
>> If Q contains GROUP BY exprlist
>>   Let G := Group(exprlist, P)
>> Else If Q contains an aggregate in SELECT, HAVING, ORDER BY
>>   Let G := Group((1), P)
>> Else
>>   skip the rest of the aggregate step
>> End
>> ** ---------
>
> Good point.
>
>> G := Group(?x, BGP({ ?x :p ?v }))
>> i:=1
>>
>>> For each (X AS Var) in SELECT and each HAVING(X) in Q
>> so
>> X=1+count(*) Var = ?y
>>
>>> If X contains an unaggregated variable V
>>
>> ** s/Var/V/ in the For loop above.
>
> No. We are looking at (X AS Var), but we now want to know whether X
> contains a variable that is not aggregated. Obviously, variables in X
> are different from Var. This is meant to handle things like (1+?x AS
> ?y), where ?x is grouped, but not aggregated. This should become
> (1+SAMPLE(?x) AS ?y).

OK - I understand now.

>
>>> For each aggregate R(args ; scalarvals) now in X
>> aggregate R = count(*)
>> A1 := Aggregation(*, count, {}, Group(?x, BGP({ ?x :p ?v })))
>>> Replace R(...) with agg_1 in Q
>>
>> Q = SELECT ?x (1+?agg_1 as ?y) WHERE { ?x :p ?v } GROUP BY ?x
>> ## Did you mean Q?
>
> I think yes. The modified Q is then used later, where we no longer
> want to see aggregates, but instead ?aggi. The select expressions are
> later turned into extends with ?aggi variables.
>
>> ** Replace R(...)
>> with agg_1 in X
> In this case, one would assume that changes to X are propagated
> through to Q. I think replacing in Q is clearer.
>
>> ## but X never gets mentioned again.
>> ## Text seems to have lost an "extend" or assignment to E
> ? The normal select expressions are handled later in 18.2.4.4. The
> extends that we construct here are just to avoid errors because of
> invalid select expressions. For example SELECT ?x { ... } GROUP BY ?x
> should become something like SELECT (SAMPLE(?x) AS ?x) { ... } GROUP
> BY ?x, but that wouldn't be valid. Maybe, now that we have
> unaggregated variables that do not occur within an assignment in a
> separate for loop, we could actually get away with rewriting Q again
> here, e.g., into SELECT (?aggi AS ?x) ...
>
>> ## This does not do anything with (?y, ?agg_1)
>> ** Add E := E append (?y, X)
> This happens in 18.2.4.4.

This is another way of dealing with the lost ?agg_i, as an alternative
to including the name in the aggregation(). It wasn't clear in my
email.

We add what is effectively (Ai AS ?agg_i), and doing it here means it
happens before the 18.2.4.4 processing, effectively placing it at the
start of the SELECT clause and hence in scope for all later
expressions.

Putting the variable name in the aggregation() is an alternative
approach.

Looking over this, I mildly prefer setting E to (Ai AS ?agg_i) rather
than passing the variable name to Aggregation(), because that means
passing quoted variables around.

>> ## Otherwise the connection between A1 and ?agg_1 is lost.
>
> This is only clear from 18.5 Evaluation Semantics, where the aggi are
> assigned in the evaluation of AggregateJoin. Section "18.4.1 Aggregate
> Algebra" still misses definitions for Aggregation and AggregateJoin.

OK - I now see how the agg_i are recreated in the eval of AggregateJoin
but it does not quite work.

The knowledge of variable choice can't be encoded in the "for i 1 to
n". A query with multiple grouped aggregates fails because the counter
starts from 1 each time.
Several "agg_1" are possible, and this is not covered by "aggi is a
temporary variable" because this evaluation definition depends on the
name format. The decision made back in translation got lost.

In the translation, we need a global counter (one per query
translation) in "step: Aggregation", not a per-group occurrence.
Associate the allocated variable with the aggregation.

Putting it all in one place:

1/ Allow aggregates in ORDER BY
2/ Skip if no grouping/aggregation
3/ Make the allocation of i for agg_i global
4/ Ensure that evaluation of AggregateJoin gets the right agg_i
   3a/ Add (?agg_i, Ai) to E
   3b/ Add ?agg_i into Aggregation() to carry through to the eval step.
5/ Defns of Aggregation, AggregateJoin and Group.
6/ Link (E is then used in 18.2.4.4)
7/ 18.2.4.4 -- setting of E
8/ Editorial: "Note that if eval(D(G), Ai) is an error, it is ignored."
   and check the evals are right.

	Andy
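
The counter problem behind points 3/ and 3a/ can be illustrated with a
rough sketch. This is not the spec's algorithm: the function name
allocate_agg_vars, the list-of-strings representation of aggregates,
and the two example groups are all assumptions made for illustration.
It only shows that restarting the counter per group yields several
"?agg_1", while one counter per query translation keeps the names
distinct and records each (?agg_i, Ai) pairing in E.

```python
from itertools import count


def allocate_agg_vars(groups, counter):
    """Replace each aggregate by a fresh ?agg_i drawn from `counter`.

    `groups` is a list of groups, each a list of aggregate expressions
    given as plain strings (a simplification for this sketch).
    Returns the rewritten groups and E, the list of (variable, aggregate)
    pairs, in the spirit of suggestion 3a/.
    """
    E = []
    rewritten = []
    for aggregates in groups:
        names = []
        for agg in aggregates:
            var = f"?agg_{next(counter)}"
            E.append((var, agg))
            names.append(var)
        rewritten.append(names)
    return rewritten, E


# Hypothetical query with two grouped parts, one aggregate each.
groups = [["count(*)"], ["sum(?v)"]]

# Counter restarted for every group: both aggregates are named ?agg_1.
clash = [allocate_agg_vars([g], count(1))[0][0] for g in groups]

# One counter shared across the whole translation: names stay distinct.
rewritten, E = allocate_agg_vars(groups, count(1))
```

With the shared counter, E carries the association between each
aggregate and its variable, so evaluation does not have to rebuild the
names from their position.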
Received on Monday, 28 November 2011 12:21:15 UTC