- From: Andy Seaborne <andy.seaborne@talis.com>
- Date: Wed, 09 Jun 2010 15:31:04 +0100
- To: Steve Harris <steve.harris@garlik.com>
- CC: Lee Feigenbaum <lee@thefigtrees.net>, Axel Polleres <axel.polleres@deri.org>, SPARQL Working Group <public-rdf-dawg@w3.org>
On 09/06/2010 1:43 PM, Steve Harris wrote:
> On 2010-06-09, at 10:23, Andy Seaborne wrote:
>>
>> On 09/06/2010 10:08 AM, Steve Harris wrote:
>>>> which leads me to a fairly natural interpretation of
>>>>>
>>>>> SELECT ?s ?p
>>>>> {
>>>>>   ?s ?p ?p
>>>>> } GROUP BY ?s ?p
>>>>>
>>>>> as "null aggregation"
>>> I don't understand the term "null aggregation".
>>
>> It is a term earlier in the thread to capture the idea that SELECT/GROUP with no aggregators mentioned fitted into the current framework with an implicit aggregator that did nothing.
>
> This is captured in the current draft with:
>
> "Definition: Group
>
> Group evaluates a list of expressions against a solution sequence, producing a set of partial functions from keys to solution sequences.
>
> The behaviour of Group is different when ExprList is empty.
>
> Group((), Ω) = { 1 -> Ω }

That covers the case of

  SELECT (count(*) AS ?C) { ?s ?p ?o }

It's the case where there is no count or other aggregator:

  SELECT ?s { ?s ?p ?o } GROUP BY ?s

that triggers the idea of "null aggregation". Here there is no count or other aggregation.

  AggregateJoin(A) = { { aggi → range(Ai) } | dom(Ai) = k, k in set-union(dom(A)) }
                   = {}

>
> Group(ExprList, Ω) = { ListEval(ExprList, μ) -> { μ' | μ' in Ω, ListEval(ExprList, μ) = ListEval(ExprList, μ') } | μ in Ω }"

We have the group operation producing the functions from key to multiset (cardinality = cardinality of μ' in Ω).

I'm looking for something that produces the query solution, given

Maybe we need an algebra operation (group)

  group ( things to group by,
          aggregators to apply,    # AggregateJoin
          pattern to work on )
     -> set of { agg expression -> value }

I chose the name "group" to make it the overall operation -- we already have Group(ExprList, Ω), which is different, so maybe rename that as the partition function, as per the language Chimie used and as in the draft WD-sparql11-query-20091022/. That draft also has a key() function to generate the keys needed to put into the query solution.

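The partition reading of Group, and the way the keys alone can supply solutions when no aggregator is mentioned, can be sketched in Python. This is only an illustration under assumptions: solutions are modelled as plain dicts, grouping expressions as functions over a solution, and `list_eval`, `group`, `var` and the sample data are hypothetical names rather than anything taken from the draft.

```python
from collections import defaultdict


def list_eval(expr_list, mu):
    """Evaluate each grouping expression against one solution (a dict)."""
    return tuple(expr(mu) for expr in expr_list)


def group(expr_list, omega):
    """Partition a solution sequence by key, following the quoted Group definition."""
    if not expr_list:
        # Group((), Ω) = { 1 -> Ω }: a single partition under the key 1.
        return {1: list(omega)}
    partitions = defaultdict(list)
    for mu in omega:
        # Solutions with equal ListEval results land in the same partition.
        partitions[list_eval(expr_list, mu)].append(mu)
    return dict(partitions)


def var(name):
    """Hypothetical helper: an expression that reads one variable."""
    return lambda mu: mu.get(name)


# Made-up solutions for the pattern { ?s ?p ?o }.
omega = [
    {"s": "a", "p": "p1", "o": 1},
    {"s": "a", "p": "p2", "o": 2},
    {"s": "b", "p": "p1", "o": 3},
]

# SELECT ?s { ?s ?p ?o } GROUP BY ?s
# No aggregators are applied, so the solutions are built from the keys only.
partitions = group([var("s")], omega)
solutions = [{"s": key[0]} for key in partitions]
```

With no aggregator to apply, the only thing left to project is the key of each partition, which is the role a key() function would play.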
It might be easier to make AggregateJoin assign the aggregation values to fresh variables, then assign to their AS names for the case:

  AggregateJoin(A) = { (?fresh, aggi → range(Ai)) } ...
  # A pair (variable, value)

  SELECT ?s (sum(?o) AS ?sum) { ?s ?p ?o } HAVING (count(*) > 10)

The project/select expression is then decoupled from the group/aggregation process (and has a filter in the way in this case anyway).

Using the AggregateJoin aggregation as the expression name might also work, but it needs general expressions changed to write HAVING (count(*) > 10), because there the reference to the result of count(*) isn't an RDF term or a variable, which is what expressions currently work with. Variables are a convenient way to deal with the value of an expression.

Given your formalization of how aggregation happens, it looks like we have all the bits and pieces needed.

	Andy
Received on Wednesday, 9 June 2010 14:39:38 UTC