Re: Nested Aggregate Expressions

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Tue, 05 Jun 2012 11:42:59 +0100
Message-ID: <4FCDE2B3.8000208@epimorphics.com>
To: public-rdf-dawg@w3.org


On 04/06/12 23:45, Steve Harris wrote:
> On 3 Jun 2012, at 16:03, Andy Seaborne wrote:
>
>> Can aggregate expressions contain aggregates?
>>
>> This comes from a recent email [1] and while these may well not actually address the application goal (a subquery was meant - see thread), the SPARQL 1.1 Query spec does permit these unusual queries.
>>
>> Example:
>>
>> PREFIX ex:<http://example.org/meals#>
>> SELECT (AVG(?mealPrice * (1.0 + MAX( ?mealTip / ?mealPrice)))
>>            AS ?avgCostWithBestTip)
>> WHERE {
>>   ?description ex:mealPrice ?mealPrice .
>>   ?description ex:mealTip ?mealTip .
>> } GROUP BY ?description
>>
>> i.e. an aggregate inside an aggregate: AVG(?x * MAX(?y))
>>
>>
>> One line of argument is that the expression inside the aggregate is applied to each row, so only row variables should be considered in-scope.  The aggregate AVG(max(?x)+1) is violating that as max(?x) is not a per-row expression.

(Birte) - yes, this needs clarifying if we wish to rule it out, and 
possibly even if we don't.

As the spec stands, I *think* it says it's not allowed:

[[
Definition Group:

Group evaluates a list of expressions against a solution sequence
...
]]

and the solution sequence is that of the grouped pattern, not the 
result after aggregation or select expressions.

[[
Definition: Aggregation
]]
talks about applying the aggregate to the solution sequences collected 
into a map of key to multiset.

i.e. the aggregate is evaluated over the pattern, not over other 
aggregates and not over select expressions.
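
If the nested form is ruled out, the two levels of evaluation can still 
be written explicitly with a subquery. This is only a sketch of one 
possible reading of the example: I am assuming the best tip rate is 
meant to be taken over all meals, and the names ?d, ?p, ?t and 
?bestTipRate are made up for illustration.

```sparql
PREFIX ex: <http://example.org/meals#>

# Outer query: average the meal price, scaled by the single best tip rate.
SELECT (AVG(?mealPrice * (1.0 + ?bestTipRate)) AS ?avgCostWithBestTip)
WHERE {
  ?description ex:mealPrice ?mealPrice .
  {
    # Inner query: one row holding the highest tip/price ratio.
    # Only ?bestTipRate is projected; ?d, ?p and ?t stay local.
    SELECT (MAX(?t / ?p) AS ?bestTipRate)
    WHERE {
      ?d ex:mealPrice ?p ;
         ex:mealTip   ?t .
    }
  }
}
```

Here each aggregate is evaluated over a pattern: MAX over the inner 
pattern, AVG over the outer pattern joined with the one-row subquery.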

Steve - opinion?


>>
>>
>> What ARQ does is to calculate the aggregates of a group as the group streams past; it does not wait until the end of evaluation of the whole block when all the elements of all the groups are known.
>>
>>
>> Related to this is the interaction with select expressions:
>>
>> SELECT (max(?x) As ?M) (avg(?M+1) AS ?A)
>>
>> because the select expression rules say you can use ?M inside AVG().
>>
>> If we wish to forbid this, we can do it quite easily with a parser rule that aggregates can't appear in the expression of an aggregate, which is a simple static check.
>
> Oh boy, it's certainly wacky.
>
> That parse rule wouldn't rule out the use of ?M above though anyway, would it?

Complicated :-)

As I read the spec, the ?Ms are different.

(max(?x) As ?M) -- select expression

avg(?M+1) -- undefined variable in the grouped pattern that is never 
mentioned or bound.

Like writing

avg(?noSuchVariable+1)
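
For comparison, if the intent were to aggregate over the value of the 
select expression, a subquery would put ?M in scope for the outer 
aggregate. A sketch only; the ex: prefix and the ex:value property are 
hypothetical, not from the original query.

```sparql
PREFIX ex: <http://example.org/ns#>

# The subquery binds ?M, so the outer AVG sees it as a pattern variable.
SELECT (AVG(?M + 1) AS ?A)
WHERE {
  { SELECT (MAX(?x) AS ?M) WHERE { ?s ex:value ?x } }
}
```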

-----

Turning this round:

Does anyone have a use case that suggests it should be legal?

	Andy

>
> - Steve
>
Received on Tuesday, 5 June 2012 10:43:31 GMT
