- From: Steve Harris <steve.harris@garlik.com>
- Date: Tue, 5 Jun 2012 06:50:12 -0700
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: public-rdf-dawg@w3.org
On 5 Jun 2012, at 03:42, Andy Seaborne wrote:
>
> On 04/06/12 23:45, Steve Harris wrote:
>> On 3 Jun 2012, at 16:03, Andy Seaborne wrote:
>>
>>> Can aggregate expressions contain aggregates?
>>>
>>> This comes from a recent email [1] and while these may well not actually address the application goal (a subquery was meant - see thread), the SPARQL 1.1 Query spec does permit these unusual queries.
>>>
>>> Example:
>>>
>>> PREFIX ex:<http://example.org/meals#>
>>> SELECT (AVG(?mealPrice * (1.0 + MAX( ?mealTip / ?mealPrice)))
>>> AS ?avgCostWithBestTip)
>>> WHERE {
>>> ?description ex:mealPrice ?mealPrice .
>>> ?description ex:mealTip ?mealTip .
>>> } GROUP BY ?description
>>>
>>> i.e. an aggregate inside an aggregate: AVG(?x * MAX (?y) )
>>>
>>>
>>> One line of argument is that the expression inside the aggregate is applied to each row, so only row variables should be considered in-scope. The aggregate AVG(max(?x)+1) is violating that as max(?x) is not a per-row expression.
>
> (Birte) - yes this needs clarifying if we wish to rule it out, and possibly even if we don't.
>
> As the spec stands, I *think* it says it's not allowed:
>
> [[
> Definition Group:
>
> Group evaluates a list of expressions against a solution sequence
> ...
> ]]
>
> and the solution sequence is the grouped patterns, not after aggregation or select expressions.
>
> [[
> Definition: Aggregation
> ]]
> talks about applying the aggregate to the solution sequences collected into a map of key to multiset.
>
> i.e. the aggregate is evaluated over the pattern, not other aggregates and not select expressions
>
> Steve - opinion?
Yes, I think the definition says that you won't get a result, but it could probably be clearer about what happens (error or unbound). Or do you think it's sufficiently clear?
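To make that reading concrete, here is a rough Python sketch of Group followed by Aggregation as described above: solutions are collected into a map of key to multiset, the aggregate's expression is evaluated once per row, and only then is the set function applied. The function names (`group`, `aggregate`) and the sample data are made up for illustration; this is not how ARQ or any other engine implements it.

```python
from collections import defaultdict

def group(solutions, key_vars):
    """Collect solutions into a map from group key to the rows in that group."""
    groups = defaultdict(list)
    for row in solutions:
        key = tuple(row.get(v) for v in key_vars)
        groups[key].append(row)
    return dict(groups)

def aggregate(groups, expr, set_fn):
    """Evaluate expr against each row of a group, then reduce the values.

    expr only ever sees one row of the grouped pattern, so it has no access
    to another aggregate's result or to a select expression.
    """
    return {key: set_fn([expr(row) for row in rows])
            for key, rows in groups.items()}

# Hypothetical solutions for the meals example (values invented).
solutions = [
    {"description": "lunch", "mealPrice": 10.0, "mealTip": 2.0},
    {"description": "lunch", "mealPrice": 20.0, "mealTip": 2.0},
    {"description": "dinner", "mealPrice": 40.0, "mealTip": 4.0},
]

groups = group(solutions, ["description"])
# MAX(?mealTip / ?mealPrice): a per-row expression, reduced per group.
best_tip = aggregate(groups, lambda r: r["mealTip"] / r["mealPrice"], max)
# AVG(?mealPrice): likewise a per-row expression.
avg_price = aggregate(groups, lambda r: r["mealPrice"],
                      lambda xs: sum(xs) / len(xs))
```

In this model there is simply no point at which `best_tip` is visible to the expression inside another aggregate, which is why AVG(?x * MAX(?y)) has nothing sensible to evaluate per row.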
>>> What ARQ does is to calculate the aggregates of a group as the group streams past; it does not wait until the end of evaluation of the whole block when all the elements of all the groups are known.
>>>
>>>
>>> Related to this is the interaction with select expressions:
>>>
>>> SELECT (max(?x) AS ?M) (avg(?M+1) AS ?A)
>>>
>>> because the select expression rules say you can use ?M inside AVG().
>>>
>>> If we wish to forbid this, we can do it quite easily by having a parser rule that aggregates can't appear in expression for the aggregate, which is a simple static check.
>>
>> Oh boy, it's certainly wacky.
>>
>> That parse rule wouldn't rule out the use of ?M above though anyway, would it?
>
> Complicated :-)
>
> As I read the spec, the ?Ms are different.
>
> (max(?x) AS ?M) -- select expression
>
> avg(?M+1) -- undefined variable in the grouped pattern that is never mentioned or bound.
>
> Like writing
>
> avg(?noSuchVariable+1)
Right.
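A small Python sketch of that reading, for what it's worth. The rows of the grouped pattern bind only ?x, so ?M inside avg() behaves exactly like ?noSuchVariable. Two things here are my assumptions and not something the spec text quoted above settles: an unbound variable is modelled as an evaluation error (None), and any error in the inputs leaves the aggregate unbound. The helper names are invented for the example.

```python
def eval_per_row(rows, expr):
    """Evaluate expr on each row; an unbound variable is recorded as an error."""
    values = []
    for row in rows:
        try:
            values.append(expr(row))
        except KeyError:          # variable not bound in this row
            values.append(None)   # None stands in for an evaluation error
    return values

def avg(values):
    """Assumed behaviour: any error in the inputs leaves the result unbound.

    Empty input is also treated as unbound, only to keep the sketch short.
    """
    if not values or any(v is None for v in values):
        return None
    return sum(values) / len(values)

# Rows of the grouped pattern; only ?x is bound.
rows = [{"x": 1}, {"x": 5}, {"x": 3}]

# (max(?x) AS ?M): the select expression, bound after aggregation.
m = max(r["x"] for r in rows)

# avg(?M+1): ?M is not in any row of the grouped pattern.
a = avg(eval_per_row(rows, lambda r: r["M"] + 1))

# avg(?noSuchVariable+1): indistinguishable from the case above.
same = avg(eval_per_row(rows, lambda r: r["noSuchVariable"] + 1))
```

Under those assumptions `m` is 5 while both `a` and `same` come out unbound, i.e. the two ?Ms never meet.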
> Does anyone have a use case that suggests it should be legal?
I don't see how it can really have any useful results.
- Steve
--
Steve Harris, CTO
Garlik, a part of Experian
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203 http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, NG2 Business Park, Nottingham, Nottinghamshire, England NG80 1ZZ
Received on Tuesday, 5 June 2012 13:51:06 UTC