Re: Nested Aggregate Expressions

On 5 Jun 2012, at 08:00, Birte Glimm wrote:

> [snip]
>>>> One line of argument is that the expression inside the aggregate is
>>>> applied to each row, so only row variables should be considered in-scope.
>>>>  The aggregate AVG(max(?x)+1) is violating that as max(?x) is not a per-row
>>>> expression.
>> 
>> (Birte) - yes this needs clarifying if we wish to rule it out, and possibly
>> even if we don't.
>> 
>> As the spec stands, I *think* it says it's not allowed:
>> 
>> [[
>> Definition Group:
>> 
>> Group evaluates a list of expressions against a solution sequence
>> ...
>> ]]
>> 
>> and the solution sequence is the grouped patterns, not after aggregation or
>> select expressions.
>> 
>> [[
>> Definition: Aggregation
>> ]]
>> talks about applying the aggregate to the solution sequences collected into
>> a map of key to multiset.
>> 
>> i.e. the aggregate is evaluated over the pattern, not over other aggregates
>> and not over select expressions.
> 
> I think the definition just cannot handle the current case, but it is
> not forbidden, just undefined. IMO, either the definition has to be
> extended or the current case has to be forbidden. Maybe it is illegal
> due to some hidden constraints, but that should be made explicit.

Well, another option is to make it (explicitly) undefined. ANSI C does that a lot.
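
For comparison, the "forbid it" option comes down to the simple static check
described in the quoted text below: reject any aggregate whose argument
expression itself contains an aggregate. A minimal sketch of such a check, in
Python, assuming a made-up tuple-based expression tree; the names AGGREGATES,
contains_aggregate and check_no_nested_aggregates are hypothetical and do not
come from ARQ or any other parser:

```python
# Hypothetical expression tree (an assumption for illustration only):
# a variable is a str such as "?x", a constant is a number, and a call is
# a tuple (function_name, arg1, arg2, ...).
AGGREGATES = {"count", "sum", "min", "max", "avg", "sample", "group_concat"}


def contains_aggregate(expr):
    """Return True if an aggregate call appears anywhere in expr."""
    if not isinstance(expr, tuple):
        return False  # variables and constants are leaves
    name, *args = expr
    if name.lower() in AGGREGATES:
        return True
    return any(contains_aggregate(arg) for arg in args)


def check_no_nested_aggregates(expr):
    """Raise ValueError if an aggregate's argument contains an aggregate."""
    if not isinstance(expr, tuple):
        return
    name, *args = expr
    for arg in args:
        if name.lower() in AGGREGATES and contains_aggregate(arg):
            raise ValueError(f"aggregate nested inside {name.upper()}()")
        check_no_nested_aggregates(arg)


# AVG(?x + 1) passes the check; AVG(MAX(?x) + 1) is rejected.
check_no_nested_aggregates(("avg", ("+", "?x", 1)))
try:
    check_no_nested_aggregates(("avg", ("+", ("max", "?x"), 1)))
except ValueError as err:
    print("rejected:", err)
```

A purely syntactic check like this still accepts avg(?M+1), since ?M is just a
variable as far as the parser is concerned, so the select-expression case
would need to be settled separately.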

- Steve

>>>> 
>>>> What ARQ does is to calculate the aggregates of a group as the group
>>>> streams past; it does not wait until the end of evaluation of the whole
>>>> block when all the elements of all the groups are known.
>>>> 
>>>> 
>>>> Related to this is the interaction with select expressions:
>>>> 
>>>> SELECT (max(?x) As ?M) (avg(?M+1) AS ?A)
>>>> 
>>>> because the select expression rules say you can use ?M inside AVG().
>>>> 
>>>> If we wish to forbid this, we can do it quite easily by having a parser
>>>> rule that aggregates can't appear in the expression for an aggregate,
>>>> which is a simple static check.
>>> 
>>> 
>>> Oh boy, it's certainly wacky.
>>> 
>>> That parse rule wouldn't rule out the use of ?M above though anyway, would
>>> it?
>> 
>> 
>> Complicated :-)
>> 
>> As I read the spec, the ?Ms are different.
>> 
>> (max(?x) As ?M) -- select expression
>> 
>> avg(?M+1) -- undefined variable in the grouped pattern that is never
>> mentioned or bound.
>> 
>> Like writing
>> 
>> avg(?noSuchVariable+1)
>> 
>> -----
>> 
>> Turning this round:
>> 
>> Does anyone have a use case that suggests it should be legal?
>> 
>>        Andy
>> 
>>> 
>>> - Steve
>>> 
>> 

-- 
Steve Harris, CTO
Garlik, a part of Experian 
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, NG2 Business Park, Nottingham, Nottinghamshire, England NG80 1ZZ

Received on Tuesday, 5 June 2012 18:28:31 UTC