- From: Steve Harris <steve.harris@garlik.com>
- Date: Tue, 9 Feb 2010 17:23:56 +0000
- To: Andy Seaborne <andy.seaborne@talis.com>
- Cc: Lee Feigenbaum <lee@thefigtrees.net>, "public-rdf-dawg@w3.org Group" <public-rdf-dawg@w3.org>
On 9 Feb 2010, at 13:53, Andy Seaborne wrote:
> On 09/02/2010 1:34 PM, Steve Harris wrote:
...
>> if M = your solution multiset above.
>> M' = M(fn:floor((?x+1)/2))) { 1, 1, 1 }
>> M'' = DISTINCT M' { 1 }
>> result = Count(M'') 1
>>
>> This is how aggregates are defined in SQL, and I can't think of any
>> pressing reason to depart from that.
>>
>>> which is applying the DISTINCT after the implicit projection (case
>>> 1)
>>> and after expression evaluation (case 2).
>>>
>>> I thought Steve and I were mostly agreeing, except over whether one
>>> can name the DISTINCT and non-DISTINCT versions with URIs.
>>
>> That's the essence I think. But maybe you're proposing something
>> different?
>
> An aggregate function takes the partition as a multiset and also
> it's and its expressions. It decides on DISTINCT or not - this is
> moving the ExprMultiset machinery inside the aggregate.
I don't find that very tempting, it moves a lot of unnecessary
complexity into the aggregate, rather than it being a simple set
function.
> It has a very similar effect but allows the aggregate to decide how
> to handle unbounds, expression errors etc (in SQL, it is the
> aggregate function that decides although all the standard ones do
> the same).
Not in ANSI SQL.
COUNT(*) is a special case.
Otherwise, you eliminate NULL values that result in the application of
the expression to the table T, giving TX.
If DISTINCT is specified, then let TXA be the result of eliminating
redundant duplicate values from TX. Otherwise, let TXA be TX.
Then you apply the "set function" (aggregate) to TXA.
There's a discussion of this at http://www.dbazine.com/ofinterest/oi-articles/celko14
> Different URIs name different aggregators so can have different
> DISTINCT effects.
>
> I found this the most concise description:
> http://www.postgresql.org/docs/8.4/static/sql-expressions.html#SYNTAX-AGGREGATES
Note that Postgres does not require you to define separate aggregates
for the DISTINCT and ALL behaviours either, even for user-defined
aggregates.
The SQL spec in this area is unusually readable, and quite clear. The
only obvious way that Postgres differs is that it allows user-defined
aggregates to regard NULL values, but that syntax page doesn't really
say very much.
- Steve
--
Steve Harris, Garlik Limited
2 Sheen Road, Richmond, TW9 1AE, UK
+44 20 8973 2465 http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10
9AD
Received on Tuesday, 9 February 2010 17:24:29 UTC