Fwd: SUM aggregate operator and non-numeric literals

On the surface, Jeen's reasoning makes sense to me.

Steve, did we/you consider defining SUM in terms of "+" instead of in
terms of op:numeric-add?

Lee

-------- Original Message --------
Subject: SUM aggregate operator and non-numeric literals
Resent-Date: Thu, 23 Jun 2011 01:05:51 +0000
Resent-From: public-rdf-dawg-comments@w3.org
Date: Thu, 23 Jun 2011 13:05:10 +1200
From: Jeen Broekstra <jeen.broekstra@gmail.com>
To: public-rdf-dawg-comments@w3.org


Hi DAWG,

The current definition of SUM (section 18.4) is as follows:

==begin quote==
Definition: Sum
numeric Sum(multiset M)

The Sum set function is used by the SUM aggregate in the syntax.

Sum(M) = Sum(ToList(Flatten(M))).

Sum(S) = op:numeric-add(S1, Sum(S2..n)) when card[S] > 1
Sum(S) = op:numeric-add(S1, 0) when card[S] = 1
Sum(S) = 0 when card[S] = 0

In this way, Sum({1, 2, 3}) = op:numeric-add(1, op:numeric-add(2,
op:numeric-add(3, 0))).
==end quote==

Given that the definition of SUM is directly in terms of the
op:numeric-add XPath function, it follows that it can only be applied on
numeric literals. Therefore, any SUM that aggregates over a set of
values that contains a non-numeric type will result in a type error. Not
even an extension of the SPARQL operator table in section 17.3 will
help, as SUM is not defined in terms of those operators.
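
To make the recursion concrete, here is a minimal Python sketch of the
quoted definition. It is illustrative only, not spec text: numeric_add is
a hypothetical stand-in for op:numeric-add that models the type error as
a Python TypeError, and plain Python values stand in for RDF literals (a
str for a plain literal such as "1", an int for an xsd:integer).

```python
from numbers import Number


def numeric_add(a, b):
    """Hypothetical stand-in for op:numeric-add: both operands must be numeric."""
    for operand in (a, b):
        # bool is a Number subclass in Python, so exclude it explicitly.
        if isinstance(operand, bool) or not isinstance(operand, Number):
            raise TypeError(f"non-numeric operand: {operand!r}")
    return a + b


def sparql_sum(values):
    """Mirror the quoted recursion: Sum(S) = op:numeric-add(S1, Sum(S2..n))."""
    if not values:               # card[S] = 0
        return 0
    if len(values) == 1:         # card[S] = 1
        return numeric_add(values[0], 0)
    return numeric_add(values[0], sparql_sum(values[1:]))  # card[S] > 1
```

Under these assumptions, sparql_sum([1, 2, 3]) unfolds exactly like the
quoted example, while a single non-numeric member anywhere in the list
makes the whole call fail.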

In other words, if we have the following data:

:a rdf:value "1" .
:a rdf:value "2"^^xsd:integer .
:b rdf:value "3"^^xsd:integer .

And the following query:

SELECT (SUM(?val) as ?value)
WHERE {
    ?a rdf:value ?val .
} GROUP BY ?a

The result will always be a type error.

I would argue that having the same extensibility mechanisms available
for SUM as we have for, for example, the + operator would be preferable.
That way, implementations wanting to offer a more forgiving version of
the SUM operator (one which silently ignores the non-numerics, for
example), could do so while staying spec-compliant.
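
As a rough sketch of what such a forgiving variant might look like (in
Python, purely for illustration): lenient_sum and is_numeric are
hypothetical names, and Python values again stand in for RDF literals,
with a str modelling the plain literal "1".

```python
from numbers import Number


def is_numeric(value):
    # Assumption: Python ints/floats model numeric literals; bool is excluded.
    return isinstance(value, Number) and not isinstance(value, bool)


def lenient_sum(values):
    """Hypothetical forgiving SUM: skip non-numeric values instead of erroring."""
    return sum(v for v in values if is_numeric(v))


# The example data above, grouped by subject as GROUP BY ?a would do.
groups = {":a": ["1", 2], ":b": [3]}
totals = {subject: lenient_sum(vals) for subject, vals in groups.items()}
```

With this behaviour the plain literal "1" is simply dropped, so the :a
group sums only its xsd:integer value.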


Regards,

Jeen

Received on Saturday, 25 June 2011 15:19:08 UTC