Fwd: SUM aggregate operator and non-numeric literals

From: Lee Feigenbaum <lee@thefigtrees.net>
Date: Sat, 25 Jun 2011 11:18:39 -0400
Message-ID: <4E05FC4F.6040407@thefigtrees.net>
To: SPARQL Working Group <public-rdf-dawg@w3.org>

On the surface, Jeen's reasoning makes sense to me.

Steve, did we/you consider defining SUM in terms of "+" instead of in
terms of op:numeric-add?

Lee

-------- Original Message --------
Subject: SUM aggregate operator and non-numeric literals
Resent-Date: Thu, 23 Jun 2011 01:05:51 +0000
Resent-From: public-rdf-dawg-comments@w3.org
Date: Thu, 23 Jun 2011 13:05:10 +1200
From: Jeen Broekstra <jeen.broekstra@gmail.com>
To: public-rdf-dawg-comments@w3.org


Hi DAWG,

The current definition of SUM (section 18.4) is as follows:

==begin quote==
Definition: Sum
numeric Sum(multiset M)

The Sum set function is used by the SUM aggregate in the syntax.

Sum(M) = Sum(ToList(Flatten(M))).

Sum(S) = op:numeric-add(S1, Sum(S2..n)) when card[S] > 1
Sum(S) = op:numeric-add(S1, 0) when card[S] = 1
Sum(S) = 0 when card[S] = 0

In this way, Sum({1, 2, 3}) = op:numeric-add(1, op:numeric-add(2,
op:numeric-add(3, 0))).
==end quote==

Given that the definition of SUM is directly in terms of the
op:numeric-add XPath function, it follows that it can only be applied to
numeric literals. Therefore, any SUM that aggregates over a set of
values that contains a non-numeric type will result in a type error. Not
even an extension of the SPARQL operator table in section 17.3 will
help, as SUM is not defined in terms of those operators.

In other words, if we have the following data:

:a rdf:value "1" .
:a rdf:value "2"^^xsd:integer .
:b rdf:value "3"^^xsd:integer .

And the following query:

SELECT (SUM(?val) as ?value)
WHERE {
    ?a rdf:value ?val .
} GROUP BY ?a

The result for :a will always be a type error, since its group contains
the plain literal "1".
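
To make the failure concrete, here is a minimal Python sketch of the
quoted Sum definition applied to the data above. It is only an
illustration under stated assumptions: the plain literal "1" is modelled
as a Python str, the xsd:integer literals as ints, and a type error as
None (an unbound ?value). The names numeric_add, sparql_sum,
sum_by_group and SparqlTypeError are hypothetical and do not come from
the spec or from any implementation.

```python
from collections import defaultdict


class SparqlTypeError(Exception):
    """Stands in for a SPARQL type error (hypothetical name)."""


def numeric_add(a, b):
    """Strict model of op:numeric-add: both operands must be numeric."""
    for operand in (a, b):
        if isinstance(operand, bool) or not isinstance(operand, (int, float)):
            raise SparqlTypeError(f"non-numeric operand: {operand!r}")
    return a + b


def sparql_sum(values):
    """Recursive Sum(S) following the quoted definition.

    Sum of the empty list is 0; otherwise add the head to the sum of the tail.
    For a one-element list this yields numeric_add(S1, 0), as in the spec text.
    """
    if len(values) == 0:
        return 0
    return numeric_add(values[0], sparql_sum(values[1:]))


# The three triples from the example: (subject, value).
# Assumption: a str models the plain literal, an int models xsd:integer.
DATA = [(":a", "1"), (":a", 2), (":b", 3)]


def sum_by_group(rows):
    """Model of SELECT (SUM(?val) AS ?value) ... GROUP BY ?a."""
    groups = defaultdict(list)
    for subject, value in rows:
        groups[subject].append(value)
    results = {}
    for subject, values in groups.items():
        try:
            results[subject] = sparql_sum(values)
        except SparqlTypeError:
            results[subject] = None  # type error: ?value stays unbound
    return results


print(sum_by_group(DATA))
```

In this model the :a group fails as soon as the plain literal reaches
numeric_add, while the :b group, which holds a single integer, sums to 3.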

I would argue that having the same extensibility mechanisms available
for SUM as we have for, for example, the + operator would be preferable.
That way, implementations wanting to offer a more forgiving version of
the SUM operator (one which silently ignores the non-numerics, for
example), could do so while staying spec-compliant.
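
As a rough sketch of what that extensibility could look like, the Python
below defines the sum as a fold over a pluggable "+" operator rather
than over a fixed numeric add. Everything here is an assumption made for
illustration: make_sum, strict_add and forgiving_add are hypothetical
names, a str again models a plain literal, and "silently ignore
non-numerics" is just one possible extension, not something the spec
prescribes.

```python
def is_numeric(value):
    """True for values that model numeric literals (bool is excluded)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def strict_add(a, b):
    """Models the unextended "+" operator: non-numeric operands are an error."""
    if not (is_numeric(a) and is_numeric(b)):
        raise TypeError(f"cannot add {a!r} and {b!r}")
    return a + b


def forgiving_add(a, b):
    """Hypothetical extension of "+": a non-numeric operand contributes nothing."""
    if not is_numeric(a):
        return b if is_numeric(b) else 0
    if not is_numeric(b):
        return a
    return a + b


def make_sum(add):
    """Build a SUM aggregate on top of whichever "+" the implementation offers."""
    def aggregate(values):
        total = 0
        # Right fold, mirroring add(S1, add(S2, ... add(Sn, 0))).
        for value in reversed(values):
            total = add(value, total)
        return total
    return aggregate


strict_sum = make_sum(strict_add)
forgiving_sum = make_sum(forgiving_add)

print(forgiving_sum(["1", 2]))  # the plain literal "1" is skipped
```

With this shape, an implementation that extends "+" gets a matching SUM
for free, and one that does not extend it keeps the strict behaviour.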


Regards,

Jeen
Received on Saturday, 25 June 2011 15:19:08 GMT