W3C home > Mailing lists > Public > public-rdf-dawg@w3.org > October to December 2009

Re: Semantics of SUM

From: Andy Seaborne <andy.seaborne@talis.com>
Date: Fri, 13 Nov 2009 17:05:08 +0000
Message-ID: <4AFD91C4.8030700@talis.com>
To: Steve Harris <steve.harris@garlik.com>
CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>
I see treating unbound and errors different as inconsistent and that it 
should be up to the aggregate to decide what to do on errors (and 
unbounds) because ?x is as much an expression as ?x+0.

In the case of SUM(expression) if it skips unbound in SUM(?x) then I 
think it should skip SUM(?x+0).

 > In SQL, NULLs are dropped everywhere except COUNT(*), before the
 > aggregate function is invoked.

So I didn't know what SQL does so I tried MySQL and PostgreSQL ...

When an expression is used, then the rows for the invalid expression are 
dropped in MySQL including with implicit casting so it's not like just 
'+' where null+?? is null.

Summary :
   SQL standard requires nulls to be skipped in SUM.
   MySQL skips errors just like NULLs.
   PostgreSQL skips nulls and expression errors on sum(val+5)
   PostgreSQL causes an error on casting error.
   Addition with + and nulls does not work as sum over nulls.

Warning SQL follows:

** SQL 1: evaluation of +

mysql> select 1+"2"+null ;
+------------+
| 1+"2"+null |
+------------+
|       NULL |
+------------+

mysql> select 1+null ;
+--------+
| 1+null |
+--------+
|   NULL |
+--------+

** SQL 2: sum(val+5) with a null present:

mysql> select * from T ;
+------+
| val  |
+------+
|    1 |
|    2 |
| NULL |
+------+
3 rows in set (0.00 sec)

mysql> select sum(val) from T ;
+----------+
| sum(val) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)

mysql> select sum(val+5) from T ;
+------------+
| sum(val+5) |
+------------+
|         13 |
+------------+
1 row in set (0.00 sec)


** SQL 3: sum(string+5) with a null and implicit casting:

mysql> select * from T ;
+------+------+
| val  | str  |
+------+------+
|    1 | 1    |
|    2 | 2    |
| NULL | 3    |
|    4 | NULL |
+------+------+
4 rows in set (0.00 sec)

mysql> select sum(str) from T ;
+----------+
| sum(str) |
+----------+
|        6 |
+----------+
1 row in set (0.00 sec)

mysql> select sum(str+5) from T ;
+------------+
| sum(str+5) |
+------------+
|         21 |
+------------+
1 row in set (0.00 sec)

** SQL 4: expression error:

mysql> select sum(1/(val-1)) from T ;
+----------------+
| sum(1/(val-1)) |
+----------------+
|         1.3333 |
+----------------+

or (1/(2-1)) + (1/(4-1))

which I didn't expect but makes sense (to me).

PostgreSQL does not support implicit casting:

AFS=# select * from T ;
  val | str
-----+------
    1 | 1
    2 | 2
      | 3
    4 |
    5 | text
(5 rows)

AFS=# select sum(val) from T ;
  sum
-----
   12
(1 row)

AFS=# select sum(str) from T ;
ERROR:  function sum(text) does not exist
LINE 1: select sum(str) from T ;
                ^
HINT:  No function matches the given name and argument types. You may 
need to add explicit type casts.

AFS=# select sum(cast(str as integer)) from T ;
ERROR:  invalid input syntax for integer: "text"

AFS=# select sum(val+5) from T ;
  sum
-----
   32
(1 row)

And now deleting the 'text' row:

AFS=# select sum(cast(str as integer)) from T ;
  sum
-----
    6
(1 row)

AFS=# select sum(val+5) from T ;
  sum
-----
   32
(1 row)

AFS=# select sum(1/(val-1)) from T ;
ERROR:  division by zero

and

AFS=# select 1+null AS X
AFS-#  ;
  x
---

(1 row)




	Andy

> In SQL, NULLs are dropped everywhere except COUNT(*), before the
> aggregate function is invoked.
>
> [examples use MySQL]
>
>  > SELECT * FROM C;
> +------+
> | lex |
> +------+
> | 5 |
> | 23 |
> | NULL |
> +------+
>
>  > SELECT COUNT(*) FROM C;
> +----------+
> | COUNT(*) |
> +----------+
> | 3 |
> +----------+
>
>  > SELECT COUNT(lex) FROM C;
> +------------+
> | COUNT(lex) |
> +------------+
> | 2 |
> +------------+
>
>  > SELECT SUM(CAST(lex AS unsigned)) FROM C;
> +----------------------------+
> | SUM(CAST(lex AS unsigned)) |
> +----------------------------+
> | 28 |
> +----------------------------+
> [note: no warnings]
>
> - Steve
>
>
> ______________________________________________________________________
> This email has been scanned by the MessageLabs Email Security System.
> For more information please visit
> http://www.messagelabs.com/email______________________________________________________________________
>
Received on Friday, 13 November 2009 17:05:19 GMT

This archive was generated by hypermail 2.3.1 : Tuesday, 26 March 2013 16:15:40 GMT