Re: MIN and MAX in face of errors

From: Andy Seaborne <andy@apache.org>
Date: Tue, 15 Oct 2013 19:00:49 +0100
Message-ID: <525D82D1.8050109@apache.org>
To: public-sparql-dev@w3.org

Hi Jeremy,

I was hoping someone would reply to make their case, because when 
I looked at the details I saw different readings, including some with 
counter-intuitive behaviour.

Intuitively, for a mix of numbers and errors, one might expect:

  -(MIN(-?x)) == MAX(?x)
  -(MAX(-?x)) == MIN(?x)

> I am not at all sure that the spec is clear about these cases

Which sections of the spec in particular are unclear?

There's a not-endorsed-by-any-WG errata page at:

http://www.w3.org/2013/sparql-errata

as input to any future WG and corrections/improvements

- - - - - - - - - - -

Any error in a BIND leaves the assigned variable unbound; in 
particular, it does not produce an error value. 'error' as a new kind of 
value only exists in aggregation.
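
To make the BIND point concrete, here is a rough Python sketch of how the 
three UNION branches of the query quoted below could evaluate.  It is only 
a model: the names bind, lookup, add and ExprError are made up for 
illustration and are not taken from the spec or from any SPARQL engine.

```python
class ExprError(Exception):
    """Hypothetical stand-in for a SPARQL expression evaluation error."""


def add(a, b):
    # Model of '+': only plain numbers are accepted, anything else is an error.
    for v in (a, b):
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise ExprError(f"cannot add {a!r} and {b!r}")
    return a + b


def lookup(solution, var):
    # Evaluating an unbound variable inside an expression is an error.
    if var not in solution:
        raise ExprError(f"?{var} is unbound")
    return solution[var]


def bind(solution, var, expr):
    # BIND: if the expression errors, keep the solution and leave var unbound.
    try:
        value = expr(solution)
    except ExprError:
        return dict(solution)
    return {**solution, var: value}


# The three UNION branches from the quoted query.
branches = [
    bind({}, "x", lambda s: 1),
    bind({}, "x", lambda s: add(1, "x")),
    bind({}, "x", lambda s: "y"),
]


def eval_or_error(solution, expr):
    # Inside aggregation an error is kept as a value; modelled here as "error".
    try:
        return expr(solution)
    except ExprError:
        return "error"


x_plus_0 = [eval_or_error(s, lambda s: add(lookup(s, "x"), 0)) for s in branches]
print(branches)   # [{'x': 1}, {}, {'x': 'y'}]
print(x_plus_0)   # [1, 'error', 'error']
```

In this model the middle branch has no binding for ?x at all, while ?x+0 
only turns into an error once it is evaluated for the aggregate.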


MAX is more interesting than MIN here because MIN comes up with the same 
answer for different readings; MAX does not.

Looking back over WG discussions, there was a lot of consideration of 
what happens about mixed types, e.g. numbers and strings, leading to the 
choice of the ordering relation for ORDER BY being used.  There is less 
discussion about the impact of errors coming from group evaluation.

There are two problems I can see, both related to the introduction of 
'error' as an extension to the set of values (RDF terms) in aggregation.

1/ ToList() is simply not defined for the case where there is an error.

You could argue that, by SPARQL eval rules, a case that is "not covered" 
is itself an error, so Min(ToList(Flatten(M))) is an error.

Or you could argue that ToList is supposed to just create a list and pass 
the error token through.  Then we have (2) below, with extended Ψ and 
then MIN(Error, 1) is 1.

2/ Similarly, but less significantly because it only matters under a 
particular reading of (1): the ordering relation for ORDER BY does not 
deal with errors as input.  It takes expressions and a solution sequence, 
evaluates the expressions, and that evaluation may be an error.  In 
OrderBy(Ψ, condition), Ψ can't contain error (sec 18.5).
Only the case of "condition(μ)" being an error is covered.
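
As a way of experimenting with the readings, here is a small Python model 
of MIN and MAX over a group of numbers plus an error token.  Everything in 
it is an assumption for illustration: the Reading names, the ERROR 
sentinel, and above all where an error sorts relative to numbers, which is 
exactly the part the ordering relation leaves open.  The position is 
therefore a parameter rather than a claim about the spec.

```python
from enum import Enum


class Reading(Enum):
    PROPAGATE = 1      # ToList(error) is "not covered", so the aggregate is an error
    ERROR_LOWEST = 2   # error passes through and sorts below every number (assumed)
    ERROR_HIGHEST = 3  # error passes through and sorts above every number (assumed)


class _Error:
    def __repr__(self):
        return "error"


ERROR = _Error()  # hypothetical sentinel for the aggregation-only 'error' value


def negate(v):
    # Unary minus applied to an error stays an error.
    return ERROR if v is ERROR else -v


def extreme(values, reading, want_max):
    """MIN or MAX over a non-empty list of numbers and ERROR tokens."""
    has_error = any(v is ERROR for v in values)
    if has_error and reading is Reading.PROPAGATE:
        return ERROR

    def key(v):
        # Rank errors below or above all numbers, depending on the reading.
        if v is ERROR:
            return (0, 0) if reading is Reading.ERROR_LOWEST else (2, 0)
        return (1, v)

    ordered = sorted(values, key=key)
    return ordered[-1] if want_max else ordered[0]


def identity_holds(values, reading):
    """Check -(MIN(-?x)) == MAX(?x) under the given reading."""
    lhs = negate(extreme([negate(v) for v in values], reading, want_max=False))
    rhs = extreme(values, reading, want_max=True)
    return lhs == rhs


group = [1, 2, ERROR]
for reading in Reading:
    print(
        reading.name,
        extreme(group, reading, want_max=False),
        extreme(group, reading, want_max=True),
        identity_holds(group, reading),
    )
```

In this model the identity -(MIN(-?x)) == MAX(?x) only survives when the 
error propagates to the whole aggregate; with the error passed through and 
given either position in the order, it fails for this group.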

---------------------------

COUNT specifically calls errors out

N = Flatten(M)
remove error elements from N
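
A minimal Python sketch of those two steps, assuming M is modelled as a 
list of one-element lists and using a made-up ERROR sentinel for the error 
value:

```python
ERROR = object()  # hypothetical sentinel for an error element


def flatten(m):
    # Collapse a collection of lists into one flat list.
    return [item for sub in m for item in sub]


def count(m):
    n = flatten(m)
    n = [item for item in n if item is not ERROR]  # remove error elements from N
    return len(n)


# ?x+0 over the three solutions of the quoted query: one number, two errors.
print(count([[1], [ERROR], [ERROR]]))  # 1
```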

SUM, AVG -- these end up as an error either way round, if we assume 
op:numeric-add(error, ...) is defined analogously to the ToList(error) 
needed for that reading.

GroupConcat calls CONCAT, so the same applies here even though it does 
not use ToList.

SAMPLE does not use ToList; it contains the line:

"""
the only restriction is that the output value must be present in the 
input multiset.
"""

Can SAMPLE({"a", "b", "c", error}) validly return error?

---------------------------

This is worth an entry in the errata collection.

	Andy

On 10/10/13 18:21, Jeremy J Carroll wrote:
>
>
> I am wondering what is the right answer to the following query
>
> SELECT
>    (MIN(?x) as ?a)
>    (MAX(?x) as ?b)
>     (MIN(?x+0) AS ?c)
>     (MAX(?x+0) AS ?d)
> {
>    { BIND(1 as ?x)}
>    UNION
>    { BIND(1 +"x" as ?x)}
>    UNION
>    { BIND("y" as ?x)}
> }
>
> Note that the middle BIND creates an unbound ?x (which is lower
> than 1 or "y"), whereas the last BIND creates an error inside the 2nd MIN
> and MAXes … (when ?x="y")
>
> I am not at all sure that the spec is clear about these cases
>
> Jeremy
>
>
> Principal Architect
> Syapse, Inc.
Received on Tuesday, 15 October 2013 18:01:23 UTC