Re: Evaluation when there are errors in aggregates

On 01/03/11 12:02, Steve Harris wrote:
> On 2011-03-01, at 11:46, Andy Seaborne wrote:
>
>> I tried to make ARQ exactly follow the process in the spec and found that the aggregate tests don't seem to have any error coverage.
>>
>> - - - - - - - - - - - - - - - - - -
>>
>> Steve,
>>
>> I don't understand the new example in rq25:
>> --------------
>> PREFIX :<http://example.com/data/#>
>> SELECT ?g (AVG(?p) AS ?avg) ((MIN(?p) + MAX(?p)) / 2 AS ?c)
>> WHERE {
>>   ?g :p ?p .
>> }
>> GROUP BY ?g
>> --------------
>> Result:
>> ?g	?avg	?c
>> <x>	2.5	2.5
>> <z>	2.5	2.5
>> --------------
>>
>> Why not
>>
>> Result:
>> ?g	?avg	?c
>> <x>	2.5	2.5
>> <y>              2.5
>> <z>	2.5	2.5
>> representation
>
> Are you suggesting that the result with 3 solutions is correct as per working group decisions, or as per the algebra as it stands?

Yes to both.

I believe the WG decision is that AVG is an error.  MIN and MAX are not. 
AVG is an error because 1+"2" is an error. The error in AVG is handled 
by the SELECT expression mechanism for errors.

The algebra indirectly assumes the error handling but my suggested 
clarification spells it out.  If a set is defined as containing 
expression X and X is not an expression it isn't in the set implicitly.
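
A minimal sketch of that clarification, in Python purely for illustration 
(this is not ARQ code; EvalError, avg, smallest and aggregate_join are 
hypothetical names, and the ordering used by smallest is an arbitrary 
assumption):

```python
class EvalError(Exception):
    """Stands in for a SPARQL expression evaluation error (hypothetical)."""


def avg(values):
    # Numeric-only average over a non-empty group; any other value is an error.
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in values):
        raise EvalError("AVG over a non-numeric value")
    return sum(values) / len(values)


def smallest(values):
    # MIN under an assumed total order: numbers sort before strings.
    return min(values, key=lambda v: (isinstance(v, str), v))


def aggregate_join(aggregates, group):
    """Map each aggregate name to its value, leaving out any that error."""
    results = {}
    for name, fn in aggregates.items():
        try:
            results[name] = fn(group)
        except EvalError:
            # If eval(D(G), Ai) is an error, it is ignored.
            continue
    return results


# A group holding 1 and "2": AVG is an error, MIN is not.
print(aggregate_join({"agg1": avg, "agg2": smallest}, [1, "2"]))  # {'agg2': 1}
```

The errored aggregate is simply absent from the result, so anything that 
later refers to it sees an unbound value rather than an "error" value.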

(Unrelated: I think the fact that "MIN+MAX/2" is not an error is 
implementation dependent: the relationship of 1 and "2" is not 
prescribed, only that they are ordered in some way.

http://www.w3.org/TR/rdf-sparql-query/#modOrderBy

)

>> If AVG(?p) is an error, then the expression in the SELECT line is an error and so binding does not happen.
>>
>> I've worked through the formal definitions and it seems to come down to:
>>
>> eval(D(G), AggregateJoin(A, P)) = { (aggi, eval(D(G), Ai)) | Ai in A }
>>
>> and eval(D(G), Ai) being an error.
>>
>> I suggest adding:
>>
>> eval(D(G), AggregateJoin(A, P)) =
>>     { (aggi, eval(D(G), Ai)) | Ai in A , eval(D(G), Ai) not an error }
>>     # If eval(D(G), Ai) is an error, it is ignored.
>>
>> then the value for AVG is just not defined, and so (AVG(?p) AS ?avg) is handled by the usual mechanism.
>
> I was hoping to make it bubble up to use the same mechanism as Project Expressions, wouldn't that happen?
>
> The next stage after AggregateJoin will be a Project, and doesn't Project discard solutions that contain errors?

The next step out is any "extend" to do the binding of expression to the 
named variable.  That's where the error for AVG is discarded to my reading.

Project does not discard solutions. It can't change the number of rows 
at all. All project does is reduce the number of variables in the 
solution mapping.  It does not touch the value - and the value can't be 
"error" anyway.
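
A rough sketch of the difference between the two steps, again in Python 
for illustration only (EvalError, agg_ref, extend and project are 
hypothetical names, and solutions are modelled as plain dicts, which is 
an assumption of this sketch):

```python
class EvalError(Exception):
    """Stands in for a SPARQL expression evaluation error (hypothetical)."""


def agg_ref(name):
    """Build an expression that reads an aggregate value from a solution."""
    def expr(mu):
        if name not in mu:
            raise EvalError(f"{name} is unbound")
        return mu[name]
    return expr


def extend(solutions, var, expr):
    """Bind var to expr(mu) in each solution; on error, keep the row unbound."""
    out = []
    for mu in solutions:
        try:
            out.append({**mu, var: expr(mu)})
        except EvalError:
            out.append(dict(mu))  # row survives, var is just not bound
    return out


def project(solutions, variables):
    """Restrict each solution to the named variables; never drops a row."""
    return [{v: mu[v] for v in variables if v in mu} for mu in solutions]


# Three groups; the aggregate for <y> was an error, so agg1 is absent there.
rows = [{"g": "x", "agg1": 2.5}, {"g": "y"}, {"g": "z", "agg1": 2.5}]
result = project(extend(rows, "avg", agg_ref("agg1")), ["g", "avg"])
print(len(result))  # 3
```

All three rows come out the other end; the row for <y> just has no 
binding for ?avg.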

>
> - Steve
>

	Andy

Received on Tuesday, 1 March 2011 12:28:52 UTC