Re: sparql sample and undefined values from Gregory Williams on 2016-02-01 (public-sparql-dev@w3.org from January to March 2016)

From: Gregory Williams <greg@evilfunhouse.com>
Date: Sun, 31 Jan 2016 21:27:51 -0800
To: james anderson <james@dydra.com>
Cc: public-sparql-dev@w3.org, public-rdf-tests@w3.org
Message-Id: <08ECFB98-5E0D-4356-8D22-DECB40FF1C96@evilfunhouse.com>

(cc'ing RDF Tests Community Group)

After reading this thread, I noticed that the SPARQL 1.1 test suite doesn’t do much (any?) to test aggregation when error/unbound values are in the group multiset. I’ve created two potential tests for SAMPLE and COUNT:

https://github.com/kasei/rdf-tests/commit/389617a278b737ef6d61af12dfb94f9175923cc0
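
For readers following the thread, here is a minimal Python sketch of the behaviours under discussion. It is not the content of the commit above and not any engine's implementation. The `UNDEF` sentinel, the function names, and the sample group values are all hypothetical, and the COUNT behaviour reflects a reading of the SPARQL 1.1 Count definition, which removes error elements before counting.

```python
# Hypothetical model: an unbound/error member of a group multiset is
# represented by the sentinel UNDEF. These names are illustrative only.
UNDEF = object()


def sample_any(values):
    """SAMPLE that may return any member, errors included (here: the first)."""
    return values[0] if values else UNDEF


def sample_prefer_defined(values):
    """SAMPLE that returns a defined member whenever the group has one."""
    for v in values:
        if v is not UNDEF:
            return v
    return UNDEF


def count_defined(values):
    """COUNT over an expression: error/unbound members are not counted."""
    return sum(1 for v in values if v is not UNDEF)


# Assumed ?z values for one group: one error and one defined value.
group_z = [UNDEF, 10]
# Assumed group in which every member is an error.
only_undef = [UNDEF]
```

Under these assumptions both SAMPLE variants are permitted results for `group_z`; they differ only in whether the error or the defined value 10 is returned, and they agree (unbound) for `only_undef`.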

thanks,
.greg

> On Jan 30, 2016, at 9:40 AM, james anderson <james@dydra.com> wrote:
> 
> good evening;
> 
>> On 2016-01-30, at 17:47, Andy Seaborne <andy@apache.org> wrote:
>> 
>> On 30/01/16 14:24, Jörn Hees wrote:
>>> Hi,
>>> 
>>>> What was RDFLib producing?
>>> 
>>> VALUES (?x ?ys ?zs) {
>>>   (3 UNDEF 15)
>>>   (5 UNDEF 25)
>>>   (2 6 UNDEF)
>>> }
>>> 
>>> 
>>>> Both are right, though the Virtuoso one is pragmatically more useful in this specific case. There is no one "right" in general when SAMPLE is involved. Aggregation calculation retains errors and ?z of UNDEF is an error. SAMPLE picks any value from the choices, and at that point, errors are "values". See ListEval.
>>> 
>>> I understand that sample can pick an arbitrary value from its choices.
>>> When it comes to error cases, though, this seems to cause confusion, as people might not expect UNDEF to be returned when there are other values to pick from (UNDEF has to be picked in (5 UNDEF 25), whereas an actual value could be picked to give (2 6 10)).
>>> 
>>> As you put it yourself, it's pragmatically more useful, so would it hurt to put a preference like that into the standard?
>> 
>> The v1.1 standard is now fixed (frozen), but this is a good suggestion for a prospective change.  I've added it to the errata document, linked to your report, which is the best I can do.
>> 
>> https://www.w3.org/2013/sparql-errata#errata-query-16
>> 
>> Changing the recommended behaviour of systems, even if "better", was something the 1.1 WG was loath to do, and chartered not to in the case of SPARQL 1.0 -> 1.1.  In other words, any system that has faithfully implemented the standard up to now should be respected and not be impacted.
>> 
>> In the meantime, getting the implementations to agree is the way forward. If that choice is the same everywhere, then it is a better case for future errata/clarification.
> 
> this would do well to go into the regression tests as a test profile parameter and service description property.
> 
>> 
>> I have raised JENA-1126 for Jena and proposed a change to pick a defined value.
>> 
>>  Thanks
>>  Andy
>> 
>>> In any case it would be cool if there was a small example (like the one above) that clarifies the behaviour.
>>> 
>>> Jörn
>> 
>> 
>> [JENA-1126]
>> https://issues.apache.org/jira/browse/JENA-1126
>> 
>> 
> 
> ---
> james anderson | james@dydra.com | http://dydra.com

Received on Monday, 1 February 2016 05:28:23 UTC