Re: CSV/TSV results test cases and suggested adaption of JSON results test cases from Axel Polleres on 2011-08-07 (public-rdf-dawg@w3.org from July to September 2011)

From: Axel Polleres <axel.polleres@deri.org>
Date: Sun, 7 Aug 2011 23:10:53 +0200
To: Gregory Williams <greg@evilfunhouse.com>
Cc: "SPARQL Working Group" <public-rdf-dawg@w3.org>
Message-Id: <55DA3733-316E-4A4D-BCD5-DC767B11DA1C@deri.org>

Hi Greg,

My focus was not on datatype canonicalisation, but just on using an 
explicit datatype... So I'd be totally fine with changing this, e.g. 

s/"5"^^xsd:decimal/"5.5"^^xsd:decimal/

Would that be OK for you?
Or shall we keep this test case "as is" and mark it accordingly?
Or both?

Axel

On 7 Aug 2011, at 21:39, Gregory Williams wrote:

> On Aug 7, 2011, at 2:21 PM, Axel Polleres wrote:
> 
> > 1) I added (my understanding of) what CSV and TSV test cases should return in
> >
> >  http://www.w3.org/2009/sparql/docs/tests/data-sparql11/csv-tsv-res/
> >
> > the test cases are:
> >
> >   http://www.w3.org/2009/sparql/docs/tests/data-sparql11/csv-tsv-res/manifest#csv01
> 
> I'm finding these cases particularly hard to test due to potential canonicalization of datatyped literals on import. The data file contains this triple:
> 
> :s5 :p5 "5"^^xsd:decimal.
> 
> Which I believe (?) may be transformed into a canonical representation during import into the underlying store (as "5.0"^^xsd:decimal). If it is canonicalized on import and makes its way into the output serialization in the canonical form, though, it's now difficult to compare with the lossy csv results:
> 
> http://example.org/s5,http://example.org/p5,5,,
> 
> which has "5" as the corresponding csv value. In non-lossy result formats this wasn't a problem because the result record had the xsd:decimal type attached to the "5" value, and the comparison could be done using a D-entailment corresponding to the canonicalization process. Without that datatype information, though, it's impossible to know if "5" and "5.0" should compare as equal because "5" might have started out as an xsd:decimal (true), an xsd:string (false), or anything else that could produce that lexical form in the CSV results.
> 
> My questions are:
> 
> * Have I understood the issue correctly?
> * If so, is this just something I'm going to have to work around?
> * Could the tests be annotated in such a way as to indicate that this might be an issue (a la mf:feature)?
> * Could we add csv/tsv tests that don't have this canonicalization problem for the common xsd datatypes?
> 
> thanks,
> .greg
> 
>
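
The comparison problem discussed in this thread can be sketched in a few lines of Python. This is only an illustration, not code from any SPARQL implementation or test harness: the helper names (canonical_decimal, typed_equal, csv_equal) are hypothetical, and the store is assumed to rewrite xsd:decimal literals into the "5.0"-style canonical form mentioned above.

```python
from decimal import Decimal

XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def canonical_decimal(lexical):
    """Hypothetical helper: "5.0"-style canonical form of an xsd:decimal."""
    text = format(Decimal(lexical), "f")
    if "." in text:
        # Drop trailing zeros but keep at least one fractional digit.
        text = text.rstrip("0")
        if text.endswith("."):
            text += "0"
    else:
        # Integers gain a ".0" suffix in this canonical form.
        text += ".0"
    return text


def typed_equal(a, b):
    """Compare (lexical, datatype) pairs, as a non-lossy format allows."""
    (lex_a, dt_a), (lex_b, dt_b) = a, b
    if dt_a == dt_b == XSD_DECIMAL:
        # Same datatype: compare in the value space, so "5" equals "5.0".
        return Decimal(lex_a) == Decimal(lex_b)
    return (lex_a, dt_a) == (lex_b, dt_b)


def csv_equal(expected, actual):
    """CSV carries no datatype, so only the strings can be compared."""
    return expected == actual


# With the datatype attached, the canonicalised value still matches.
print(typed_equal(("5", XSD_DECIMAL), ("5.0", XSD_DECIMAL)))
# In CSV, the expected "5" no longer matches a canonicalised "5.0".
print(csv_equal("5", canonical_decimal("5")))
# "5.5" is unchanged by canonicalisation, so the CSV comparison holds.
print(csv_equal("5.5", canonical_decimal("5.5")))
```

Under these assumptions, a value such as "5.5" keeps the same lexical form before and after canonicalisation, which is why the proposed substitution sidesteps the issue for this test, while "5" does not.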

Received on Sunday, 7 August 2011 21:11:23 UTC