Re: Mulgara and sameTerm from Seaborne, Andy on 2008-07-30 (public-sparql-dev@w3.org from July to September 2008)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Wed, 30 Jul 2008 09:54:29 +0100
To: Paul Gearon <gearon@ieee.org>
Cc: "public-sparql-dev@w3.org" <public-sparql-dev@w3.org>, Arjohn Kampman <arjohn@aduna-software.com>, Andrae Muys <andrae@netymon.com>, James Leigh <james-nospam@leighnet.ca>
Message-ID: <48902C45.4050803@hp.com>
Paul Gearon wrote:
> Thanks Andy, this does clear up a number of things for me.
> 
> On Tue, Jul 29, 2008 at 11:33 AM, Seaborne, Andy
> <andy.seaborne@hp.com> wrote:
> <snip/>
> 
>     Most of the SPARQL filters require value space comparison.  The
>     definition of "=" allows extensibility by causing a type error if
>     two terms might be the same value but the processor does not know.
>     (Aside: two literals are definitely equal if they have the same
>     lexical form and same datatype, for any datatype, whether or not
>     anything else is known to the processor about it, because the
>     lexical-to-value-space mapping of the datatype is functional.)
> 
> 
> This reminds me... exactly what is meant by "type error" here? The first 
> time I worked on this I threw an exception, but obviously that wasn't a 
> good idea and I fixed it.  :-)  At the moment, a "type error" is 
> effectively the same as not equals, which works, but has me 
> uncomfortable since I'm ignoring the distinction. (Actually, I'm still 
> using the exception internally, but I catch it and continue as if there 
> was no match)

(This was an area of the spec that EricP did.)

SPARQL defines a 3-state logic for evaluation.  True, False and Error.

Errors propagate, so for almost everything "something(error)" is error. 
The exceptions are && and ||, so e.g. "true || error" is true, 
"error && false" is false, and not(false && error) is true.

Eric put the truth table in sec 11.2
http://www.w3.org/TR/rdf-sparql-query/#evaluation

At the top-most level, FILTER (..error..) excludes the solution tested, 
so error becomes false, if you like, as the 3-state logic collapses to a 
2-state logic.
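
A minimal sketch of the three-state evaluation described above, written in
Python. The names TV, logical_or, logical_and, logical_not and filter_keeps
are hypothetical and chosen only for illustration; they do not come from the
spec or from any SPARQL engine. The truth tables are assumed to follow sec
11.2, where an error is absorbed only when the other operand already decides
the result.

```python
from enum import Enum


class TV(Enum):
    """Three-state truth value (hypothetical name, for illustration only)."""
    TRUE = "true"
    FALSE = "false"
    ERROR = "error"


def logical_or(a: TV, b: TV) -> TV:
    # A true operand decides the result, even if the other side is an error.
    if a is TV.TRUE or b is TV.TRUE:
        return TV.TRUE
    # No true operand: any error makes the whole expression an error.
    if a is TV.ERROR or b is TV.ERROR:
        return TV.ERROR
    return TV.FALSE


def logical_and(a: TV, b: TV) -> TV:
    # A false operand decides the result, even if the other side is an error.
    if a is TV.FALSE or b is TV.FALSE:
        return TV.FALSE
    # No false operand: any error makes the whole expression an error.
    if a is TV.ERROR or b is TV.ERROR:
        return TV.ERROR
    return TV.TRUE


def logical_not(a: TV) -> TV:
    # Errors propagate through negation, as through most other functions.
    if a is TV.ERROR:
        return TV.ERROR
    return TV.FALSE if a is TV.TRUE else TV.TRUE


def filter_keeps(result: TV) -> bool:
    # At the top level the 3-state logic collapses to 2 states:
    # only TRUE keeps the solution; FALSE and ERROR both exclude it.
    return result is TV.TRUE
```

With these definitions, logical_or(TV.TRUE, TV.ERROR) gives TV.TRUE,
logical_and(TV.ERROR, TV.FALSE) gives TV.FALSE, and filter_keeps(TV.ERROR)
is False. An implementation that raises an exception internally for a type
error could map a caught exception to TV.ERROR rather than straight to "no
match", which keeps cases like not(error) distinct from not(false).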

> 
>     sameTerm works on the definition of equality from RDF Concepts so no
>     D-entailment. [B]  But SPARQL does not prescribe what is "in" the
>     store - there is a dataset that is queried.  Especially in the case
>     where the dataset comes from execution context (no FROM etc, no
>     protocol parameter), SPARQL says nothing about how that dataset came
>     to be.  It just is.  So if you load RDF that has "+1"^^xsd:int,
>     whether the store preserves the exact lexical form, or its
>     datatype, is a feature of the store.  SPARQL does not cover this
>     step.  If you load "+1"^^xsd:integer and "01"^^xsd:byte, it's a
>     store decision whether there are two terms or one, or whether what
>     is stored and returned is "1"^^xsd:integer which wasn't directly
>     mentioned (or even "1"^^xsd:decimal as the primitive XSD type that
>     they are all derived from).
> 
> 
> This was my understanding of how things work, though this implementation 
> decision for Mulgara was made by others. I'm glad to see that the 
> decision wasn't based on a misunderstanding. However, it *is* causing 
> problems for the test suite.... as you get to below.
> 
> <snip/>
> 
>     The test suite is a slightly different case: it is providing tests
>     for a specific set of choices.  The tests do label what the
>     assumptions are.  Some tests are labelled as making more than just
>     basic assumptions (e.g language tags).
> 
> 
> This is where we are coming unstuck. The tests are being treated as an 
> absolute, meaning that if we don't get exact correspondence in the 
> results we fail. Even if Mulgara is prepared to accept this, many 
> potential users are not. In our current scenario, Sesame is expecting 
> exact compliance with the tests as they are, and our current 
> architecture (storing values for known types, rather than lexical 
> representations) does not work here.

When the data is loaded from the manifest with qt:data etc, the file is 
assumed to be used as is.  Just basic simple entailment level - you have 
to start somewhere.  Tests are supposed to be annotated with mf:requires 
where they make assumptions.  Since adding triples is monotonic :-) 
presumably anyone can add more to the test suite manifests at any time! 
Seriously though - if they are not labelled appropriately, could you at 
least email the dawg-comments list with a list of the tests in question 
and maybe something can be done (errata?).  At least if it's on the 
mailing list then it's somewhere other implementers can see it and any 
future WG can pick it up.

 Andy

> 
> I guess our problem comes down to the test suite being treated as a de 
> facto part of the standard.
> 
> Paul

-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Wednesday, 30 July 2008 08:55:14 UTC