Re: Open world value tests from Eric Prud'hommeaux on 2006-10-17 (public-rdf-dawg@w3.org from October to December 2006)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Tue, 17 Oct 2006 12:42:21 +0200
To: "Seaborne, Andy" <andy.seaborne@hp.com>
Cc: 'RDF Data Access Working Group' <public-rdf-dawg@w3.org>
Message-ID: <20061017104220.GA5918@w3.org>
On Thu, Aug 24, 2006 at 09:45:33PM +0100, Seaborne, Andy wrote:
> 
> """
> ACTION AndyS:
> Write some tests for value testing (unknown types and extensibility) to add
> to 2006/JulSep0086
> """
> 
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0086
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0104
> 
> I added some tests and sorted out the existing ones.  They are split into
> sets so can be considered separately if desired.
> 
> http://www.w3.org/2001/sw/DataAccess/tests/data/OpenWorld/
> 
> Tests open-eq-01 to open-eq-06 are the original ones from 2006AprJun/0104,
> renamed.

I agree with these tests. These require no extension to SPARQL.
  | "open-eq-01"|"passed"| "001"^^xsd:integer !≘ "01"^^xsd:integer
  | "open-eq-02"|"passed"| "a"^^t:type1 ≘ "a"^^t:type1
  | "open-eq-03"|"passed"| 1 = 1, "01"^^xsd:integer = 1
  | "open-eq-04"|"passed"| 2 != 1, "02"^^xsd:integer != 1
  | "open-eq-05"|"passed"| "a"^^t:type1 = "a"^^t:type1
  | "open-eq-06"|"passed"| ∅ != "a"^^t:type1

I propose that the WG approve these tests.
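
The distinction these six tests draw between term equality and value
equality can be sketched as follows. This is only an illustration, not
the test harness or any implementation discussed here: the Literal
class, the function names, and the t:type1 IRI are hypothetical, and
xsd:integer stands in for all "known" datatypes.

```python
from dataclasses import dataclass
from typing import Optional

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
T_TYPE1 = "http://example.org/t#type1"  # hypothetical unknown datatype


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str


def same_term(a: Literal, b: Literal) -> bool:
    # Term equality: lexical form and datatype must match exactly.
    return a.lexical == b.lexical and a.datatype == b.datatype


def value_equal(a: Literal, b: Literal) -> Optional[bool]:
    # Value equality: True or False when decidable, None for a type error.
    if a.datatype == XSD_INTEGER and b.datatype == XSD_INTEGER:
        return int(a.lexical) == int(b.lexical)
    if same_term(a, b):
        # Identical terms denote the same value, even for unknown types.
        return True
    # Different terms with an unknown datatype: cannot say equal or unequal.
    return None
```

Under this sketch "a"^^t:type1 equals itself, but nothing can be shown
to be != "a"^^t:type1, which matches the empty result of open-eq-06.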

> Tests open-eq-07 to open-eq-10 work by taking a list of all possible term
> forms, forming the cross product and seeing which are value-equal and
> value-not-equal.  This is done for data which contains the same compared
> values and different but comparable values.  These tests are exhaustive and
> include literals with lang tags - because lang tags are not case sensitive 
> (nor is there a canonical form according to RFC3066) it seemed reasonable 
> to be able to equate "xyz"@EN with "xyz"@en. In effect, each lang tag defines 
> a separate value space - can't compare or test for equality across them, 
> but you can with the same language.
> 
> "abc"@en = "abc"@EN
> "xyz"@en > "abc"@en
> "xyz"@en > "abc"@EN

There is presently no language in SPARQL for case-insensitive language
tags. My implementation failed these tests both because it matches
language tags case-sensitively and because the tests employ extra
operators not currently in SPARQL.
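
A minimal sketch of the comparison the quoted proposal describes, where
each language tag (compared case-insensitively) defines its own value
space. The function name and the tuple representation are hypothetical,
and ordering by Python string comparison (code points) is an assumption;
neither the proposal nor SPARQL as specified fixes that ordering.

```python
from typing import Optional, Tuple

# (lexical form, language tag), e.g. ("xyz", "EN")
LangLiteral = Tuple[str, str]


def compare_lang_literals(a: LangLiteral, b: LangLiteral) -> Optional[int]:
    # Return -1, 0 or 1 within one language; None if the languages differ.
    if a[1].lower() != b[1].lower():
        # Different value spaces: neither equality nor ordering is defined.
        return None
    # Assumed ordering: plain code point comparison of the lexical forms.
    return (a[0] > b[0]) - (a[0] < b[0])
```

With this, ("abc", "en") and ("abc", "EN") compare as equal, while
("abc", "en") and ("abc", "fr") are not comparable at all.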

open-eq-07:
  A = B   xsd:string     simple literal  op:numeric-equal(fn:compare(STR(A), STR(B)), 0)

Mostly they lean on the (apparent) fact that in ARQ, all known types
can be compared to other known types, as one can do with
sameTerm. I'll pursue this in a later mail.


> Tests open-eq-11.rq and open-eq-12.rq look for pairs of terms that can be
> compared and pairs that can't.
> 
> Tests open-cmp-01, open-cmp-02 look for pairs that can be compared, not 
> just test for equality.  open-cmp-02 uses "<=" which can be implemented so 
> that things can pass <= even if they are not comparable because they are 
> unknown but pass equality test by sameTerm => sameValue.
> 
> These tests work on pairs of literals in the data - if open-eq-07/10 are too
> coarse grained, they could be broken up into this style.  I just wrote the
> paired data later, but left the exhaustive approach of the earlier tests.
> 
> (sameTerm seems a slightly better name than sameNode because "RDF term" is
> phrase used in SPARQL and because predicates aren't nodes in the 
> graph-sense).
> 
> No sameTerm tests yet because they can't be written yet in pure SPARQL.
> 
> [[
> aside: I realised that testing for plain literals vs xsd:string can be done
> even if datatype("plain string") is xsd:string by a different route:
> 
>    sameTerm(?x, str(?x))
> 
> and this is the defining characteristic of plain literals.
> ]]
> 
> Tests date-1 to date-4 are some tests involving an extension type (xsd:date)
> including sorting.

Of these, I only passed date-4, which does not require any extension
to the SPARQL language as specified.
date-1:
  passed: "2006-08-23"^^xsd:date       = "2006-08-23"^^xsd:date
  failed: "2006-08-23Z"^^xsd:date      = "2006-08-23"^^xsd:date
  failed: "2006-08-23+00:00"^^xsd:date = "2006-08-23"^^xsd:date

date-2:
  failed: "2006-08-23"^^xsd:date       != "2006-08-23"^^xsd:date
  failed: "2006-08-23Z"^^xsd:date      != "2006-08-23"^^xsd:date
  failed: "2006-08-23+00:00"^^xsd:date != "2006-08-23"^^xsd:date

date-3:
  failed: "2006-08-23"^^xsd:date       > "2006-08-22"^^xsd:date
  failed: "2006-08-23Z"^^xsd:date      > "2006-08-22"^^xsd:date
  failed: "2006-08-23+00:00"^^xsd:date > "2006-08-22"^^xsd:date

date-4:
  passed: datatype("2006-08-23"^^xsd:date ) = xsd:date
  passed: datatype("2006-08-24Z"^^xsd:date) = xsd:date
  passed: datatype("2000-01-01"^^xsd:date ) = xsd:date
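
For reference, here is a sketch of what an implementation extended with
xsd:date might do for date-1 to date-3. It is not what any
implementation mentioned here does. It assumes the XPath Functions and
Operators approach of comparing the starting instants of the two dates,
and it assumes an implicit timezone of UTC for dates that carry none.
The function name and the regular expression are illustrative only.

```python
import re
from datetime import datetime, timedelta, timezone

# Lexical form: YYYY-MM-DD with an optional Z or +hh:mm / -hh:mm suffix.
_DATE_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})(Z|[+-]\d{2}:\d{2})?$")


def starting_instant(lexical: str, implicit_tz: timezone = timezone.utc) -> datetime:
    # Map an xsd:date lexical form to the instant at which that date starts.
    m = _DATE_RE.match(lexical)
    if m is None:
        raise ValueError("not a supported xsd:date lexical form: " + lexical)
    year, month, day, tz = m.groups()
    if tz is None:
        tzinfo = implicit_tz  # assumption: implicit timezone is UTC
    elif tz == "Z":
        tzinfo = timezone.utc
    else:
        sign = 1 if tz[0] == "+" else -1
        offset = timedelta(hours=int(tz[1:3]), minutes=int(tz[4:6]))
        tzinfo = timezone(sign * offset)
    return datetime(int(year), int(month), int(day), tzinfo=tzinfo)
```

Two dates are then value-equal when their starting instants are equal,
and ordered as their starting instants are ordered. With a different
implicit timezone, "2006-08-23Z" and "2006-08-23" would no longer
compare equal.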

These tests bring up the issue of how we test extensions. I think all
we can do is test the bare language, writing tests that appear to
label an extended implementation as "failed".
-- 
-eric

home-office: +1.617.395.1213 (usually 900-2300 CET)
     +33.1.45.35.62.14
cell:       +33.6.73.84.87.26

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Tuesday, 17 October 2006 10:41:27 UTC