Pyrrho results and open-eq-12 (was: Re: DAWG Agenda - 11 Sep 2007 @ 14:30 UTC) from Seaborne, Andy on 2007-09-10 (public-rdf-dawg@w3.org from July to September 2007)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 10 Sep 2007 14:13:43 +0100
To: Lee Feigenbaum <lee@thefigtrees.net>
CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>, Malcolm Crowe <CROW-CI0@wpmail.paisley.ac.uk>
Message-ID: <46E54307.4020602@hp.com>
cc'ed to Malcolm

Lee Feigenbaum wrote:
...
> 3b) Malcolm Crowe suggests that open-eq-12 is incorrect. Can anyone take 
> a look at it and agree or refute?
> 
>  
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2007Sep/0001.html

== Summary

The test looks right: it does assume some level of language tag handling, but 
it is in the open-world test suite.

1/ Pyrrhodb seems not to evaluate the IRI :xyz as equal or not equal to all the 
possible values in the test but, by RDFterm-equal, an IRI term is either equal 
or not equal, never an error.

2/ Pyrrhodb seems to not fully handle language tags (which is an optional 
extra for SPARQL).

== Details

open-eq-12 is the following:

------------------------------------

@prefix     : <http://example/> .
@prefix  xsd:    <http://www.w3.org/2001/XMLSchema#> .

:x1 :p "xyz" .
:x2 :p "xyz"@en .
:x3 :p "xyz"@EN .
:x4 :p "xyz"^^xsd:string .
:x5 :p "xyz"^^xsd:integer .
:x6 :p "xyz"^^:unknown .
:x7 :p _:xyz .
:x8 :p :xyz .

:y1 :q "abc" .
:y2 :q "abc"@en .
:y3 :q "abc"@EN .
:y4 :q "abc"^^xsd:string .
:y5 :q "abc"^^xsd:integer .
:y6 :q "abc"^^:unknown .
:y7 :q _:abc .
:y8 :q :abc .

------------------------------------

PREFIX     :    <http://example/>
PREFIX  xsd:    <http://www.w3.org/2001/XMLSchema#>

SELECT ?x ?v1 ?y ?v2
{
     ?x :p ?v1 .
     ?y :p ?v2 .
     OPTIONAL { ?y :p ?v3 . FILTER( ?v1 != ?v3 || ?v1 = ?v3 )}
     FILTER (!bound(?v3))
}

------------------------------------

DAWG results.
-------------------------------------------------------
| x   | v1                 | y   | v2                 |
=======================================================
| :x6 | "xyz"^^:unknown    | :x5 | "xyz"^^xsd:integer |
| :x6 | "xyz"^^:unknown    | :x4 | "xyz"^^xsd:string  |
| :x6 | "xyz"^^:unknown    | :x1 | "xyz"              |
| :x5 | "xyz"^^xsd:integer | :x6 | "xyz"^^:unknown    |
| :x5 | "xyz"^^xsd:integer | :x4 | "xyz"^^xsd:string  |
| :x5 | "xyz"^^xsd:integer | :x1 | "xyz"              |
| :x4 | "xyz"^^xsd:string  | :x6 | "xyz"^^:unknown    |
| :x4 | "xyz"^^xsd:string  | :x5 | "xyz"^^xsd:integer |
| :x1 | "xyz"              | :x6 | "xyz"^^:unknown    |
| :x1 | "xyz"              | :x5 | "xyz"^^xsd:integer |
-------------------------------------------------------

------------------------------------


In this test, ?v2 is the same as ?v3 on this data.
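
The shape of the query can be sketched in a few lines of Python.  This is only 
an illustration, assuming SPARQL's rule that || gives an error when neither 
operand is true and at least one is an error; ERR and the function names are 
made up for the sketch.  The inner FILTER is true whenever ?v1 = ?v3 evaluates 
at all, so the outer !bound(?v3) keeps exactly the pairs whose comparison is 
an error.

```python
# Sentinel standing in for a SPARQL evaluation error (hypothetical name).
ERR = "error"

def sparql_or(a, b):
    """Three-valued ||: true wins over error, error wins over false."""
    if a is True or b is True:
        return True
    if a == ERR or b == ERR:
        return ERR
    return False

def sparql_not(a):
    """Negation propagates errors."""
    return ERR if a == ERR else (not a)

def inner_filter(eq):
    """FILTER( ?v1 != ?v3 || ?v1 = ?v3 ), given the outcome of ?v1 = ?v3."""
    return sparql_or(sparql_not(eq), eq)

def row_survives(eq):
    """A row is kept iff the OPTIONAL leaves ?v3 unbound, i.e. the inner
    filter did not evaluate to true."""
    return inner_filter(eq) is not True

for outcome in (True, False, ERR):
    print(outcome, "->", "kept" if row_survives(outcome) else "removed")
```

Only the error outcome is reported as kept; true and false are both removed.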

Pyrrho has 30 results, DAWG has 10.  The results given by Pyrrho that are not 
in the DAWG test suite answers are for

?v1 =
   :xyz
   "xyz"@en

against all the possible ?v2/?v3.

All possible ?v3 are:

"xyz" .
"xyz"@en
"xyz"@EN .
"xyz"^^xsd:string .
"xyz"^^xsd:integer .
"xyz"^^:unknown .
_:xyz .
:xyz .

So it's 2 * 8, less self and the bNode, = 2*6.
Then *2 to reverse ?v1 and ?v3, but removing the double counting where ?v3 is 
one of the ?v1 above, gets 2*2*6 - 4 = 20 different results.

The test is determining when comparing two terms is an error by RDFterm-equal.


1/ Case ?v1 = :xyz an IRI.

An IRI is never equal to anything that is not an IRI, because the operator 
table will map the operation to RDFterm-equal(A, B), which becomes sameTerm 
when one side is an IRI.  So RDFterm-equal(some IRI, B) is always true or false.

?v1 = :xyz causes the filter to be true for all ?v3, hence the optional is 
defined and the !bound removes the solution.

So I think Pyrrho is wrong - :xyz should not appear in the results.
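
As a rough illustration of the IRI case (a sketch only: the Term tuple and 
same_term are a made-up minimal model, not any real RDF library), comparing 
:xyz by term identity against every ?v3 in the data always gives a boolean:

```python
from typing import NamedTuple, Optional

class Term(NamedTuple):
    """Illustrative RDF term; kind is 'iri', 'bnode' or 'literal'."""
    kind: str
    lex: str
    lang: Optional[str] = None
    datatype: Optional[str] = None

EX = "http://example/"
XSD = "http://www.w3.org/2001/XMLSchema#"

def same_term(a, b):
    """Term identity: what '=' reduces to when one operand is an IRI."""
    return a == b

xyz_iri = Term("iri", EX + "xyz")

# The eight possible ?v3 values from the test data.
candidates = [
    Term("literal", "xyz"),
    Term("literal", "xyz", lang="en"),
    Term("literal", "xyz", lang="EN"),
    Term("literal", "xyz", datatype=XSD + "string"),
    Term("literal", "xyz", datatype=XSD + "integer"),
    Term("literal", "xyz", datatype=EX + "unknown"),
    Term("bnode", "xyz"),
    Term("iri", EX + "xyz"),
]

outcomes = [same_term(xyz_iri, v3) for v3 in candidates]
print(outcomes)  # all booleans; only the comparison with :xyz itself is True
```

No comparison raises, so the inner filter is true for every ?v3 and no row 
with ?v1 = :xyz can survive the !bound.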

That removes all the :x8 results in the Pyrrho example and one each of the 
:x1-:x6.  30-12 = 18.


2/ Case ?v1 = "xyz"@en, an RDF plain literal with a language tag.

This is the remaining 8 (4 are an overlap with the case above).

Literals with language tags are not dispatched by the operator table
until you get to the RDFterm-equal operator.

RDFterm-equal:
[[
Returns TRUE if term1 and term2 are the same RDF term as defined in Resource 
Description Framework (RDF): Concepts and Abstract Syntax [CONCEPTS]; produces 
a type error if the arguments are both literal but are not the same RDF term 
[*]; returns FALSE otherwise. term1 and term2 are the same if any of the 
following is true:

     * term1 and term2 are equivalent IRIs as defined in 6.4 RDF URI 
References of [CONCEPTS].
     * term1 and term2 are equivalent literals as defined in 6.5.1 Literal 
Equality of [CONCEPTS].
     * term1 and term2 are the same blank node as described in 6.6 Blank Nodes 
of [CONCEPTS].
]]

Take one case ?v1 = "xyz"@en , ?v2 = "xyz"^^:unknown.

In RDF, a literal with a language tag can't be equal (same value) to a literal 
without one (simple literal or datatype literal), but this is not required by 
rq25.  Applying the text:

a/ Not the same term so skip over the first part.
b/ type error

Hence, with no understanding of the fact that literals with language tags are 
distinct from simple literals, plain literals with a different language tag, 
typed literals and non-literals, it is an eval error and the test results 
would be different.  These are the open-world tests, though, and the results 
are correct when language tags are understood; other tests in this suite also 
require language tag understanding.
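
The two readings can be put side by side in a small Python sketch.  It is an 
illustration under stated assumptions, not the spec's algorithm: Term, 
SparqlTypeError and both function names are made up, and the language-aware 
variant assumes tags compare case-insensitively.

```python
from typing import NamedTuple, Optional

class Term(NamedTuple):
    """Illustrative RDF term; kind is 'iri', 'bnode' or 'literal'."""
    kind: str
    lex: str
    lang: Optional[str] = None
    datatype: Optional[str] = None

class SparqlTypeError(Exception):
    """Stands in for the type error raised by RDFterm-equal."""

def literal_reading(a, b):
    """The quoted text applied as written, with no language tag knowledge."""
    if a == b:
        return True                      # a/ same RDF term
    if a.kind == "literal" and b.kind == "literal":
        raise SparqlTypeError((a, b))    # b/ both literals, not the same term
    return False

def lang_aware_reading(a, b):
    """Same, but a language-tagged literal is known to differ from any
    literal that does not share its tag, lexical form and datatype."""
    both_literals = a.kind == "literal" and b.kind == "literal"
    if both_literals and (a.lang or b.lang):
        return (a.lex == b.lex and a.datatype == b.datatype
                and (a.lang or "").lower() == (b.lang or "").lower())
    return literal_reading(a, b)

xyz_en = Term("literal", "xyz", lang="en")
xyz_unknown = Term("literal", "xyz", datatype="http://example/unknown")

try:
    print(literal_reading(xyz_en, xyz_unknown))
except SparqlTypeError:
    print("literal reading: type error, so the row stays in the results")

print("language-aware reading:", lang_aware_reading(xyz_en, xyz_unknown))
```

Under the first reading the pair is an error and shows up in the results; 
under the second it is simply false and the !bound removes it.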


The footnote text only discusses typed literals; it should also cover language 
tags, to make it complete from where it's mentioned.

[[
Invoking RDFterm-equal on two typed literals tests for equivalent values. An 
extended implementation may have support for additional datatypes. An 
implementation processing a query that tests for equivalence on unsupported 
datatypes (and non-identical lexical form and datatype IRI) returns an error, 
indicating that it was unable to determine whether or not the values are 
equivalent.
]]

The test is right for some level of lang tag understanding.  This should be 
marked in the manifest.

This removes the ?x = :x2 and :x3 results (6) and 6 others, but those involving 
:xyz are counted twice, so it's 2*6-4 = 8, bringing the number to 10 as for DAWG.
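
As a cross-check on the figure of 10, here is a small Python model of the 
query over the :p data.  It is a sketch under assumptions, not a SPARQL 
engine: Term, ERR and equals are made up for the illustration, language tags 
are compared case-insensitively, and a simple literal is treated as comparable 
with an xsd:string literal, which is what the absence of :x1/:x4 rows in the 
DAWG table implies.

```python
from typing import NamedTuple, Optional

class Term(NamedTuple):
    """Illustrative RDF term; kind is 'iri', 'bnode' or 'literal'."""
    kind: str
    lex: str
    lang: Optional[str] = None
    datatype: Optional[str] = None

EX = "http://example/"
XSD = "http://www.w3.org/2001/XMLSchema#"
ERR = "error"  # stands in for a SPARQL type error

DATA = {
    "x1": Term("literal", "xyz"),
    "x2": Term("literal", "xyz", lang="en"),
    "x3": Term("literal", "xyz", lang="EN"),
    "x4": Term("literal", "xyz", datatype=XSD + "string"),
    "x5": Term("literal", "xyz", datatype=XSD + "integer"),
    "x6": Term("literal", "xyz", datatype=EX + "unknown"),
    "x7": Term("bnode", "xyz"),
    "x8": Term("iri", EX + "xyz"),
}

def equals(a, b):
    """Outcome of ?a = ?b: True, False or ERR."""
    if a == b:
        return True
    if a.kind != "literal" or b.kind != "literal":
        return False              # IRIs and bNodes: term identity only
    if a.lang or b.lang:
        # Language tags understood: never an error.
        return (a.lex == b.lex and a.datatype == b.datatype
                and (a.lang or "").lower() == (b.lang or "").lower())
    string_like = (None, XSD + "string")
    if a.datatype in string_like and b.datatype in string_like:
        return a.lex == b.lex     # assumption: simple vs xsd:string compares
    return ERR                    # ill-typed or unsupported datatype

# ?v3 is the same as ?v2 on this data, so a row survives the !bound
# exactly when comparing ?v1 with ?v2 is an error.
rows = [(x, y) for x, v1 in DATA.items()
               for y, v2 in DATA.items()
               if equals(v1, v2) == ERR]

for x, y in rows:
    print(":" + x, ":" + y)
print(len(rows))  # 10
```

The surviving pairs are exactly the combinations of :x1, :x4, :x5 and :x6 
shown in the DAWG table.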

- - - - - - - -

Eric - please check my analysis!

 Andy
Received on Monday, 10 September 2007 13:14:06 UTC