Re: test suite changes (ACTION-291)

On Sep 12, 2013, at 1:39 PM, Antoine Zimmermann wrote:

> Thanks for the comments.
> 
> Quick response below.
> 
> 
> Le 12/09/2013 20:26, Pat Hayes a écrit :
>> Antoine, some quick hasty comments in-line. More careful response later today.
>> 
>> Pat (in haste)
>> 
>> On Sep 12, 2013, at 8:50 AM, Antoine Zimmermann wrote:
>> 
>>> Peter has done most of the work I was supposed to do (ACTION-292: Review the previous semantics test-suite)
>>> 
>>> I agree with his changes.
>>> 
>>> I have some more comments, and I propose a few more tests. Peter and Pat, please take a look at these to check that I have not made mistakes. There are two questions where I'm not sure whether the graphs are satisfiable. If the tests are accepted, I'll provide N-Triples files for the premises and conclusions.
>>> 
>>> Comments on test cases:
>>> ======================
>>> I wonder whether the negative entailment test:
>>> 
>>> datatypes-intensional/test001.nt FALSE
>>> 
>>> is actually correct as it stands. It would be interesting to have the negative test where the premise is the empty graph and the conclusion is what's currently in the premise, that is:
>>> 
>>> rdfms-seq-representation/empty.nt datatypes-intensional/test001.nt
>>> 
>>> The negative test where the premise file is "FALSE" and conclusion file has the triple "rdf:type rdf:type rdf:type" should be inverted (premise is the file with the triple, conclusion is FALSE).
>>> 
>>> The last two tests are exactly the same as two previously mentioned tests.
>>> 
>>> 
>>> Additional entailment tests (in a format convenient for email):
>>> ==============================================================
>>> 
>>> The following RDF graph is FALSE in {xsd:string} entailment:
>>> ex:a  ex:p  "\0000" .
>> 
>> Really? Does \-escaping work inside quotes?
> 
> Sorry, it should have been "\u0000" . The syntax I use in this email is Turtle, but ultimately, all the test cases should be in N-Triples. N-Triples allows Unicode escapes of the form \uXXXX
> See http://www.w3.org/TR/n-triples/#grammar-production-UCHAR

OK. It is interesting that the same literal is not a legal string but is OK in HTML. Ah, the joys of standardization. I feel for the poor programmer who has to write a syntax checker for RDFa embedded in HTML talking about XML with embedded HTML.
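
To make the point concrete, here is a minimal Python sketch (not part of the test suite; the function names are made up for illustration). It decodes N-Triples \uXXXX and \UXXXXXXXX escapes and checks each character against the XML Char production, on which the xsd:string lexical space is based. It deliberately ignores the other N-Triples escapes (\n, \", and so on).

```python
import re

# N-Triples UCHAR: \uXXXX or \UXXXXXXXX (hex digits).
_UCHAR = re.compile(r'\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})')

def decode_uchar(lexical):
    """Replace UCHAR escapes with the characters they denote."""
    def repl(match):
        return chr(int(match.group(1) or match.group(2), 16))
    return _UCHAR.sub(repl, lexical)

def is_xml_char(ch):
    """XML 1.0 Char production: #x9 | #xA | #xD | [#x20-#xD7FF]
    | [#xE000-#xFFFD] | [#x10000-#x10FFFF]."""
    cp = ord(ch)
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def is_valid_xsd_string(value):
    """True if every character is an XML Char (hypothetical helper)."""
    return all(is_xml_char(ch) for ch in value)

decoded = decode_uchar(r"\u0000")
print(is_valid_xsd_string(decoded))   # U+0000 is not an XML Char
```

No comparable check is needed for rdf:HTML, whose lexical space is the set of all Unicode strings, which is why the same literal is ill-typed as xsd:string but fine as rdf:HTML.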

>>> Negative entailment:
>>> ex:a  ex:p  "\0000"^^rdf:HTML .
>>> is not FALSE in RDF recognizing rdf:HTML.
>>> 
>>> The following RDF graph:
>>> ex:a  rdf:type  rdf:langString .
>>> ex:a  rdf:type  xsd:string .
>>> is FALSE in RDF entailment.
>>> 
>>> The following RDF graph:
>>> rdf:langString  rdfs:subClassOf  xsd:string .
>>> is FALSE in RDFS entailment.
>>> 
>>> Negative entailment:
>>> ex:p  rdfs:range  xsd:integer .
>>> ex:a  ex:p  "abc"^^ex:dt .
>>> is not FALSE in RDFS recognizing xsd:integer (but not recognizing ex:dt).
>> 
>> This seems to be a negative negative entailment test, i.e., a consistency test case. Do we really want to go there?
> 
> The test is a negative entailment where the premise is the graph given above, and the conclusion is "FALSE". There are tests like this in the 2004 test cases.

Fair enough. 

> 
>>> 
>>> Negative entailment:
>>> ex:a  ex:p  "abc"^^ex:dt .
>>> does not entail:
>>> ex:a  ex:p  _:x .
>>> _:x  rdf:type  ex:dt .
>>> in RDFS where ex:dt is not recognized.
>> 
>> This is wrong. This entailment always holds. If "abc"^^ex:dt does not denote, then the premise is false so the entailment is trivial.
> 
> If ex:dt is not recognized, then "abc"^^ex:dt may denote an unknown resource.

Oh yes, of course. Sorry, I was focussing on the _:x rather than the type.

This is something of an anomaly, in fact, because this entailment is valid when ex:dt is recognized, for *any* ex:dt. So it would seem to be a 'safe' entailment whether the datatype is recognized or not. Hmmm. It would be fairly easy to fix this in section 7, but I guess it is too late to fix it now. 

> This is given by the condition on simple entailment, and conditions on D-, RDF-, and RDFS-entailment do not constrain the denotation of literals when the datatype is not recognized.
> 
> See relevant sections 5 to 9 of RDF 1.1 Semantics.
> 
>> 
>>> In RDF recognizing xsd:string, the empty graph entails:
>>> _:x  rdf:type  xsd:string .
>> 
>> and similarly for other non-empty datatypes. We require xsd:string and rdf:langString to be recognized in basic RDF.
> 
> True. I don't think we need a test case for each recognized datatype, though.

agreed.

>>> 
>>> The empty graph RDFS-entails:
>>> ex:a  rdf:type  rdfs:Resource .
>> 
>> That is an RDFS axiom, do we need it to be a test case?
> 
> Maybe, because it was not an axiom in RDF 2004.

Nice point. Yes, let's keep it in.

>>> The following RDF graph:
>>> ex:a  ex:p  "a"@en
>>> RDF-entails:
>>> ex:a  ex:p  _:x .
>>> _:x  rdf:type  rdf:langString .
>> 
>> and again, similarly for other datatypes.
> 
> True, but again, this is different from RDF 2004, where literals with a language tag did not have a datatype. Safety checks of this kind ensure that 2004 implementations that have not been updated do not pass the tests.

OK, I see your motivation for including them now. 
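
For what it is worth, the pattern behind these tests can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Literal type, the function names and the blank node labels are invented here, triples are plain tuples, and the sketch does not check whether a literal is ill-typed. For every literal whose datatype is recognized, it emits the generalized triple with a fresh blank node plus the typing triple; literals with unrecognized datatypes yield nothing.

```python
from typing import NamedTuple, Optional

class Literal(NamedTuple):
    """Hypothetical literal representation for this sketch."""
    lexical: str
    lang: Optional[str] = None
    datatype: str = "xsd:string"

def datatype_of(lit):
    # In RDF 1.1 a language-tagged literal has datatype rdf:langString.
    return "rdf:langString" if lit.lang is not None else lit.datatype

def literal_typing_triples(triples,
                           recognized=frozenset({"rdf:langString",
                                                 "xsd:string"})):
    """Return the typing consequences for literals with recognized datatypes."""
    bnodes = {}      # one blank node per distinct literal
    inferred = []
    for s, p, o in triples:
        if not isinstance(o, Literal):
            continue
        dt = datatype_of(o)
        if dt not in recognized:
            continue  # unrecognized datatype: nothing is inferred
        b = bnodes.setdefault(o, f"_:l{len(bnodes)}")
        inferred.append((s, p, b))
        inferred.append((b, "rdf:type", dt))
    return inferred

graph = [("ex:a", "ex:p", Literal("a", lang="en"))]
for triple in literal_typing_triples(graph):
    print(triple)
```

An engine that still follows the 2004 treatment of language-tagged literals would have no datatype to report in datatype_of and so would fail this kind of test.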

>>> The following RDF graph:
>>> ex:p  rdfs:subPropertyOf  _:x .
>>> _:x  rdfs:range  ex:x .
>>> ex:a  ex:p  ex:b .
>>> RDFS-entails:
>>> ex:b  rdf:type  ex:x .
>> 
>> Again, this follows directly from the RDFS entailment patterns. Do we need this in test cases?
> 
> Maybe this one is not necessary, but it is to be sure that entailment engines also take into account properties that are blank nodes. Note that the old entailment rules in RDF 2004 were not considering this case and therefore these rules were not complete.

Again, I see the reasoning and agree it should be included. 
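
A minimal Python sketch of why the blank-node property matters, assuming triples are plain tuples of strings and implementing only two patterns (subproperty propagation and range typing), so it is nowhere near a complete RDFS engine. The function name is invented for illustration. The key point is that the intermediate triple has a blank node in predicate position, so an engine that refuses to hold such generalized triples never reaches the conclusion.

```python
SUBPROP = "rdfs:subPropertyOf"
RANGE = "rdfs:range"
TYPE = "rdf:type"

def rdfs_closure(triples):
    """Apply two RDFS patterns to a fixpoint over generalized triples."""
    closure = set(triples)
    while True:
        new = set()
        # Subproperty: (p subPropertyOf q), (s p o)  =>  (s q o)
        for (p, rel, q) in closure:
            if rel == SUBPROP:
                for (s, pred, o) in closure:
                    if pred == p:
                        new.add((s, q, o))
        # Range: (p range c), (s p o)  =>  (o rdf:type c)
        for (p, rel, c) in closure:
            if rel == RANGE:
                for (s, pred, o) in closure:
                    if pred == p:
                        new.add((o, TYPE, c))
        if new <= closure:
            return closure
        closure |= new

graph = {
    ("ex:p", SUBPROP, "_:x"),
    ("_:x", RANGE, "ex:x"),
    ("ex:a", "ex:p", "ex:b"),
}
closure = rdfs_closure(graph)
print(("ex:a", "_:x", "ex:b") in closure)    # intermediate generalized triple
print(("ex:b", TYPE, "ex:x") in closure)     # the conclusion of the test
```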

>>> 
>>> The following RDF graph is FALSE in RDFS recognizing xsd:string and xsd:integer:
>>> rdf:type  rdfs:range  xsd:integer .
>> 
>> Because xsd:string is not in the class xsd:integer? True. If we replaced xsd:integer by rdf:langString then we would have a similar contradiction in simple RDF.
> 
> You're right. Would you prefer a test case with the RDF entailment regime instead?

I think it would illustrate the same point more sharply.

>>> The following RDF graph:
>>> rdfs:Resource  rdfs:subClassOf  "a" .
>>> RDFS-entails:
>>> ex:a  rdf:type  "a" .
>> 
>> Why is this an interesting test case? The literal plays no special role here and the use of "a" with ex:a could be misleading.
> 
> Ok, maybe it is not particularly useful.
> 
> 
>> 
>>> 
>>> The following RDF graph is FALSE in RDFS recognizing {xsd:nonNegativeInteger,xsd:nonPositiveInteger}:
>>> rdf:Property  rdfs:subClassOf  xsd:nonNegativeInteger .
>>> rdf:Property  rdfs:subClassOf  xsd:nonPositiveInteger .
>> 
>> Why? It has the odd consequence that zero is the only property in the universe, but is this an actual inconsistency?
> 
> It is. It means that all properties are the number zero. So rdfs:subPropertyOf, rdf:type, etc are all number zero. So they are all interchangeable. So everything that has a type (that is, all resources) is also a subproperty of something, and therefore is a property. If something is a property, it is number zero. But number 1 is not number zero. Inconsistency.

And 1 has to be in the universe because it is the value of "+1"^^xsd:nonNegativeInteger. Yes, OK. 

But I wonder if we want this to be a test case. I don't think we should require RDF engines to be able to do complete reasoning on datatype entailment, even when they recognize a datatype. This is a *very* convoluted piece of inference. I don't know of any rule set that would detect it: it would have to be at least as powerful as an OWL-DL engine, able to reason about class cardinalities and identity and so on. I would prefer to avoid test cases which depend upon idiosyncrasies of datatypes, and which detect obscure consequences of 'silly' assertions like asserting subclass conditions on the RDFS vocabulary itself.

>>> 
>>> Is the following triple satisfiable in RDFS recognizing xsd:boolean?
>>> rdf:Property  rdfs:subClassOf  xsd:boolean .
>> 
>> Again, I don't see why not (at a quick glance, anyway.)
> 
> This one, at a quick glance, I could not tell.

I tried a little harder and couldn't make it be inconsistent or find a model. I suggest we just quietly forget about this one. Entailments that depend on cardinalities of datatype classes are kind of beside the point in any case, seems to me. Nobody, and I really do mean nobody, is ever going to use these in a real-life entailment situation. 

>>> The following RDF graph is FALSE in RDFS recognizing {xsd:nonNegativeInteger,xsd:nonPositiveInteger}:
>>> rdf:type  rdfs:range  xsd:nonNegativeInteger .
>>> rdf:type  rdfs:range  xsd:nonPositiveInteger .
>> 
>> Again, why? The conclusion from this seems to be that zero is the only class, which is strange but not inconsistent (I think). But in any case, what is the intended point being made by this example?
> 
> Same reasoning as before. Everything becomes a class, and there is only one class. So there is only one resource, which contradicts the existence of multiple numbers.

Yes. I had forgotten that the D-conditions mean that those numbers have to be in the universe. And in any case, RDF has all the strings in its universe. I would still prefer to omit this from the test cases, however. 

>>> 
>>> Is the following triple satisfiable in RDFS recognizing xsd:boolean?
>>> rdf:type  rdfs:range  xsd:boolean .
>> 
>> Sure, why not?
> 
> I'm missing the proof too.

Well, there are at most two classes. There are the rdfs named ones: Class, Resource, Literal, Datatype, Property and ContainerMembershipProperty, and of course xsd:boolean itself, so quite a lot of these must be identical. But that leaves a lot of possibilities open. It's quite consistent for Resource, Class and Property to be the same (that is the Common Logic assumption, for example) and I think it's consistent to say that everything is a literal value. So we could make them all be T, say, and then F can be Datatype and CMProperty. Strange, but I think consistent.

But as with the earlier cases, I think we should just not bother with strange datatype-idiosyncratic cases like this in the entailment tests. 

Pat

> 
> 
> AZ
> 
> 
>>> 
>>> The following RDF graph:
>>> ex:a  rdf:type  xsd:nonNegativeInteger .
>>> ex:a  rdf:type  xsd:nonPositiveInteger .
>>> ex:b  rdf:type  xsd:nonNegativeInteger .
>>> ex:b  rdf:type  xsd:nonPositiveInteger .
>>> ex:a  ex:p  ex:c .
>>> entails in RDF recognizing {xsd:nonNegativeInteger,xsd:nonPositiveInteger}:
>>> ex:b  ex:p  ex:c .
>>> 
>>> 
>>> 
>>> AZ
>>> 
>>> Le 12/09/2013 09:21, Pat Hayes a écrit :
>>>> 
>>>> On Sep 11, 2013, at 7:54 PM, Peter Patel-Schneider wrote:
>>>> 
>>>>> Changes required in RDF test suite to handle the changes in RDF 1.1
>>>>> 
>>>>> I actually went through Semantics to look for changes and then through the test suite to look for impacted tests.   I believe that the following more than covers what I signed up for in ACTION-291, and leave it to others to do the bit-twiddling required to effect these changes.
>>>> 
>>>> I could try that twiddling if I knew how to access the test suite. Is it in mercurial somewhere?
>>>> 
>>>> Pat
>>>> 
>>>>> 
>>>>> peter
>>>>> 
>>>>> 
>>>>> 
>>>>> Areas of changes along with their handling
>>>>> - new handling of invalid literals - 1/, 2/, 3/, 4/
>>>>> - new datatypes - rdf:langString - 6a/ 6b/
>>>>>               - rdf:HTML - 6c/
>>>>> - "changes" to datatypes - xsd:string - 7/, 8/
>>>>> - entailment regimes - 5/ plus changes just below
>>>>> - RDF datasets - 9/, 10/
>>>>> 
>>>>> 
>>>>> Entailment regime changes (systematic)
>>>>> - change rules to regimes as follows
>>>>>   -> simple entailment
>>>>>   RDF -> RDF entailment
>>>>>   RDF + RDFS -> RDFS entailment
>>>>>   RDF + D(xsd:string)  -> RDF entailment
>>>>>   RDF + D(...) -> RDF entailment recognizing {rdf:langString,xsd:string,...}
>>>>>   RDF + RDFS + D(xsd:string)  -> RDFS entailment
>>>>>   RDF + RDFS + D(...) -> RDFS entailment recognizing {rdf:langString,xsd:string,...}
>>>>> 
>>>>> 
>>>>> Required test changes
>>>>> 
>>>>> 1/ datatypes-test002.nt datatypes/test002.nt CHANGE CONCLUSION TO FALSE
>>>>> 2/ datatypes-test002.nt datatypes-test002b.nt REMOVE
>>>>> 3/ xmlsch-02/test002.rdf xmlsch-02/test001.rdf NOW A POSITIVE TEST
>>>>> 4/ xmlsch-02/test002.rdf xmlsch-02/test003.rdf NOW A POSITIVE TEST
>>>>> 5/ rdfs-entailment/test001.rdf FALSE NOW RDFS entailment recognizing rdf:XMLLiteral
>>>>> 
>>>>> 
>>>>> Proposed test changes
>>>>> 
>>>>> 6a/ Add positive parsing test for valid rdf:langString
>>>>> 6b/ Add negative parsing test for invalid rdf:langString
>>>>> 6c/ Add positive parsing test for rdf:HTML
>>>>> 7/ Add positive RDF entailment entailing FALSE
>>>>>    ex:foo ex:bar "\0000"^^xsd:string
>>>>> 8/ Add positive RDF entailment entailing FALSE
>>>>>    ex:foo ex:bar "\0000"
>>>>> 9/ Add positive and negative parsing tests for RDF datasets
>>>>> 10/ Add tests for RDF dataset isomorphism
>>>>> 
>>>> 
>>>> ------------------------------------------------------------
>>>> IHMC                                     (850)434 8903 home
>>>> 40 South Alcaniz St.            (850)202 4416   office
>>>> Pensacola                            (850)202 4440   fax
>>>> FL 32502                              (850)291 0667   mobile (preferred)
>>>> phayes@ihmc.us       http://www.ihmc.us/users/phayes
>>>> 
>>> 
>>> --
>>> Antoine Zimmermann
>>> ISCOD / LSTI - Institut Henri Fayol
>>> École Nationale Supérieure des Mines de Saint-Étienne
>>> 158 cours Fauriel
>>> 42023 Saint-Étienne Cedex 2
>>> France
>>> Tél:+33(0)4 77 42 66 03
>>> Fax:+33(0)4 77 42 66 66
>>> http://zimmer.aprilfoolsreview.com/
>>> 
>>> 
>> 
> 
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 home
40 South Alcaniz St.            (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile (preferred)
phayes@ihmc.us       http://www.ihmc.us/users/phayes

Received on Friday, 13 September 2013 04:50:46 UTC