Re: Test cases and examples for dataset entailment from Richard Cyganiak on 2012-09-11 (public-rdf-wg@w3.org from September 2012)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Tue, 11 Sep 2012 18:10:45 +0100
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <3D039A32-A74F-4683-ABA4-C306C945B2C0@cyganiak.de>
On 11 Sep 2012, at 12:29, Antoine Zimmermann wrote:
> In "Basics": a dataset A is consistent with a dataset B if both can be true at the same time. What you write looks more like "modularity" in a logical sense.

Fixed.

> T1.1: classically, entailment would be defined between datasets. RDF entailment is between RDF graphs, FOL entailment is between FOL formulas, DL entailment is between DL ontologies, etc. I think it's better to say that RDF graphs are identified with RDF datasets that have only a default graph. Thus, with a harmless abuse of notation, we can say:
> 
> { G } dataset-entails G
> 
> and
> 
> G dataset-entails { G }
> 
> Then you don't have to distinguish T1.X and T2.X.

To me, it seems simpler to consider dataset-interpretations as being able to interpret both RDF datasets and RDF graphs. Then we can talk of a dataset-interpretation as satisfying both RDF datasets and RDF graphs. And from that we get all the notions of entailment, contradiction, and so on, between graphs and datasets. This way, it requires *no* abuse of notation, not even harmless abuse ;-)
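
For illustration only, here is a rough Python sketch of that idea. Everything in it is an assumption made for the example, not part of the wiki proposal: the name ToyInterpretation, the restriction to ground triples, the use of set containment in place of real satisfaction, the shortcut of keying IGEXT by graph name instead of by the resource the name denotes, and the finite pool of candidate interpretations. A dataset is a dict from graph name to a set of triples, with None standing for the default graph.

```python
class ToyInterpretation:
    """Hypothetical stand-in for a dataset-interpretation (ground triples only)."""

    def __init__(self, facts, igext):
        self.facts = frozenset(facts)  # triples the interpretation makes true
        # Simplification: IGEXT is keyed by graph name, not by denoted resource.
        self.igext = {name: frozenset(g) for name, g in igext.items()}

    def satisfies_graph(self, graph):
        return frozenset(graph) <= self.facts

    def satisfies_dataset(self, dataset):
        for name, graph in dataset.items():
            if name is None:  # default graph
                if not self.satisfies_graph(graph):
                    return False
            elif name not in self.igext or not frozenset(graph) <= self.igext[name]:
                return False
        return True

    def satisfies(self, thing):
        # One entry point for both kinds of object, so no notation is abused.
        if isinstance(thing, dict):
            return self.satisfies_dataset(thing)
        return self.satisfies_graph(thing)


def entails(premise, conclusion, interpretations):
    """Entailment relative to a finite pool of candidate interpretations."""
    return all(i.satisfies(conclusion) for i in interpretations if i.satisfies(premise))
```

With g = {(":a", ":b", ":c")}, both entails(g, {None: g}, pool) and entails({None: g}, g, pool) hold for any pool, since the toy treats a graph and a dataset that has only that default graph in the same way.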

I think it's a fairly small matter and maybe should be best left to the editors.

> T2.X: we should not repeat tests that are already present in RDF test cases, OWL test cases, etc.

Note that I wrote the test cases just to illustrate the effects of the semantics, in order to facilitate discussion and give other WG members some confidence that the stuff we're proposing here isn't entirely absurd. At this point, I don't think it matters whether the test cases are redundant or complete, as long as they help us understand the semantics and uncover potential issues.

> We can define meta-tests that can be used with the existing test cases to generate concrete tests automatically:

You're talking about a formal test suite, à la RDF Test Cases? Yes, for that purpose it would make sense to automatically generate concrete tests from such meta-tests.

What I wanted to do for now is just provide examples that show the semantics in action. For this purpose, I think “non-meta” test cases with actual triples work better.

It's probably worth revisiting these meta-tests once we talk about a formal suite of test cases.

<snip>
> 
> T5.2: this test is wrong. The empty graph does not entail
> 
> :n {}
> 
> in general.

You're right, good catch. Fixed by saying only that :g1 {} entails the empty dataset.

> To entail this, n must be in the vocabulary of the interpretation. This means that the empty graph RDF-dataset-entails:
> 
> rdf:type {}
> 
> for instance, but not:
> 
> owl:sameAs {}

Yeah, that's a bit weird. I guess one way of changing this would be to make IGEXT a *partial* function from the resources in I_d to the set of RDF graphs?

> That's one possible argument in favour of getting rid of the dependency on a vocabulary.

I think Pat mentioned that he'd prefer to do that, for all of RDF Semantics.

Perhaps best treated as an issue orthogonal to the semantics of datasets.

> T6.X simply say that we have the open-world assumption. Any entailment valid on a subset of an RDF dataset is valid for said dataset.

Yup, but I think it's worth pointing out, for clarity. Again, this is not a proposal for a complete and formal test suite :-)
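
To make the point concrete, here is a toy Python check. It is only a sketch under loud assumptions: triples are ground, entailment is approximated by per-graph set containment, a dataset is a dict from graph name to a set of triples (None for the default graph), and the function names are made up for this example.

```python
def toy_entails(premise, conclusion):
    """Ground-triple approximation: every concluded triple is already present."""
    for name, triples in conclusion.items():
        if name is None:
            # The default graph always exists; it is empty when not listed.
            if not triples <= premise.get(None, frozenset()):
                return False
        elif name not in premise or not triples <= premise[name]:
            # A named graph must exist in the premise, even to conclude ":n {}".
            return False
    return True


def is_sub_dataset(small, big):
    """In this toy, small is a sub-dataset of big exactly when big toy-entails small."""
    return toy_entails(big, small)
```

Whenever is_sub_dataset(small, big) and toy_entails(small, conclusion) both hold, toy_entails(big, conclusion) holds as well, because set containment is transitive. Adding triples or graphs never retracts a conclusion.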

<snip>
> 
> There are a few more tests that are worth putting in:
> 
> An inconsistent graph in a named graph allows one to derive any conclusion within the graph, but does not affect other named graphs, nor the default graph.
> 
> """
> If a test case exists for entailment regime E such that G is E-inconsistent, then:
> 
> :n { G }  E-dataset-entails  :n { :s :p :o }
> 
> but does not entail:
> 
> { :s :p :o }
> 
> nor
> 
> :m { :s :p :o }
> """

Good one. Added (in non-meta style) as T12.1-3.
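
A crude Python sketch of why the inconsistency stays local, for illustration only. The assumptions: ground triples, a dataset as a dict from graph name to a set of triples (None for the default graph), and a deliberately simplistic stand-in for E-inconsistency that only spots a resource declared both owl:sameAs and owl:differentFrom another. The function names are hypothetical, and the case of an inconsistent default graph is not modelled.

```python
def toy_inconsistent(graph):
    """Stand-in for E-inconsistency: x sameAs y together with x differentFrom y."""
    return any(
        (s, "owl:differentFrom", o) in graph
        for (s, p, o) in graph
        if p == "owl:sameAs"
    )


def graph_entails(premise, conclusion):
    # An inconsistent graph entails everything; otherwise fall back to containment.
    return toy_inconsistent(premise) or conclusion <= premise


def dataset_entails(premise, conclusion):
    """Each graph of the conclusion is checked only against its own counterpart."""
    for name, triples in conclusion.items():
        if name is None:
            if not graph_entails(premise.get(None, frozenset()), triples):
                return False
        elif name not in premise or not graph_entails(premise[name], triples):
            return False
    return True
```

With bad = {(":a", "owl:sameAs", ":b"), (":a", "owl:differentFrom", ":b")} and spo = {(":s", ":p", ":o")}, the call dataset_entails({":n": bad}, {":n": spo}) succeeds, while the same premise entails neither {None: spo} nor {":m": spo}.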

> IRIs in different named graphs can denote different things:
> 
> """
> { :s  owl:sameAs  "a" }
> :n { :s  owl:sameAs  "b" }
> :m { :s  owl:sameAs  "c" }
> 
> is OWL-dataset-consistent.
> """

Good one too! Added as T13.1.

> Graph IRIs do not necessarily denote a graph:
> 
> """
> { :n  owl:sameAs  "a" }
> :n { :s  :p  :o }
> 
> is OWL-dataset-consistent.
> """

Ah, nice way of showing this. Added as T11.3.

> Other tricky test case:
> 
> """
> { :p  rdfs:range  xsd:boolean .
>   :s  :p  :n, :m, :o . }
> :n { :q  rdfs:range  xsd:string .
>      :x  :q  :y }
> :m { :q  rdfs:range  rdf:HTML .
>      :x  :q  :y }
> :o { :q  rdfs:range  rdf:langString .
>      :x  :q  :y}
> 
> is RDFS-dataset-inconsistent.
> """

Very clever! But I believe this is actually RDFS-dataset-consistent; it only becomes inconsistent under D-dataset-entailment?

I've added it as T14.1.
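
My reading of the argument is a pigeonhole one, sketched below in Python. The assumptions are mine and made only for illustration: that under D-entailment :n, :m and :o must each denote one of the two xsd:boolean values, that two names denoting the same resource share one graph extension, and that the three ranges given for :q cannot all hold of the same :y. The function names are made up.

```python
from itertools import combinations, product

NAMES = (":n", ":m", ":o")
RANGE_OF_Q = {":n": "xsd:string", ":m": "rdf:HTML", ":o": "rdf:langString"}


def clashes(assignment):
    """Pairs of graph names that denote the same value but demand different ranges."""
    return [
        (a, b)
        for a, b in combinations(NAMES, 2)
        if assignment[a] == assignment[b] and RANGE_OF_Q[a] != RANGE_OF_Q[b]
    ]


def every_assignment_clashes(value_space):
    """True if no way of mapping the three names into value_space avoids a clash."""
    return all(
        clashes(dict(zip(NAMES, values)))
        for values in product(value_space, repeat=len(NAMES))
    )
```

With the two-element space (True, False), every_assignment_clashes returns True: two of the three names always coincide. With any space of three or more values it returns False, which is the situation when nothing constrains the size of the class denoted by xsd:boolean.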

> Note that this would be consistent if IGEXT was a function from IRIs to graphs instead of resources to graph.

It would also be consistent if you weren't using booleans as graph names :-)

Best,
Richard



> 
> 
> 
> HIH
> --AZ.
> 
> 
> Le 10/09/2012 18:30, Richard Cyganiak a écrit :
>> All,
>> 
>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#Test_cases
>> 
>> I've added a number of test cases and examples to the Minimal
>> Dataset Semantics wiki page. There are also test cases that try to
>> specifically show what's at stake in the various open issues. No one
>> has reviewed this yet, so expect some errors.
>> 
>> I think this should be a good basis for further discussion. I think
>> it would be helpful for Wednesday's call if everyone had read through
>> these test cases. Please feel free to ask questions and propose
>> additional examples and test cases!
>> 
>> Two other things that I'd quite like to see before we can call the
>> proposal complete:
>> 
>> 1. Some thinking on how it addresses our graph use cases. (Do we have
>> an “official” list of those? I've lost track with all the various
>> documents.)
>> 
>> 2. Some examples for semantic extensions, in order to show that
>> various other proposed semantics can actually be done as proper
>> semantic extensions of this minimal dataset semantics.
>> 
>> Best, Richard
>> 
> 
> 
> -- 
> Antoine Zimmermann
> ISCOD / LSTI - Institut Henri Fayol
> École Nationale Supérieure des Mines de Saint-Étienne
> 158 cours Fauriel
> 42023 Saint-Étienne Cedex 2
> France
> Tél:+33(0)4 77 42 83 36
> Fax:+33(0)4 77 42 66 66
> http://zimmer.aprilfoolsreview.com/
>
Received on Tuesday, 11 September 2012 17:11:15 UTC