Re: Test cases and examples for dataset entailment from Antoine Zimmermann on 2012-09-11 (public-rdf-wg@w3.org from September 2012)

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Tue, 11 Sep 2012 21:54:33 +0200
To: Richard Cyganiak <richard@cyganiak.de>
CC: RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <504F96F9.7020802@emse.fr>
On 11/09/2012 19:10, Richard Cyganiak wrote:
> On 11 Sep 2012, at 12:29, Antoine Zimmermann wrote:
>> In "Basics": a dataset A is consistent with a dataset B if both can
>> be true at the same time. What you write looks more like
>> "modularity" in a logical sense.
>
> Fixed.
>
>> T1.1: classically, entailment would be defined between datasets.
>> RDF entailment is between RDF graphs, FOL entailment is between FOL
>> formulas, DL entailment is between DL ontologies, etc. I think it's
>> better to say that RDF graphs are identified with RDF datasets that
>> have only a default graph. Thus, with a harmless abuse of notation,
>> we can say:
>>
>> { G } dataset-entails G
>>
>> and
>>
>> G dataset-entails { G }
>>
>> Then you don't have to distinguish T1.X and T2.X.
>
> To me, it seems simpler to consider dataset-interpretations as being
> able to interpret both RDF datasets and RDF graphs. Then we can talk
> of a dataset-interpretation as satisfying both RDF datasets and RDF
> graphs. And from that we get all the notions of entailment,
> contradiction, and so on, between graphs and datasets. This way, it
> requires *no* abuse of notation, not even harmless abuse ;-)
>
> I think it's a fairly small matter and maybe should be best left to
> the editors.

Agreed.
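
As a rough illustration of the first option above (a graph treated as a
dataset that has only a default graph), here is a minimal sketch. The
Dataset class and the as_dataset and naive_entails helpers are hypothetical
names introduced for this example, not anything defined on the wiki page.
The check assumes ground triples only, where simple entailment reduces to a
subset test, and it is deliberately conservative about graph names.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical in-memory dataset: a default graph plus named graphs."""
    default: frozenset = frozenset()
    named: dict = field(default_factory=dict)  # graph name -> frozenset of triples


def as_dataset(graph):
    """Treat a plain graph as a dataset that has only a default graph."""
    return Dataset(default=frozenset(graph))


def naive_entails(premise, conclusion):
    """Conservative check, assuming ground triples and simple entailment.

    A named graph in the conclusion is accepted only if the same name
    appears in the premise with at least the same triples.
    """
    if not conclusion.default <= premise.default:
        return False
    for name, triples in conclusion.named.items():
        if name not in premise.named or not triples <= premise.named[name]:
            return False
    return True


# With the lifting, "{ G } dataset-entails G" and the converse are the
# same comparison between two datasets.
G = frozenset({(":s", ":p", ":o")})
both_directions = naive_entails(as_dataset(G), as_dataset(G))
```

With this lifting there is a single relation between datasets, so separate
cases for graphs and datasets are not needed in the code either.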


>> T2.X: we should not repeat tests that are already present in RDF
>> test cases, OWL test cases, etc.
>
> Note, I wrote the test cases just to illustrate the effects of the
> semantics, in order to facilitate discussion and give other WG
> members some confidence that the stuff we're proposing here isn't
> entirely absurd. At this point, I don't think it matters whether the
> test cases are redundant or complete, as long as they help us
> understand, and uncover potential issues.

Ok, so it's all right.


>> We can define meta-tests that can be used with the existing test
>> cases to generate concrete tests automatically:
>
> You're talking about a formal test suite, a la RDF Test Cases? Yes,
> for that purpose it would make sense to automatically generate such
> meta-tests.
>
> What I wanted to do for now is just provide examples that show the
> semantics in action. For this purpose, I think “non-meta” test cases
> with actual triples work better.
>
> It's probably worth revisiting these meta-tests once we talk about a
> formal suite of test cases.

Ok.


> <snip>
>>
>> T5.2: this test is wrong. The empty graph does not entail
>>
>> :n {}
>>
>> in general.
>
> You're right, good catch. Fixed by saying only that :g1 {} entails
> the empty dataset.
>
>> To entail this, n must be in the vocabulary of the interpretation.
>> This means that the empty graph RDF-dataset-entails:
>>
>> rdf:type {}
>>
>> for instance, but not:
>>
>> owl:sameAs {}
>
> Yeah, that's a bit weird. I guess one way of changing this would be
> to make IGEXT a *partial* function from the resources in I_d to the
> set of RDF graphs?

Perhaps; I would have to check the implications. Another way to change
this is to do what Pat suggested, as you say below.

>
>> That's one possible argument in favour of getting rid of the
>> dependency on a vocabulary.
>
> I think Pat mentioned that he'd prefer to do that, for all of RDF
> Semantics.
>
> Perhaps best treated as an issue orthogonal to the semantics of
> datasets.

Sure.
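
To make the vocabulary dependence concrete, here is a small sketch of the
T5.2 observation. It assumes IGEXT is total on the resources denoted by
names in the vocabulary, so that an empty named graph is entailed exactly
when its name is in the vocabulary of every interpretation under
consideration. The function name and the vocabulary excerpt are
hypothetical; the real RDF vocabulary is larger.

```python
# Small excerpt of the RDF vocabulary, for illustration only.
RDF_VOCABULARY_EXCERPT = frozenset({
    "rdf:type", "rdf:Property", "rdf:nil", "rdf:first", "rdf:rest",
})


def entails_empty_named_graph(name, regime_vocabulary, premise_names=()):
    """Does a dataset entail `name {}` under the assumption stated above?

    `premise_names` lists the names that occur in the premise dataset;
    for the empty dataset it is empty.
    """
    return name in regime_vocabulary or name in premise_names


# The empty dataset, under the RDF vocabulary excerpt:
rdf_type_case = entails_empty_named_graph("rdf:type", RDF_VOCABULARY_EXCERPT)
same_as_case = entails_empty_named_graph("owl:sameAs", RDF_VOCABULARY_EXCERPT)
```

Dropping the vocabulary dependency, or making IGEXT partial, would change
what this function should return; the sketch only encodes the behaviour
described in the quoted text.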


>> T6.X simply say that we have the open world assumption. Any entailment
>> valid on a subset of an RDF dataset is valid for said dataset.
>
> Yup, but I think it's worth pointing out, for clarity. Again, this is
> not a proposal for a complete and formal test suite :-)

I understand. It's fine.
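
The T6.X point can be sketched as follows: whatever a dataset entails is
still entailed after more triples or more named graphs are added to it. The
dictionary layout, the DEFAULT key and the helper names are assumptions made
for this example, and the entailment check is a naive subset test that is
only meaningful for ground triples.

```python
DEFAULT = None  # hypothetical key standing for the default graph


def graph_of(dataset, name):
    """Return the triples stored under `name`, or None if the name is absent."""
    if name is DEFAULT:
        return dataset.get(DEFAULT, set())
    return dataset.get(name)


def naive_entails(premise, conclusion):
    """Subset-based check, assuming ground triples and simple entailment."""
    for name, triples in conclusion.items():
        available = graph_of(premise, name)
        if available is None or not set(triples) <= set(available):
            return False
    return True


def extend(dataset, extra):
    """Return a new dataset with the triples of `extra` added to `dataset`."""
    merged = {name: set(triples) for name, triples in dataset.items()}
    for name, triples in extra.items():
        merged.setdefault(name, set()).update(triples)
    return merged


small = {DEFAULT: {(":a", ":b", ":c")}, ":n": {(":s", ":p", ":o")}}
conclusion = {":n": {(":s", ":p", ":o")}}
extra = {DEFAULT: {(":d", ":e", ":f")}, ":m": {(":x", ":q", ":y")}}

holds_before = naive_entails(small, conclusion)
holds_after = naive_entails(extend(small, extra), conclusion)
```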

>
> <snip>
>>
>> There are a few more tests that are worth putting in:
>>
>> An inconsistent graph in a named graph allows one to derive any
>> conclusion within the graph, but does not affect other named
>> graphs, nor the default graph.
>>
>> """ If a test case exists for entailment regime E such that G is
>> E-inconsistent, then:
>>
>> :n { G }  E-dataset-entails  :n { :s :p :o }
>>
>> but does not entail:
>>
>> { :s :p :o }
>>
>> nor
>>
>> :m { :s :p :o } """
>
> Good one. Added (in non-meta style) as T12.1-3.
>
>> IRIs in different named graphs can denote different things:
>>
>> """ { :s  owl:sameAs  "a" }
>> :n { :s  owl:sameAs  "b" }
>> :m { :s  owl:sameAs  "c" }
>>
>> is OWL-dataset-consistent. """
>
> Good one too! Added as T13.1.
>
>> Graph IRIs do not necessarily denote a graph:
>>
>> """ { :n  owl:sameAs  "a" } :n { :s  :p  :o }
>>
>> is OWL-dataset-consistent. """
>
> Ah, nice way of showing this. Added as T11.3.
>
>> Other tricky test case:
>>
>> """ { :p  rdfs:range  xsd:boolean . :s  :p  :n, :m, :o . }
>> :n { :q  rdfs:range  xsd:string . :x  :q  :y }
>> :m { :q  rdfs:range  rdf:HTML . :x  :q  :y }
>> :o { :q  rdfs:range  rdf:langString . :x  :q  :y }
>>
>> is RDFS-dataset-inconsistent. """
>
> Very clever! But I believe this is actually RDFS-dataset-consistent;
> it only becomes inconsistent under D-dataset-entailment?

Absolutely right, my mistake. Basically, I wanted to avoid relying on
OWL, to show that the RDF semantics suite provides difficult cases too.
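
The argument behind this test case appears to rest on a pigeonhole step:
the default graph forces :n, :m and :o to denote booleans, there are only
two boolean values, so at least two of the three names must denote the same
resource. The sketch below checks only that counting step by brute force. It
assumes that co-denoting graph names share one graph extension and that the
three datatypes involved have pairwise different (in fact disjoint) value
spaces; all constant and function names are hypothetical.

```python
from itertools import combinations, product

BOOLEAN_VALUES = (True, False)
GRAPH_NAMES = (":n", ":m", ":o")

# Range asserted for :q inside each named graph of the test case.
RANGE_OF_Q = {":n": "xsd:string", ":m": "rdf:HTML", ":o": "rdf:langString"}


def clashing_pairs(denotation):
    """Pairs of graph names with the same denotation but different ranges for :q."""
    return [
        (a, b)
        for a, b in combinations(GRAPH_NAMES, 2)
        if denotation[a] == denotation[b] and RANGE_OF_Q[a] != RANGE_OF_Q[b]
    ]


def every_denotation_clashes():
    """Try every way of mapping the three names to booleans."""
    for values in product(BOOLEAN_VALUES, repeat=len(GRAPH_NAMES)):
        denotation = dict(zip(GRAPH_NAMES, values))
        if not clashing_pairs(denotation):
            return False
    return True


no_clash_free_assignment = every_denotation_clashes()
```

With only two graph names in the default graph, a clash-free assignment
would exist, which is why the test case needs three.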


> I've added it as T14.1.
>
>> Note that this would be consistent if IGEXT were a function from
>> IRIs to graphs instead of from resources to graphs.
>
> It would also be consistent if you weren't using booleans as
> graph names :-)

Of course. But test cases have to contain corner cases. Those are the 
ones that make implementations robust.



AZ
>
> Best, Richard
>
>
>
>>
>>
>>
>> HIH --AZ.
>>
>>
>> On 10/09/2012 18:30, Richard Cyganiak wrote:
>>> All,
>>>
>>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#Test_cases
>>>
>>> I've added a number of test cases and examples to the Minimal
>>> Dataset Semantics wiki page. There are also test cases that try
>>> to specifically show what's at stake in the various open issues.
>>> No one has reviewed this yet, so expect some errors.
>>>
>>> I think this should be a good basis for further discussion. I
>>> think it would be helpful for Wednesday's call if everyone had
>>> read through these test cases. Please feel free to ask questions
>>> and propose additional examples and test cases!
>>>
>>> Two other things that I'd quite like to see before we can call
>>> the proposal complete:
>>>
>>> 1. Some thinking on how it addresses our graph use cases. (Do we
>>> have an “official” list of those? I've lost track with all the
>>> various documents.)
>>>
>>> 2. Some examples for semantic extensions, in order to show that
>>> various other proposed semantics can actually be done as proper
>>> semantic extensions of this minimal dataset semantics.
>>>
>>> Best, Richard
>>>
>>
>>
>> -- Antoine Zimmermann ISCOD / LSTI - Institut Henri Fayol École
>> Nationale Supérieure des Mines de Saint-Étienne 158 cours Fauriel
>> 42023 Saint-Étienne Cedex 2 France Tél:+33(0)4 77 42 83 36
>> Fax:+33(0)4 77 42 66 66 http://zimmer.aprilfoolsreview.com/
>>
>
>


-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 83 36
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Tuesday, 11 September 2012 19:55:35 UTC