Re: Really minimal dataset semantics from Peter F. Patel-Schneider on 2012-09-25 (public-rdf-wg@w3.org from September 2012)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Tue, 25 Sep 2012 08:32:51 -0400
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
CC: public-rdf-wg@w3.org
Message-ID: <5061A473.1030809@gmail.com>

I don't think that I am being inconsistent here at all.

I would prefer not having entailment defined for RDF datasets.  I am fully 
opposed to having interpretations defined for RDF datasets.   If 
entailment is defined for RDF datasets, then I believe it should be defined in 
a way that does not introduce interpretations (meanings) for RDF datasets but 
depends on interpretations for the RDF graphs in the datasets. I worry that 
even any such definition would be too constraining, and particularly worry 
that definitions of RDF dataset entailment that are insensitive to the exact 
form of the named graphs are too broad.

For example, one could define RDF dataset entailment as 1/ the default graphs 
entail, and 2/ each named graph in the second dataset has a graph with the 
same name in the first dataset that entails it (or, maybe, it is entailed by 
the empty graph).   This is insensitive to the exact form of the named graphs, 
which worries me as supporting too many entailments, but I would not object 
too strongly because there is no notion of interpretations for RDF datasets.  
I support this notion of RDF dataset entailment only reluctantly, however, and 
would prefer not to have it.  Use cases that depend on the exact form of the 
named graphs would simply ignore RDF dataset entailment, and there would have 
to be explicit language that this is perfectly in tune with the intended use 
of RDF datasets.
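
As a rough illustration of this first definition, here is a minimal Python sketch. It makes several assumptions that are not in the text above: "entail" is read as simple entailment, checked by brute force (some instance of the entailed graph is a subset of the entailing graph); graphs are small sets of (subject, predicate, object) string triples; blank nodes are strings starting with "_:"; and a dataset is a dict with a "default" graph and a "named" map. All function names are hypothetical.

```python
from itertools import product

def is_bnode(term):
    # Assumption: blank nodes are written as strings starting with "_:".
    return term.startswith("_:")

def simply_entails(g1, g2):
    """Brute-force simple entailment: g1 entails g2 if some mapping of
    g2's blank nodes to terms of g1 turns g2 into a subset of g1."""
    bnodes = sorted({t for triple in g2 for t in triple if is_bnode(t)})
    terms = sorted({t for triple in g1 for t in triple})
    for image in product(terms, repeat=len(bnodes)):
        mapping = dict(zip(bnodes, image))
        instance = {tuple(mapping.get(t, t) for t in triple) for triple in g2}
        if instance <= g1:
            return True
    return False

def dataset_entails(d1, d2):
    """1/ the default graphs entail, and 2/ each named graph in d2 is
    entailed by the graph with the same name in d1. A name missing from
    d1 is treated as the empty graph, which covers the optional
    "entailed by the empty graph" clause."""
    if not simply_entails(d1["default"], d2["default"]):
        return False
    for name, g2 in d2["named"].items():
        g1 = d1["named"].get(name, set())
        if not simply_entails(g1, g2):
            return False
    return True
```

Under this sketch, a named graph in the second dataset may be any weaker graph than its counterpart in the first, which is exactly the insensitivity to the form of the named graphs described above.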

One could also define RDF dataset entailment as 1/ the default graphs entail, 
and 2/ each named graph in the second dataset has a graph with the same name 
in the first dataset that is isomorphic to it up to invertible renaming of 
blank nodes.   This notion of RDF dataset entailment has the desirable, to me, 
property that it is sensitive to the exact form of the named RDF graphs, which 
I think is necessary to support many of the use cases brought forward in the 
working group.  I have not promoted this definition because it adds new 
machinery (isomorphic up to ...).
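
To make the "isomorphic up to invertible renaming of blank nodes" condition concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: graphs are small sets of (subject, predicate, object) string triples, blank nodes are strings starting with "_:", a dataset is a dict with a "default" graph and a "named" map, and the check is a brute-force search over bijections. The function names are hypothetical, and the default-graph entailment check is passed in as a parameter (defaulting to a plain subset test, which is only adequate for ground graphs).

```python
from itertools import permutations

def is_bnode(term):
    # Assumption: blank nodes are written as strings starting with "_:".
    return term.startswith("_:")

def bnodes_of(graph):
    return sorted({t for triple in graph for t in triple if is_bnode(t)})

def isomorphic(g1, g2):
    """True if some bijection between the blank nodes of g1 and g2
    turns g1 into exactly g2 (brute force, small graphs only)."""
    b1, b2 = bnodes_of(g1), bnodes_of(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    for image in permutations(b2):
        mapping = dict(zip(b1, image))
        renamed = {tuple(mapping.get(t, t) for t in triple) for triple in g1}
        if renamed == g2:
            return True
    return False

def dataset_entails_exact(d1, d2, graph_entails=lambda g1, g2: g2 <= g1):
    """1/ the default graphs entail (per the supplied graph_entails), and
    2/ each named graph in d2 has an isomorphic graph with the same
    name in d1."""
    if not graph_entails(d1["default"], d2["default"]):
        return False
    return all(
        name in d1["named"] and isomorphic(d1["named"][name], g2)
        for name, g2 in d2["named"].items()
    )
```

Here a named graph that is merely weaker than its counterpart no longer counts; only a renaming of blank nodes is tolerated.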

peter




On 09/25/2012 05:12 AM, Antoine Zimmermann wrote:
> Le 21/09/2012 17:44, Peter F. Patel-Schneider a écrit :
>> I don't think that this is converging, even on technical grounds.
>>
>> My view is that many or most of the use cases in
>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs-UC have to do with talking
>> about actual graphs.
>>
>> My view is that having semantic interpretations of RDF datasets being
>> insensitive to the actual named graph doesn't fully support these use
>> cases. I'm even a bit worried about defining entailment between RDF
>> datasets that doesn't distinguish between equivalent named graphs.
>
> What? At first, you were arguing against any semantics for datasets. Now you 
> ask for a fairly strong requirement on the semantics, where entailments between 
> named graphs are not possible. And recently, you argued in favour of a very 
> minimal semantics, where all that is said is, if G graph-entail G' then {G} 
> dataset-entails {G'} and <n>{G} dataset-entails <n>{G'}.
>
> So what is your position after all?
>
> No semantics fails to solve the use cases where the graph name should denote 
> a graph container. Additionally, it fails to satisfy my requirement that if 
> G entails G' then <n,G> entails <n,G'>. Each use case where semantics comes 
> into play will have to assume its own semantics, with no indication from a 
> spec that others would assume the same semantics.
>
> A semantics that, in order to make a named graph true, has to associate the 
> graph name rigidly with the graph inside prevents all entailments of the 
> form "<n>{G} entails <n>{G'}" and even prevents any proper semantic 
> extension from allowing them.
>
> With the proposed minimal semantics, there is no more support for talking 
> about graphs than with no semantics, but at least it solves the use cases 
> where one wants to reason separately on named graphs, and it does not 
> prevent one from talking about containers or anything else, any more than RDF 
> semantics prevents talking about people, documents, or anything else. It 
> also makes it possible to define extensions or vocabularies that more 
> formally constrain certain things to be graph containers, for instance, in a 
> way that is consistent with the minimal semantics.
>
> If you compare to RDF, that's how things have worked all the time. RDF 
> semantics does not satisfy most of the use cases, but it can be used to 
> produce vocabularies, semantic extensions, etc. that are able to satisfy the 
> UCs.
>
>
>
> AZ
>
>>
>> peter
>>
>> PS: From http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs-UC I see at least
>> 1.1,
>>
>>
>>
>> On 09/21/2012 11:33 AM, Antoine Zimmermann wrote:
>>> Le 21/09/2012 17:25, Peter F. Patel-Schneider a écrit :
>>>>
>>>> On 09/20/2012 12:00 PM, Antoine Zimmermann wrote:
>>>>> Le 20/09/2012 16:54, Peter F. Patel-Schneider a écrit :
>>>>>>
>>>
>>> [skip]
>>>
>>>>>>
>>>>>> In the semantics there is no notion of a relationship between a name and
>>>>>> an actual graph.
>>>>>
>>>>> In the semantics to which I refer (viz., first version of the Minimal
>>>>> dataset semantics) there is a function IGEXT that maps graph IRIs to RDF
>>>>> graphs. Isn't this a notion of a relationship between a name and an actual
>>>>> graph?
>>>> No, as it does not distinguish between equivalent graphs. Suppose you
>>>> have two equivalent graphs, then you can use them interchangeably in
>>>> your semantics.
>>>
>>> If I have 2 equivalent graphs, I can use them interchangeably
>>> according to the *RDF semantics*. It's not my fault.
>>>
>>>
>>>>> Or maybe, by "actual graph", you mean a graph that is actually "written" in
>>>>> a given dataset? Normally, the semantics defines the notion of
>>>>> interpretation independently of a given formula, and an interpretation makes
>>>>> true all sorts of formulas.
>>>>>
>>>>> In what I propose, for an interpretation to make a named graph true, the
>>>>> name has to be related (via IGEXT) to whatever graph makes the graph inside
>>>>> the pair true.
>>>>>
>>>> Yes, but this doesn't pick out the actual graph, just one of many
>>>> possible graphs.
>>>
>>> If you want the graph inside a <name,graph> pair, just read the
>>> dataset. There are APIs for this.
>>>
>>> Dataset d = loadDatasetFromFile(new File("foo.trig"));
>>> Graph g = d.getGraph("http://ex.com/g");
>>>
>>> That's it.
>>>
>>>>>
>>>>>> If named graphs and RDF datasets are supposed to carry a relationship
>>>>>> between a name and an actual graph, then shouldn't the semantics reflect
>>>>>> this?
>>>>>
>>>>> By IGEXT, it does, but a dataset interpretation is not defined as a
>>>>> function of a given dataset, so there is no reason that the name be
>>>>> associated with the "actual graph" in a given dataset.
>>>>
>>>> Why not? Isn't a major use case for RDF graphs to record where graphs
>>>> (actual graphs, not equivalence classes of graphs) come from?
>>>
>>> And what does this have to do with semantics?
>>> If I want to record someone's speech, I don't need to know the
>>> semantics of what he/she says.
>>>
>>>
>>>>>>
>>>>>> This is totally different from properties. No one should be arguing that
>>>>>> RDF graphs are supposed to carry a relationship between a name and a set
>>>>>> of pairs. Instead this is what the semantics does.
>>>>>>
>>>>>>
>>>>>>
>>>>>>>> (Of course, you could always just ignore the semantics and directly use
>>>>>>>> the graph from the dataset, but then what is the point of having the
>>>>>>>> named graph there?)
>>>>>>>
>>>>>>> The data structure is also very important: just as in RDF graphs, the
>>>>>>> data structure is already a nice way of organising the data, linking
>>>>>>> data together, etc. Semantics does not have to come into play where it
>>>>>>> has no role.
>>>>>>>
>>>>>>>
>>>>>>> --AZ
>>>>>>
>>>>>>
>>>>>> Huh? If the meaning of a named graph is tied up with relating names to
>>>>>> graphs, then the semantics certainly has a role there.
>>>>>
>>>>> Sorry, maybe I misunderstood what you were saying, but then I don't
>>>>> understand your point.
>>>>>
>>>>> What I'm saying is that, if you find a dataset somewhere in the wild, or if
>>>>> you have a dataset in memory, you can get the graph associated with a graph
>>>>> IRI by simply parsing the dataset representation. Semantics does not come
>>>>> into play in that case.
>>>>>
>>>>>
>>>>> --AZ
>>>>
>>>> Sure, you can look right in the dataset to find the graph, no semantics
>>>> involved. However, if RDF datasets are supposed to be able to carry some
>>>> meaning about graphs and their sources then shouldn't their semantics
>>>> actually use graphs?
>>>
>>> No, they are not supposed to carry information about the graph. That's
>>> only one use case, and we know there are people against this idea
>>> (e.g., the default as merge case). I want the common denominator that
>>> would not lead to erroneous entailments according to anybody's
>>> understanding of datasets.
>>>
>>> For metadata about graphs, provenance, dates, whatever, define a
>>> vocabulary and make sure that all applications that support the
>>> vocabulary interpret it in the same way. It has been done before and
>>> it works.
>>>
>>>
>>> AZ
>>>
>>>>
>>>> peter
>>>>
>>>>
>>>
>>
>>
>>
>
Received on Tuesday, 25 September 2012 12:35:42 UTC