Re: dataset semantics from Antoine Zimmermann on 2011-12-20 (public-rdf-wg@w3.org from December 2011)

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Tue, 20 Dec 2011 14:32:40 +0100
To: Pat Hayes <phayes@ihmc.us>
CC: Richard Cyganiak <richard@cyganiak.de>, public-rdf-wg@w3.org
Message-ID: <4EF08E78.4090602@emse.fr>
On 20/12/2011 02:01, Pat Hayes wrote:
>
> On Dec 19, 2011, at 1:50 PM, Richard Cyganiak wrote:
>
>> On 19 Dec 2011, at 10:48, Pat Hayes wrote:
>>> I would like to see some evidence, from actual use cases, of how
>>> it can be that different RDF graphs hold in different contexts,
>>
>> See here for some (toy) examples:
>
> Thanks for the pointers. I will spend this evening reading :-)
>
>> ....
>>> How can information from different contexts be used together?
>>
>> It generally cannot, unless you have some extra information (e.g.,
>> provenance metadata) that establishes sufficient confidence in the
>> information for the purpose of the application. Some applications
>> may be OK with merging absolutely anything. Others may only rely on
>> information from a fixed set of providers (e.g., from URIs with a
>> certain hostname). Many other approaches are possible.
>
> Ah, that is not exactly what I meant. You are here talking about
> degree of trust: when do I, a consumer of RDF data culled from the
> Web, decide to trust it well enough to draw conclusions from it. This
> is important, of course, but the issue that we have here with
> 'contexts' has to do with a different topic, viz. the context
> *changing the meaning* of IRIs. That is, a given IRI in one 'context'
> might refer to something different from what it refers to in another
> 'context'. This is the characteristic property of so-called 'context
> logics', by the way, and it is where Antoine's multi-interpretation
> semantics is headed.
>
> Is this true for RDF on the (wild) Web? If so, how is RDF from
> different contexts (in this second sense) used together? It would
> seem that some kind of name-separation technique would be needed
> before such RDF graphs could be merged or even used in the same
> context (?)

I think the discussions that we had in this working group show that
there isn't a single way of combining Web data. This justifies that,
*if* we provide a semantics for datasets at all, then it has to be very
loosely constrained. In any case, whether there is a "contextual
semantics" or not, people can still merge RDF graphs if they want, and
merging will probably remain a common way of integrating data
from multiple sources.
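
To make the two options concrete, here is a minimal sketch in Python.
It assumes a graph is just a set of ground triples (no blank nodes, so
a merge is a plain union). The source labels, the sample triples, and
the function names `merge` and `merge_trusted` are made up for
illustration; they do not come from any spec or tool.

```python
# A triple is (subject, predicate, object); a graph is a set of triples.
# Assumption: no blank nodes, so merging reduces to set union.

def merge(graphs):
    """Merge graphs by taking the union of their triples."""
    merged = set()
    for g in graphs:
        merged |= g
    return merged

def merge_trusted(sources, trusted):
    """Merge only the graphs whose source label is in `trusted`."""
    return merge(g for label, g in sources.items() if label in trusted)

# Hypothetical data from two sources (labels are illustrative only).
sources = {
    "http://example.org/a": {("ex:x", "ex:p", "ex:y")},
    "http://example.net/b": {("ex:x", "ex:p", "ex:z")},
}

everything = merge(sources.values())                       # merge anything
only_a = merge_trusted(sources, {"http://example.org/a"})  # fixed provider set
```

Both strategies remain available whether or not datasets get a
semantics of their own; the second one simply uses the source label as
extra information before merging.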

However, I gave the example of Sindice, which uses the context to reason 
over Web data. I can also mention SAOR, another Semantic Web search 
engine which uses the context in its reasoning mechanism.

Those are examples that came off the top of my head, but I'm sure
Richard has other examples.

Having the notion of a dataset, and having a semantics for that notion, does
not force people to use that notion. RDF still exists alongside that
notion, and operations on RDF graphs can still be used.
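
As a rough illustration only (this is not the semantics from the wiki
proposal), a dataset could be modeled as a default graph plus labeled
graphs, with the optional relation R that Sandro suggests in the quoted
text below. The class and function names are hypothetical, and the
entailment check encodes just Sandro's proposed condition that a true
dataset makes its default graph true, restricted to ground graphs.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class NamedGraph:
    label: str                      # the IRI U
    graph: frozenset                # the graph G, a set of (s, p, o) triples
    relation: Optional[str] = None  # the optional R; None means unspecified

@dataclass
class Dataset:
    default: frozenset                         # the default graph
    named: list = field(default_factory=list)  # list of NamedGraph

def entails_ground_graph(dataset, graph):
    """Assumed condition: only the default graph is asserted.

    For ground graphs under simple entailment, G entails H
    exactly when H is a subset of G.
    """
    return set(graph) <= dataset.default

# Sandro's test case: the TriG document {<a> <b> <c>} as a dataset.
ds = Dataset(
    default=frozenset({("a", "b", "c")}),
    named=[NamedGraph("u", frozenset({("d", "e", "f")}))],
)

default_triple_entailed = entails_ground_graph(ds, {("a", "b", "c")})
named_triple_entailed = entails_ground_graph(ds, {("d", "e", "f")})
```

Under this assumption the triple in the default graph is entailed and
the one inside the labeled graph is not. Plain graph operations still
work on `ds.default` and on each `graph` without going through the
dataset at all.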


AZ.


> Pat
>
>
>>
>> See here for Sandro's writeup of some of these issues:
>> http://lists.w3.org/Archives/Public/public-rdf-prov/2011Sep/0003.html
>>
>>
>>
>> Best,
>> Richard
>>
>>
>>>
>>> Pat
>>>
>>>
>>>
>>> On Dec 19, 2011, at 4:06 AM, Antoine Zimmermann wrote:
>>>
>>>> Just wanted to reiterate, there is a dataset semantics at [1]
>>>> which has been there since about March 2011. In spite of the
>>>> math symbols all over the place, it's really simple. The
>>>> rationale was to make it according to the least common
>>>> denominator, such that it does not put constraints that some
>>>> people would like to relax later on. Adding constraints can be
>>>> done easily on a conformant implementation, while removing
>>>> constraints makes the implementation non-compliant.
>>>>
>>>> Note that this semantics does not change the semantics of RDF,
>>>> as it is separated from it, though relying on it.
>>>>
>>>>
>>>> [1] TF-Graphs/RDF-Datasets-Proposal, Section "Semantics".
>>>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal#Semantics.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 17/12/2011 06:43, Sandro Hawke wrote:
>>>>> On Fri, 2011-12-16 at 22:47 -0600, Pat Hayes wrote:
>>>>>> On Dec 16, 2011, at 10:21 PM, Sandro Hawke wrote:
>>>>>>
>>>>>>> ... maybe I can figure out some TriG entailment
>>>>>>> tests.... Like, does this TriG document / dataset:
>>>>>>>
>>>>>>> {<a>   <b>   <c>   }
>>>>>>>
>>>>>>> entail this RDF graph:
>>>>>>>
>>>>>>> <a>   <b>   <c>.
>>>>>>>
>>>>>>> I think it should, so we can have metadata in TriG, but
>>>>>>> other people have disagreed.   How should we be gathering
>>>>>>> test cases like this?
>>>>>>
>>>>>>
>>>>>> FWIW, 'entailment' has a fairly precise meaning. A entails
>>>>>> B when B is true whenever A is, or more precisely if, for
>>>>>> every possible interpretation I, if A is true in I then B
>>>>>> is true in I. So it only makes sense to speak of entailment
>>>>>> when there is some notion of truth-in-an-interpretation to
>>>>>> base it on.
>>>>>
>>>>> Yes, I know.
>>>>>
>>>>>> So, what are the truth conditions for datasets?
>>>>>
>>>>> We haven't quite figured that out yet.   I'm proposing one
>>>>> part of that is that a dataset being true implies its default
>>>>> graph is true.
>>>>>
>>>>> The other part of the truth conditions has to do with the
>>>>> relationship between the things named by the label URIs and
>>>>> the graphs they label.
>>>>>
>>>>> Unfortunately, I think we need to allow for several possible
>>>>> relationships there, MAYBE even in the same dataset, which
>>>>> makes things rather complicated.
>>>>>
>>>>> One example of the relationship is what I called graphState
>>>>> in a different thread.  In that case, the dataset being true
>>>>> would imply that for each <U,G> in the dataset, the state of
>>>>> the resource U is the graph G.   (Here, I mean "state" and
>>>>> "resource" in exactly the REST sense.)
>>>>>
>>>>> Another example is an out of date version of graphState,
>>>>> maybe call it graphStateWas.  In this case, the dataset being
>>>>> true would imply that for each <U,G> in the dataset, the
>>>>> state of the resource U is, or used to be, graph G.
>>>>>
>>>>> Another example of the relationship is something I gather
>>>>> Cambridge Semantics uses, which I'll call subjectOf.   (In
>>>>> one of their deployment modes, triples are divided into two
>>>>> types, which I'll call A and B, based on which predicate they
>>>>> use.  The dataset is constructed such that for each <U, G>
>>>>> in the dataset, every type-A triple in G is of the form {<U>
>>>>> ?P ?O }.  The type-B triples are a little more complicated.)
>>>>> In this case, the dataset being true would imply the dataset
>>>>> being segmented in this complicated but useful way.
>>>>>
>>>>> It's *rather* tempting to just use triples for this, making
>>>>> graphState, graphStateWas, subjectOf, etc, be predicates.
>>>>> That way the semantics of datasets would be much simpler,
>>>>> with the complications bundled into the semantics of those
>>>>> particular predicates.
>>>>>
>>>>> I guess I'm suggesting extending the definition of dataset
>>>>> to be a default graph and, rather than a set of pairs <U,G>,
>>>>> a set of triples <U, R, G>, where R is optional.  If R is
>>>>> omitted, you have the kind of dataset we're used to now,
>>>>> where we have no idea what that relation is supposed to be
>>>>> (unless the author tells us humans).
>>>>>
>>>>>> Can one assert a dataset (ie claim it to be true)?
>>>>>
>>>>> Yes.
>>>>>
>>>>>> How does one do that?
>>>>>
>>>>> The same way you do with RDF.  It kind of depends on your
>>>>> application. Maybe you publish it on the web; maybe you send
>>>>> it to some agent; maybe you publish it and send the URL
>>>>> somewhere, etc.
>>>>>
>>>>> -- Sandro
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> -- Antoine Zimmermann ISCOD / LSTI - Institut Henri Fayol École
>>>> Nationale Supérieure des Mines de Saint-Étienne 158 cours
>>>> Fauriel 42023 Saint-Étienne Cedex 2 France Tél:+33(0)4 77 42 83
>>>> 36 Fax:+33(0)4 77 42 66 66 http://zimmer.aprilfoolsreview.com/
>>>>
>>>>
>>>
>>> ------------------------------------------------------------
>>> IHMC                     (850)434 8903 or (650)494 3973
>>> 40 South Alcaniz St.     (850)202 4416   office
>>> Pensacola                (850)202 4440   fax
>>> FL 32502                 (850)291 0667   mobile
>>> phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
>
>
>
>

-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 83 36
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Tuesday, 20 December 2011 13:32:51 UTC