Re: dataset semantics

From: Ivan Herman <ivan@w3.org>
Date: Mon, 19 Dec 2011 11:55:06 +0100
Cc: public-rdf-wg@w3.org
Message-Id: <162DB70F-AA33-4422-9DAA-B02C24B52475@w3.org>
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
If we follow what I proposed a few days ago, i.e., separating the concepts of datasets and named graphs, then, at least in my view, there is no need for any semantics on datasets whatsoever. A dataset is a collection of labelled graphs. That is it.

The discussions on gettable URIs etc. give a line of semantics for named graphs which seems fine to me _if_ the application wants to go beyond simple labeling. Pat will probably say I am wrong :-) but I am not sure it is worth going beyond that level of semantics (i.e., defining requirements on how the label behaves over HTTP GET). That is, I am not even sure the RDF Semantics document would be affected (except if we want to add some additional properties around named graphs, but that is an addition to the current semantics, not a change).
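
To make the "collection of labelled graphs" reading concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the type aliases, the `make_dataset` helper, and the `ex:` names are hypothetical and come from neither the proposal nor any RDF library.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical type aliases; RDF terms are plain strings for brevity.
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def make_dataset(default: Graph, labelled: Dict[str, Graph]) -> Tuple[Graph, Dict[str, Graph]]:
    """A dataset here is just a default graph plus a label -> graph mapping."""
    return default, dict(labelled)


g = frozenset({("ex:a", "ex:b", "ex:c")})
default_graph, labelled_graphs = make_dataset(frozenset(), {"ex:g1": g})

# Looking up a label returns a graph and nothing more: no claim is made
# about what the label denotes or what an HTTP GET on it would return.
print(labelled_graphs["ex:g1"] == g)
```

Any further meaning (for instance, how the label behaves over HTTP GET) would be layered on top by the application and is not part of the structure itself.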


On Dec 19, 2011, at 11:06 , Antoine Zimmermann wrote:

> Just wanted to reiterate, there is a dataset semantics at [1] which has
> been there since about March 2011. In spite of the math symbols all over
> the place, it's really simple. The rationale was to make it according to the least common denominator, such that it does not put constraints that some people would like to relax later on. Adding constraints can be done easily on a conformant implementation, while removing constraints makes the implementation non-compliant.
> Note that this semantics does not change the semantics of RDF, as it is separated from it, though relying on it.
> [1] TF-Graphs/RDF-Datasets-Proposal, Section "Semantics". http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal#Semantics.
> On 17/12/2011 06:43, Sandro Hawke wrote:
>> On Fri, 2011-12-16 at 22:47 -0600, Pat Hayes wrote:
>>> On Dec 16, 2011, at 10:21 PM, Sandro Hawke wrote:
>>>> ... maybe I can figure out some TriG entailment tests....
>>>> Like, does this TriG document / dataset:
>>>> {<a>  <b>  <c>  }
>>>> entail this RDF graph:
>>>> <a>  <b>  <c>.
>>>> I think it should, so we can have metadata in TriG, but other
>>>> people have disagreed.   How should we gather test cases like
>>>> this?
>>> FWIW, 'entailment' has a fairly precise meaning. A entails B when B
>>> is true whenever A is, or more precisely if, for every possible
>>> interpretation I, if A is true in I then B is true in I. So it only
>>> makes sense to speak of entailment when there is some notion of
>>> truth-in-an-interpretation to base it on.
>> Yes, I know.
>>> So, what are the truth conditions for datasets?
>> We haven't quite figured that out yet.   I'm proposing one part of
>> that is that a dataset being true implies its default graph is true.
>> The other part of the truth conditions has to do with the
>> relationship between the things named by the label URIs and the
>> graphs they label.
>> Unfortunately, I think we need to allow for several possible
>> relationships there, MAYBE even in the same dataset, which makes
>> things rather complicated.
>> One example of the relationship is what I called graphState in a
>> different thread.  In that case, the dataset being true would imply
>> that for each <U,G> in the dataset, the state of the resource U is
>> the graph G.   (Here, I mean "state" and "resource" in exactly the
>> REST sense.)
>> Another example is an out of date version of graphState, maybe call
>> it graphStateWas.  In this case, the dataset being true would imply
>> that for each <U,G> in the dataset, the state of the resource U is,
>> or used to be, graph G.
>> Another example of the relationship is something I gather Cambridge
>> Semantics uses, which I'll call subjectOf.   (In one of their
>> deployment modes, triples are divided into two types, which I'll call
>> A and B, based on which predicate they use.  The dataset is
>> constructed such that for each <U,G> in the dataset, every type-A
>> triple in G is of the form {<U>  ?P ?O }.  The type-B triples are a
>> little more complicated.)  In this case, the dataset being true would
>> imply the dataset being segmented in this complicated but useful
>> way.
>> It's *rather* tempting to just use triples for this, making
>> graphState, graphStateWas, subjectOf, etc, be predicates.   That way
>> the semantics of datasets would be much simpler, with the
>> complications bundled into the semantics of those particular
>> predicates.
>> I guess I'm suggesting extending the definition of dataset to be a
>> default graph plus, rather than a set of pairs <U,G>, a set of
>> triples <U, R, G>, where R is optional.  If R is omitted, you have
>> the kind of dataset we're used to now, where we have no idea what
>> that relation is supposed to be (unless the author tells us humans).
>>> Can one assert a dataset (ie claim it to be true)?
>> Yes.
>>> How does one do that?
>> The same way you do with RDF.  It kind of depends on your
>> application. Maybe you publish it on the web; maybe you send it to
>> some agent; maybe you publish it and send the URL somewhere, etc.
>> -- Sandro
> -- 
> Antoine Zimmermann
> ISCOD / LSTI - Institut Henri Fayol
> École Nationale Supérieure des Mines de Saint-Étienne
> 158 cours Fauriel
> 42023 Saint-Étienne Cedex 2
> France
> Tél:+33(0)4 77 42 83 36
> Fax:+33(0)4 77 42 66 66
> http://zimmer.aprilfoolsreview.com/
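
For reference, the extended dataset definition suggested in the quoted thread (a default graph plus a set of triples <U, R, G> with R optional) can be sketched as follows. This is only a sketch under assumptions: the class and function names are hypothetical, terms are plain strings, and the entailment check covers only ground graphs (no blank nodes), where simple entailment reduces to a subset test. It also takes as given the proposed rule that a dataset being true implies its default graph is true, which the thread has not settled.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# Hypothetical type aliases; RDF terms are plain strings for brevity.
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


@dataclass(frozen=True)
class LabelledGraph:
    label: str                      # U
    graph: Graph                    # G
    relation: Optional[str] = None  # R; None means the relation is unspecified


@dataclass(frozen=True)
class Dataset:
    default_graph: Graph
    named: FrozenSet[LabelledGraph] = frozenset()


def dataset_entails_ground_graph(ds: Dataset, g: Graph) -> bool:
    """Assumes 'dataset true implies default graph true' and ground graphs only."""
    return g <= ds.default_graph


abc = frozenset({("a", "b", "c")})

# The TriG test case from the thread: the triple sits in the default graph.
in_default = Dataset(default_graph=abc)

# The same triple only inside a labelled graph, with a hypothetical relation.
in_named = Dataset(
    default_graph=frozenset(),
    named=frozenset({LabelledGraph("u", abc, relation="graphState")}),
)

print(dataset_entails_ground_graph(in_default, abc))
print(dataset_entails_ground_graph(in_named, abc))
```

In this sketch the second dataset does not entail the graph, because the default-graph rule alone says nothing about labelled graphs; whether it should would depend on the semantics attached to the relation R.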

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Monday, 19 December 2011 10:55:26 UTC
