Re: Minimal dataset semantics

On 27/08/2012 11:34, Richard Cyganiak wrote:
> Antoine,
> On 24 Aug 2012, at 17:15, Antoine Zimmermann wrote:
>> However, we could as well make the graph extension be a function
>> from IRIs to RDF Graphs (let us call it IRI-GEXT) instead of a
>> function from resources to RDF Graphs (let us call it RES-GEXT).
>> With RES-GEXT, the following:
>> <ex:bob>  owl:sameAs  <eg:bob> . <ex:bob> { <ex:bob> <name> "Robert
>> Doe" . <x>  owl:sameAs  <y> } <eg:bob> { <eg:bob> <name> "Robert
>> Doe" . <x>  owl:differentFrom  <y> }
>> is inconsistent. The problem is, perhaps ex:bob and eg:bob really
>> refer to the same thing, and the dataset is using the scheme "graph
>> IRI denotes primary topic" which we agreed is an acceptable scheme.
>> Yet, here, the semantics with RES-GEXT does not allow to separate
>> the inferences of the two graphs. This problem does not occur with
> I see this as a fairly minor distinction.
> Yes, if you use the “graph IRI denotes primary topic” convention and
> put “owl:sameAs” triples into the default graph, then the two graphs
> will be semantically merged and you cannot keep their inferences
> apart. I can live with that.

I think I can live with that too. If that is the semantics, I can
work around these situations by being careful about the IRIs I use to
"name" graphs. But implementers should be aware of this situation.
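
A minimal sketch of the situation discussed above, under simplifying
assumptions that are not part of any specification: IRIs and triples are
plain strings and tuples, owl:sameAs in the default graph is modelled by
a small union-find, and "inconsistency" is reduced to the same pair
being asserted both owl:sameAs and owl:differentFrom. All function names
are hypothetical.

```python
# Toy model (assumption): owl:sameAs pairs from the default graph force
# two IRIs to denote the same resource; we pick one IRI as representative.
def denotation(same_as_pairs):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        parent[find(a)] = find(b)
    return find


named_graphs = {
    "ex:bob": {("ex:bob", "name", "Robert Doe"), ("x", "owl:sameAs", "y")},
    "eg:bob": {("eg:bob", "name", "Robert Doe"), ("x", "owl:differentFrom", "y")},
}
default_same_as = [("ex:bob", "eg:bob")]


def constraints_iri_gext(graphs):
    # IRI-GEXT: each graph IRI has its own extension, whatever the IRI denotes.
    return {iri: set(g) for iri, g in graphs.items()}


def constraints_res_gext(graphs, same_as_pairs):
    # RES-GEXT: the extension belongs to the denoted resource, so IRIs that
    # co-denote must share one extension that entails all of their graphs.
    denote = denotation(same_as_pairs)
    merged = {}
    for iri, g in graphs.items():
        merged.setdefault(denote(iri), set()).update(g)
    return merged


def clashes(triples):
    # Simplified inconsistency check (assumption): identical pair asserted
    # both owl:sameAs and owl:differentFrom.
    same = {(s, o) for s, p, o in triples if p == "owl:sameAs"}
    diff = {(s, o) for s, p, o in triples if p == "owl:differentFrom"}
    return bool(same & diff)
```

With IRI-GEXT the two graphs stay separate and neither clashes on its
own; with RES-GEXT they collapse into a single set of constraints that
contains both the owl:sameAs and the owl:differentFrom triple.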

> I prefer RES-GEXT because it is clearer about what the graph IRIs
> denote.

I don't think so. The graph IRI can denote anything; nothing is said
about it in either RES-GEXT or IRI-GEXT. The difference is that
RES-GEXT says something about how the thing denoted by the IRI relates
to the graph. In IRI-GEXT, they simply have no particular relationship.

> IRI-GEXT doesn't tell us anything about what the graph IRIs
> denote. RES-GEXT says that the graph IRIs denote some entity with a
> graph extension.

Which does not say anything about the entity.

> This would make it more natural to say (in semantic
> extensions, not in core RDF Semantics) things like: “The graph
> extension of any RDF document published on the web is the RDF graph
> encoded in the document.” Or: “An RDF graph is itself a resource, and
> it has itself as its graph extension.”

"The IRI-GEXT graph extension of an IRI that dereferences with a 200 OK 
must be the RDF graph obtained by parsing the Web resource found there."

> These things *can* be said using IRI-GEXT too, but require one more
> step of indirection (at least with the mental model I currently have
> in my head.)

I'm not convinced of this. In the previous example, you have to know
that an IRI denotes an RDF document. How do you do that? Formally, with
the RDF semantics (and even with the OWL semantics) you have no way to
find out. You have to use httpRange-14's magic to figure it out, which
is not part of RDF. Moreover, given an IRI that does not give you a 200
OK, it's possible that with extra information (e.g., OWL reasoning), you
conclude that it's the same as an RDF document. With IRI-GEXT, the
relation between the IRI and the RDF Graph can simply be constrained to
be the same direct relationship as between an IRI and what it
dereferences to. It seems to me that RES-GEXT introduces one more step
of indirection in this particular case.

> (This discussion is a nice example for the subtle differences that
> arise from the choice of formalisation, and why saying nothing about
> dataset semantics might end up being a bad idea: Different users of
> datasets will make their own and often incompatible assumptions.)

Yes and no. I am myself convinced that we need to formalise at least a
minimal semantics. But, playing devil's advocate, many things do not
have a /formal/ semantics and still work. Consider the DC vocabulary.
Formally, most of the terms are just properties with a label and textual 
description. Their semantics is as little defined (formally) as 
<>. People exchange documents using DC properties 
where the creator and the consumer have different assumptions about the 
terms. Yet, it mostly works pretty well.

>> here is a possible formalisation of this. Let us assume an
>> entailment regime E. A dataset-interpretation is an
>> E-interpretation I plus a function GEXT from the set of IRIs to the
>> set of RDF Graphs. For a dataset D = (DG,(n1,G1),...,(nk,Gk)), -
>> for an IRI n and a graph G, I(n,G) is true iff there exists a graph
>> G' such that (n,G') is in GEXT and G' E-entails G;
> I guess for RES-GEXT, this line would be changed to: ...such that
> (I(n),G') is in GEXT... ?


>> - if I(G) is false, then I(D) is false;
> I think you mean: if I(DG) is false...?

Yes, typo.

>> - if I(ni,Gi) is false for some i, then I(D) is false; - I(D) is
>> true otherwise.
>> Nice, concise, and achieve essentially the same as my initial
>> dataset semantics.
> +1
> Best, Richard
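
The truth conditions quoted above can be sketched as follows. This is
only an illustration under stated assumptions: graphs are ground sets of
triples, E-entailment is replaced by a stand-in (subset inclusion, which
coincides with simple entailment for ground graphs), the truth of I(DG)
is passed in as a boolean, and GEXT is a dictionary. The res_gext flag
switches to the RES-GEXT variant where the lookup key is I(n) rather
than n. All names are hypothetical.

```python
def entails(g_prime, g):
    # Stand-in for E-entailment (assumption): subset inclusion on ground graphs.
    return g <= g_prime


def satisfies_named(gext, denote, n, g, res_gext=False):
    # I(n,G) is true iff GEXT maps n (IRI-GEXT) or I(n) (RES-GEXT)
    # to some graph G' such that G' entails G.
    key = denote(n) if res_gext else n
    g_prime = gext.get(key)
    return g_prime is not None and entails(g_prime, g)


def satisfies_dataset(default_ok, gext, denote, named, res_gext=False):
    # I(D) is false if I(DG) is false or if some I(ni,Gi) is false;
    # it is true otherwise.
    if not default_ok:
        return False
    return all(satisfies_named(gext, denote, n, g, res_gext) for n, g in named)
```

For example, with gext = {"n1": {("a", "p", "b"), ("a", "q", "c")}} and
the identity function as denote, a dataset whose only named graph is
("n1", {("a", "p", "b")}) is satisfied, while one that names "n2"
instead is not.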


Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
Tél:+33(0)4 77 42 83 36
Fax:+33(0)4 77 42 66 66

Received on Monday, 27 August 2012 18:14:03 UTC