Re: Draft for a "minimal dataset semantics"


Thanks for the comments.

Le 06/09/2012 13:04, Ivan Herman a écrit :
> Antoine,
> thank you.
> (I have made a tiny editorial change on the page adding a number to
> each issue, so that we can discuss it more easily.)


> executive summary from my side: this may be indeed consensus ready
> (modulo some details). I would be happy to get some sort of a
> resolution that the group refines the details of this but that is
> what will end up in the final spec.
> Some technical details/comments; to give more structure to the
> discussion, I have added my personal opinion on each of the issues
> you've explicitly added.
> - In section 3 (the Model-theoretical semantic) I guess the precise
> terminology requires us to say that V is the vocabulary of the
> dataset (come back to this later) plus whatever vocabulary the
> respective E entailment defines. Ie, if E is the OWL RDF based
> semantics, then there are quite a number of terms that are added to V
> before the definition of the interpretation function...

I was following RDF Semantics 2004 style.
Interpretations for a regime E are based on interpretations from a 
"weaker" regime E' and are formulated as follows:

"An E-interpretation wrt V is an E'-interpretation wrt V union { the 
vocab of E }."

So you don't need to mention what's specific to the entailment regime, 
it's automatically included. Also, the interpretations are always wrt a 
vocabulary V, not wrt an RDF graph.
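
As a rough illustration of that layering (a sketch only: the REGIMES table and the function name are made up for this mail, and the term lists are tiny illustrative subsets, not the full vocabularies from RDF Semantics):

```python
# Hypothetical table: each regime names the weaker regime it builds on and
# the terms it adds. The term lists are deliberately incomplete.
REGIMES = {
    "simple": {"base": None, "vocab": set()},
    "rdf": {"base": "simple", "vocab": {"rdf:type", "rdf:Property"}},
    "rdfs": {"base": "rdf", "vocab": {"rdfs:subClassOf", "rdfs:domain"}},
}


def interpretation_vocabulary(v, regime):
    """Vocabulary that a `regime`-interpretation wrt `v` ranges over."""
    result = set(v)
    # Walk down the chain of weaker regimes, adding each regime's own terms.
    while regime is not None:
        result |= REGIMES[regime]["vocab"]
        regime = REGIMES[regime]["base"]
    return result
```

The caller only ever supplies V; the terms specific to the regime come in through the chain, which is why the definition does not need to mention them.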

If we want to be consistent with RDF Semantics, the way dataset 
interpretations are defined should not be a function of a dataset.

In what I wrote, a dataset can be interpreted as "true" only if the 
interpretation of the default graph is wrt a vocabulary that contains 
the graph names:

"for an IRI n and RDF graph g, I(<n,g>) is true iff IGEXT(n) is defined 
and E-entails g;"

Note "[...] iff IGEXT(n) *is defined* [...]".
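
To make the role of "is defined" concrete, here is a minimal sketch. It assumes ground graphs and simple entailment only, so that entailment reduces to set inclusion; IGEXT is modelled as a plain dict from graph names to sets of triples. The function names and the ex: terms are invented for the example.

```python
def entails(g1, g2):
    # Assumption: ground graphs under simple entailment, where g1 entails g2
    # exactly when g2 is a subgraph of g1.
    return g2 <= g1


def named_graph_true(igext, n, g):
    """I(<n,g>) is true iff IGEXT(n) is defined and entails g."""
    return n in igext and entails(igext[n], g)


# IGEXT is partial: it is only defined for the name ex:g1 here.
igext = {"ex:g1": {("ex:a", "ex:p", "ex:b"), ("ex:b", "ex:p", "ex:c")}}

ok = named_graph_true(igext, "ex:g1", {("ex:a", "ex:p", "ex:b")})
undefined_name = named_graph_true(igext, "ex:g2", set())
```

With this reading, even a pair whose graph is empty is false when its name is outside the domain of IGEXT, which is the case `undefined_name` exercises.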

> As for what is V: if I have (G,<n1,G1>,...,<nk,Gk>) then we may have
> several choices:
> 1. V = G
> 2. V = G ∪ { n1, n2, ..., nk }
> 3. V = G ∪ { n1, n2, ..., nk } ∪ G1 ∪ ... ∪ Gk
> My instinct says that we should go for #2. Note that for the
> alternative you describe in Issue 6 on the domain of IGEXT to be
> valid, either #2 or #3 should be chosen.

Currently, the semantics requires 2 but not 3.
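
For comparison, the three candidate vocabularies can be computed as in the sketch below. It reads "G" in the list above as "the set of names occurring in G", which is my assumption about what was meant; blank nodes are written "_:x" and are not counted as names. The function names and the sample data are hypothetical.

```python
def graph_vocabulary(graph):
    """Names occurring in a graph; blank nodes (written "_:x") are skipped."""
    return {t for triple in graph for t in triple if not t.startswith("_:")}


def dataset_vocabulary(default_graph, named_graphs, option):
    """Candidate vocabulary V for a dataset, following options 1 to 3."""
    v = graph_vocabulary(default_graph)
    if option >= 2:
        v |= set(named_graphs)  # add the graph names n1, ..., nk
    if option >= 3:
        for g in named_graphs.values():
            v |= graph_vocabulary(g)  # add the names used inside each Gi
    return v


default = {("ex:a", "ex:p", "_:x")}
named = {"ex:g1": {("ex:c", "ex:q", "ex:d")}}
```

Option 2 is the smallest choice that puts every graph name in V, which is what the truth condition on IGEXT needs.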

> - Issue 1:
> Technically, this may be useful but it would probably make the
> semantics (though marginally) more complex. Actually, the alternative
> we also explored is where each named graph may have a different
> entailment regime attached. I am not sure we could get consensus on
> this, the complexity is a bit off putting.
> For the sake of simplicity and moving forward, we should probably go
> with the current approach, ie, one entailment regime to rule them
> all...

I agree.

> - Issue 2:
> I think this depends on Issue 1. If Issue 1 allows for different
> entailments between the graphs and the default graph, then the 'no
> entailment' makes sense to simplify the semantic formalism; it would
> make it indeed possible to have a semantics whereby some entailment
> is done, say, on the default graph, whereas the individual graphs are
> treated as some sort of a black boxes with no entailment at all.
> (Incidentally, this is what we meant as 'quoting semantics' in the
> document we put forward a few weeks ago.) But if Issue 1 is voted to
> keep simplicity, ie, one entailment for all, then Issue 2 is, in my
> view, moot.


> - Issue 3:
> We did discuss this and had a proposal (as you note in the text).
> However, I do not see any consensus coming on this in the group.
> Besides, no such syntax exists right now, at least in terms of the
> core RDF standards, for entailments in general, regardless of named
> graphs. So probably the answer should be 'no'.

I think such a possibility would solve a number of problems (in particular, 
being able to explicitly say that the graphs are "quoted") but I agree 
it's unlikely that a consensus will emerge in the remaining time.

> B.t.w., here again, this Issue really makes sense only if Issue 1 is
> voted for a more complex approach. If not, then current practice
> definitely dictates a 'no' answer.


> (That being said, we may define such a vocabulary in a W3C note if
> Issue 1 is voted as 'yes'. But that is beside the point for the
> current, rec track discussion.)
> - Issue 4:
> I think this is the same as Issue 6. See below for my vote

Yes. I repeat it because some people may overlook the model-theoretic 

> - Issue 5
> I am not sure what I/we meant by 'quoting semantics' is exactly the
> same as what you describe there and, honestly, I am not even sure I
> understand what you write:-( But see also my comment on Issue 2.

It's not exactly saying what it should be saying. I simply have trouble 
thinking of the act of "quoting" as something that removes the 
semantics. It does not work like that in human interaction. Quoting is 
meta knowledge about a statement. The statement, quoted or not, has its 
own meaning, but the quoting adds the information that the words are 
exactly those used by the issuer.

The "quote semantics" has very little to do with quoting. It's only 
saying "here is a set of triples, but be careful, it's not really an RDF 
graph, it just looks like it but bnodes are not existential variables, 
literals are just yet another way to identify a resource". It does not 
even have to be reproducing information from any source, therefore it's 
not quoting anything. It's just presenting a data structure which happens 
to have the same syntax as an RDF graph but does not mean what it is 
normally (and normatively) defined to mean.

So, I can call it "quote semantics" because I don't want to open another 
terminology war, but I think it's a very bad name for it. That's why I 
used "no-semantics" in the wiki.

> - Issue 6
> I think the distinction between what was called IRI-GEXT and RES-GEXT
> is fairly minor in practice. Richard had some good arguments in
> favour of RES-GEXT; let me add my aesthetic argument: formalizing
> IGEXT having the resources as a domain and not the URI-s (ie, going
> the RES-GEXT) is also in line with the way properties and classes are
> modeled in the current RDF semantics. My vote would therefore go to
> change the semantics in the way you describe in Issue 6 to ensure a
> more consistent view of the world.

I wrote the /minimal/ minimal dataset semantics :) RES-GEXT is derived 
from IRI-GEXT using a simple semantic constraint. For all constraints 
that we add, we'd better have (near) consensus.

I'm still not sure that the principles of this proposal already have 
consensus. In particular, I don't know Sandro's opinion about it. Was Pat 
in agreement with it?


> Thanks again!
> Ivan
> On Sep 5, 2012, at 16:56 , Antoine Zimmermann wrote:
>> Dear all,
>> Based on the recent discussions on dataset semantics, which seemed
>> to be rather fruitful, I made a first attempt to write down the
>> latest ideas, as David suggested me to do, in order to have a basis
>> for discussion in our telecon.
>> I've put a short informal introduction as well as the model-theoretic
>> I also recorded issues that may have to be solved and can affect
>> the semantics.
>> This draft, at the moment, does not refer to the use cases. It only
>> describes the semantics itself. It will be improved.
>> Best, -- Antoine Zimmermann ISCOD / LSTI - Institut Henri Fayol
>> École Nationale Supérieure des Mines de Saint-Étienne 158 cours
>> Fauriel 42023 Saint-Étienne Cedex 2 France Tél:+33(0)4 77 42 66 03
>> Fax:+33(0)4 77 42 66 66
> ---- Ivan Herman, W3C Semantic Web Activity Lead Home:
> mobile: +31-641044153 FOAF:

Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66

Received on Thursday, 6 September 2012 13:34:34 UTC