Re: Reconciliation of concerns, re islands and dataset semantics? from Antoine Zimmermann on 2012-03-01 (public-rdf-wg@w3.org from March 2012)

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Thu, 01 Mar 2012 12:42:39 +0100
To: Ivan Herman <ivan@w3.org>
CC: Pat Hayes <phayes@ihmc.us>, W3C RDF WG <public-rdf-wg@w3.org>
Message-ID: <4F4F60AF.8020101@emse.fr>
Ivan,


Glad to see you join this verbal joust ;)
I hope that your fresh eye can help us reach consensus.


On 01/03/2012 10:27, Ivan Herman wrote:
> Guys,
>
> I do not claim to have read and understood all the details of your
> discussion. I hope I have grasped the essential aspects of it,
> though. And I was wondering whether it was possible to reconcile
> Antoine's proposal (that does have some attraction for my engineering
> mind, I must admit) and Pat's objections. Let me try to give an
> alternative to Antoine's definition, and see if that works.
>
> An interpretation 'I' of a Graph is a function which abides by
> certain semantic conditions.

What do you mean exactly? Is I an RDF interpretation (over a vocabulary V)?


> If there is a graph 'G', it is therefore
> possible to talk about 'I|G', ie, the restriction of the function to
> 'G'.

Do you mean, a restriction of the function to the vocabulary of G? 
Otherwise, I don't understand what you mean by that.


> With that in mind:
>
> Let D = (G, (<n1>,G1), (<n2>,G2), ... , (<nk>,Gk))
>
> and let 'I' be a mapping, ie, an interpretation, from
> 'union(G,G1,G2,...,Gk)' such that 'I|G', 'I|G1', ... , 'I|Gk' are
> all RDF/RDFS/OWL Full etc. interpretations (ie, they abide by the
> semantic conditions locally).

I'm lost: does union(G,G1, ...) mean the union of the triples in all these 
graphs?

Do you mean that I is an RDF interpretation of the union of all the 
vocabularies used in G, G1, etc., and I|G is the restriction of I to the 
vocabulary of G? Remember that (as Pat recalled already) an RDF 
interpretation of a graph or a triple is fully determined by the 
interpretation of a vocabulary.

If such is your proposal, then it's viable in the sense that it can work 
as a logic, but I'm wondering how this would solve the 
endorsement/beliefs/provenance/temporal use cases.

>
> If
>
> E =  (H, (<m1>,H1), (<m2>,H2), ... , (<mk>,Hk))
>
> is another dataset, then we can say that 'D' entails 'E' if for all
> interpretations 'I' of 'D',
>
> 'I|H', 'I|H1', ... , 'I|Hk'
>
> are all interpretations, which seems to be the same as saying that
> 'I' is also an interpretation of 'E'.

I guess here you mean "model" rather than interpretation (i.e., an 
interpretation that satisfies the dataset). Said otherwise, entailment 
is defined "as usual":

D entails E iff all models of D are models of E.

(Pat just reminded me in his last email that I do not use the "usual" 
definition of entailment in the dataset proposal, but I'm gonna fix this).
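
To make the "usual" definition concrete, here is a minimal sketch in Python. It is only an illustration under strong assumptions: the only predicates understood are owl:sameAs and owl:differentFrom, a single interpretation is shared by all graphs of a dataset (as in Ivan's proposal), and interpretations are enumerated over a small finite domain, whereas real entailment quantifies over all interpretations. The function names and the default domain are hypothetical.

```python
from itertools import product

SAME, DIFF = "owl:sameAs", "owl:differentFrom"


def satisfies_graph(interp, graph):
    """Toy satisfaction: only owl:sameAs and owl:differentFrom are checked."""
    for s, p, o in graph:
        if p == SAME and interp[s] != interp[o]:
            return False
        if p == DIFF and interp[s] == interp[o]:
            return False
    return True


def is_model(interp, dataset):
    """A model satisfies the default graph and every named graph."""
    default, named = dataset
    return satisfies_graph(interp, default) and all(
        satisfies_graph(interp, g) for g in named.values()
    )


def vocabulary(dataset):
    """All subject and object names used anywhere in the dataset."""
    default, named = dataset
    graphs = [default, *named.values()]
    return {term for g in graphs for (s, _, o) in g for term in (s, o)}


def entails(d, e, domain=(0, 1, 2)):
    """D entails E iff every model of D is a model of E.

    Assumption: only interpretations over the given finite domain are tried.
    """
    names = sorted(vocabulary(d) | vocabulary(e))
    for values in product(domain, repeat=len(names)):
        interp = dict(zip(names, values))
        if is_model(interp, d) and not is_model(interp, e):
            return False  # counter-model found
    return True


D = (set(), {":x": {(":a", SAME, ":b"), (":b", SAME, ":c")}})
E = (set(), {":x": {(":a", SAME, ":c")}})
print(entails(D, E), entails(E, D))
```

With these two datasets, D entails E because identity is transitive, while E does not entail D, since :b may denote something else.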


> This means that there is a level of consistency shared by all graphs in
> a dataset, ie, if a resource 'R' is in 'G1' and 'G2', then we are
> sure that an interpretation maps it identically, because 'I' is
> defined as a mapping on the *union* of all graphs. But the semantic
> conditions, as well as the entailments, are restricted to the
> individual graphs.
>
> Can that work?

This certainly works as a logic, but the question is whether it 
addresses the use cases dealing with multiple graphs.
Take the case when I want to have the following information:

:x thinks that { :a owl:sameAs :b }
:y thinks that { :a owl:differentFrom :b }

which could be reformulated as: :x endorses the first graph and :y 
endorses the other; or the document at :x contains the first graph and 
the one at :y contains the second; or again, a SPARQL dataset contains 
the two graphs respectively "named" :x and :y.

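There is going to be a problem here: since a single I interprets :a and 
:b identically in both graphs, no interpretation can satisfy the two 
graphs at once, and the dataset is inconsistent. Yet the information 
that I want to encode is perfectly reasonable: I want to express the 
fact that :x and :y disagree, which is perfectly coherent information.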
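
The difficulty with the example above can be checked by brute force. The following is a rough sketch, not a faithful implementation of OWL semantics: it assumes that owl:sameAs means "same denotation" and owl:differentFrom means "different denotation", and it enumerates mappings of :a and :b into a two-element domain. The helper names are hypothetical.

```python
from itertools import product

SAME, DIFF = "owl:sameAs", "owl:differentFrom"


def satisfies(interp, graph):
    """Toy satisfaction: interp maps each name to a domain element."""
    for s, p, o in graph:
        if p == SAME and interp[s] != interp[o]:
            return False
        if p == DIFF and interp[s] == interp[o]:
            return False
    return True


def interpretations(names, domain):
    """Enumerate every mapping from names to elements of the domain."""
    for values in product(domain, repeat=len(names)):
        yield dict(zip(names, values))


g_x = {(":a", SAME, ":b")}   # what :x thinks
g_y = {(":a", DIFF, ":b")}   # what :y thinks
names, domain = [":a", ":b"], [0, 1]

# One interpretation shared by both graphs: it must satisfy both at once.
shared_models = [
    i for i in interpretations(names, domain)
    if satisfies(i, g_x) and satisfies(i, g_y)
]

# One interpretation per graph: each graph is satisfied independently.
per_graph_models = [
    (i, j)
    for i in interpretations(names, domain)
    for j in interpretations(names, domain)
    if satisfies(i, g_x) and satisfies(j, g_y)
]

print(len(shared_models), len(per_graph_models))
```

Under the shared interpretation there is no model at all, while letting each graph have its own interpretation leaves room for :x and :y to disagree.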


> This approach, if again it works, seems to be a much less radical
> change, both conceptually and practically, to the current RDF
> standards. Therein lies, for me, its attraction...

It depends. If, like me, you think that datasets should be a separate 
data structure, distinct from RDF (not a generalisation of RDF, just a 
new construction on top of RDF, just like RDF is a new construction on 
top of URIs), then the change is certainly as radical (or as little 
radical) as the change introduced by the dataset semantics.

To implement it, one just needs to take a reasoner and apply it 
separately to each RDF graph in the dataset. Really trivial. I'm sure 
some triple stores already do that.
If you want to programme the "default-is-universal-truth" extension, you 
just apply the reasoner separately to each union(G,Gi). Again, trivial.
If you want to programme the "default-as-merge" extension, you just 
compute the merge and apply reasoning separately. Trivial.
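
The three variants above could be sketched roughly as follows. This is an illustration under assumptions, not a reference implementation: `toy_closure` is a hypothetical stand-in for a real reasoner and only knows two RDFS-style rules; graphs are plain sets of triples without blank nodes, so that the merge coincides with set union; and "default-as-merge" is read here as "the default graph is reasoned over together with all named graphs".

```python
from typing import Callable, Dict, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]
Reasoner = Callable[[Graph], Graph]

TYPE, SUB = "rdf:type", "rdfs:subClassOf"


def toy_closure(graph: Graph) -> Graph:
    """Hypothetical reasoner: subclass transitivity and type propagation."""
    closed = set(graph)
    while True:
        new = set()
        for s, p, o in closed:
            for s2, p2, o2 in closed:
                if p2 == SUB and o == s2 and p in (SUB, TYPE):
                    new.add((s, p, o2))
        if new <= closed:          # fixed point reached
            return closed
        closed |= new


def reason_per_graph(default: Graph, named: Dict[str, Graph],
                     reasoner: Reasoner = toy_closure):
    """Apply the reasoner separately to each graph of the dataset."""
    return reasoner(default), {n: reasoner(g) for n, g in named.items()}


def reason_default_universal(default: Graph, named: Dict[str, Graph],
                             reasoner: Reasoner = toy_closure):
    """'default-is-universal-truth': reason over union(G, Gi) for each Gi."""
    return reasoner(default), {n: reasoner(default | g) for n, g in named.items()}


def reason_default_as_merge(default: Graph, named: Dict[str, Graph],
                            reasoner: Reasoner = toy_closure):
    """'default-as-merge': the default graph is merged with all named graphs."""
    merged = set(default).union(*named.values())
    return reasoner(merged), {n: reasoner(g) for n, g in named.items()}


default = {("ex:Cat", SUB, "ex:Animal")}
named = {"ex:g1": {("ex:tom", TYPE, "ex:Cat")}}
inferred = ("ex:tom", TYPE, "ex:Animal")

print(inferred in reason_per_graph(default, named)[1]["ex:g1"])
print(inferred in reason_default_universal(default, named)[1]["ex:g1"])
print(inferred in reason_default_as_merge(default, named)[0])
```

In this small dataset the triple saying that ex:tom is an ex:Animal is not derived inside ex:g1 when graphs are reasoned over in isolation; it appears in ex:g1 under the universal-truth variant, and in the default graph under the merge variant.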

Now, if you want to do temporal reasoning, provenance, trust, it's more 
complicated. But Pat's fierce rejection of the mere idea of a 
multi-interpretation semantics has diverted the discussion away from 
these issues.

What is not trivial, however, is how we will allow users to express 
which semantic extension they are using. I think that Sandro's design 
solutions are going in the right direction regarding this.

>
> Two more things.
>
> 1. A variant, that I touched upon in a mail to Antoine yesterday
> evening, is to say that 'I' must be interpretations of
> 'I|union(G,G1)', etc. Ie, 'G', the default graph, contains
> "universal" truths. As Antoine says, this may be a parametrization of
> the dataset itself, because this may be too much to ask in some
> cases. To be seen (for me, I must say, this universal truth approach
> sounds more natural)
>
> 2. I would note that neither this approach nor Antoine's, nor, in
> fact (I believe) Pat's quad approach touches upon the other
> discussion on named graphs that we are having, namely the exact
> relationships between '<ni>' and 'Gi'. In these semantic approaches
> the uri-s merely serve as classifiers. Whether we have some GET
> Semantics or SameAs semantics or whatever attached to '<ni>' and its
> corresponding 'Gi' is to be defined separately. Semantically, what
> this means is that the default graph, ie, 'G', may include some
> additional
>
> (ni, rdf:type, rdf:GETSemanticsClass)
>
> triples, but where the 'semantics' will not be expressed in terms of
> semantic conditions. Which I think is fine with me, just wanted to
> make it clear and it may be worth separating the two discussions in
> the group explicitly...

True, but this is difficult to put in the formal semantics as model 
theory is only interested in what is true from a given logical theory. 
It does not normally deal with behaviour, what an application should 
*do*. That is why owl:imports does not have a particularly constraining 
model theoretic semantics. The mechanism behind owl:imports is defined 
outside the semantic documents.

Similarly, we can very well define mechanisms that have no 
representation in terms of model theory.


> Now it is my turn to be torn apart by Pat:-)

Good luck ;)


AZ

>
> Ivan
>
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> FOAF: http://www.ivan-herman.net/foaf.rdf
>

-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 83 36
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Thursday, 1 March 2012 11:43:02 UTC