Re: attempts to reconciliate quote-semantics and "context"-semantics (Was: Re: RDF dataset semantics again) from Antoine Zimmermann on 2012-08-23 (public-rdf-wg@w3.org from August 2012)

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Thu, 23 Aug 2012 17:32:00 +0200
To: public-rdf-wg@w3.org
Message-ID: <50364CF0.3080301@emse.fr>
On 23/08/2012 17:02, Ivan Herman wrote:
>
> On Aug 23, 2012, at 16:20 , Antoine Zimmermann wrote:
>>
>> If it is not applied to the default graph, then what semantics
>> apply to it?
>>
>
> My instinct would say that, by default, Simple Entailment would apply
> to the Default Graph; that would be, I believe, the reflection of
> quoting.

I think this is reasonable.


> [snip]
>
>>
>>
>>> _Personally_, I like where this is going, although I am not sure
>>> it would cover all the use cases we had in the past. For example,
>>> one of the use cases that did come up is what we used to call
>>> 'merge semantics', meaning that all inferences and constraint
>>> checks are made on the merge of all the graphs (including the
>>> default graph). This is not covered by this structure but, maybe,
>>> this is not important.
>>
>> Yes, it does not cover the merge/union semantics. But they are
>> simple semantic extensions, for which we (or a future WG) can
>> provide vocabulary to specify them.
>>
>
> would that be a semantic extension of the scheme you propose, or
> something totally orthogonal to it? I do not see how it fits in your
> scheme; that one does not speak about the merging of the particular
> graphs...

I agree, and I don't know how we should handle it at the moment.

>>
>>> (I am also a little bit worried about the complexity for users.
>>> We shall see the feedback.)
>>
>> The formal semantics will certainly look quite complex, but it can
>> be explained in a rather simple way. In a nutshell, it says that:
>>
>> "All RDF graphs in an RDF dataset don't mean the same thing. To be
>> explicit about what they mean, we provide a vocabulary that specifies
>> the semantics of each graph. We call the semantics assigned to
>> a <name,graph> pair its entailment regime, because it determines
>> what entailments are valid for that pair. For example ..."
>>
>> Then we have to warn the people who would like to read the formal
>> details that they should be well armed.
>>
>
> :-) Sounds familiar...
>
>>
>>
>> Yes, that was the idea when I wrote it. The idea of having distinct
>> regimes for distinct <name,graph> pairs only made sense to me when
>> I drafted the new proposal yesterday.
>>
>> This is why I marked the following in the new proposal:
>>
>> "TODO: modify the semantics such that different entailment regimes
>> can be used for different named graphs."
>>
>>
>
> Oops, sorry, I missed that.

>
>
>>> [[[ Moreover, dataset interpretations are defined with respect to
>>> an entailment regime E, as defined in SPARQL 1.1 Entailment
>>> Regimes. Let KE be the set of all interpretations as defined in
>>> SPARQL 1.1 plus the No-Semantics. The interpretation of an RDF
>>> Dataset (G, (<n1>,Gn1), ..., (<nk>,Gnk)) over vocabulary V is a
>>> pair (I,Con) where I is an E-interpretation of G (the default
>>> graph), and Con is a mapping from V to KE.
>>>
>>> A dataset-interpretation (I,Con) of a vocabulary V wrt
>>> entailment regime E-satisfies an RDF Dataset (G, (<n1>,Gn1), ...,
>>> (<nk>,Gnk)) iff I E-satisfies G, and for all i in [1..k], Con(ni)
>>> exists and Con(ni)-satisfies Gni. ]]]
>>>
>>> (Note that last 'Con(ni)-satisfies Gni'.)
>>
>> Con(ni) assigns a term of the vocabulary to an E-interpretation. It
>> does not assign a term to an entailment regime. This is something I
>> have to think about, but I have an idea. It's not trivial, as far
>> as I can see.
>>
>
> I am not sure I understand. Why can't Con(ni) directly assign a term
> to an entailment regime? As far as I can see, that is the only place
> it is used, so we can define it as we wish...

Con(ni) tells us what the interpretation of the graph labelled ni is.
With the current formalisation, it must be an E-interpretation, with
E fixed globally for the dataset. But yes, maybe the solution is to have
Con map ni to a pair (E,I) where E is an entailment regime, and I is an
E-interpretation. Good idea, thanks Ivan!

Then, each <name,graph> pair has its own logic.
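
To make the idea of mapping each name to a pair (E,I) concrete, here is a
minimal Python sketch. It is only an illustration under strong assumptions:
an "interpretation" is reduced to the set of triples it makes true, and the
two regimes (SIMPLE and SUBCLASS) and all function names are hypothetical
toys, not the regimes of SPARQL 1.1 nor the formal definition on the wiki.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]
# Toy stand-in for an interpretation: the set of triples it makes true.
Interp = FrozenSet[Triple]


@dataclass(frozen=True)
class Regime:
    """A hypothetical entailment regime: a name plus a satisfaction test."""
    name: str
    satisfies: Callable[[Interp, Graph], bool]


def simple_satisfies(interp: Interp, graph: Graph) -> bool:
    # Toy "simple" regime: every triple of the graph is true in interp.
    return graph <= interp


def subclass_satisfies(interp: Interp, graph: Graph) -> bool:
    # Toy RDFS-like regime: as above, and interp must also be closed under
    # the rule (x type C), (C subClassOf D) => (x type D).
    if not graph <= interp:
        return False
    for (x, p, c) in interp:
        if p != "rdf:type":
            continue
        for (c2, q, d) in interp:
            if q == "rdfs:subClassOf" and c2 == c \
                    and (x, "rdf:type", d) not in interp:
                return False
    return True


SIMPLE = Regime("Simple", simple_satisfies)
SUBCLASS = Regime("ToySubclass", subclass_satisfies)


def satisfies_dataset(default: Tuple[Regime, Interp], default_graph: Graph,
                      named: Dict[str, Graph],
                      con: Dict[str, Tuple[Regime, Interp]]) -> bool:
    """Check the default graph, then each named graph under its own Con(ni)."""
    regime, interp = default
    if not regime.satisfies(interp, default_graph):
        return False
    for name, graph in named.items():
        if name not in con:       # Con(ni) must exist
            return False
        e_i, i_i = con[name]      # Con(ni) = (E, I)
        if not e_i.satisfies(i_i, graph):
            return False
    return True
```

The point of the sketch is only that the regime is looked up per name, so
the same graph can be satisfied under one regime and not under another.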

>>
>>> Terminologically and in line with the outline you had, I am
>>> actually o.k. to call that an 'E-interpretation of the dataset';
>>> after all, each entailment regime is parametrized, whether it
>>> applies to the default graph or not.
>>>
>>> 2. If I choose to use, say, OWL as an entailment regime for Gi,
>>> how do I specify *which* OWL Ontology should be used for that
>>> purpose? I know this is also a question one may ask about current
>>> graphs, too, but the situation becomes a bit more complex if we
>>> have many different graphs. Would we require some sort of a
>>> follow-your-nose on the terms used in Gi and allow the system to
>>> find the relevant ontologies? Or would we suppose that, at least
>>> conceptually, Gi has all the ontologies as part of the graph
>>> already, and we decide that this is not our problem?
>>
>> Entailment regimes are special semantic conditions that are hard
>> coded in consensual standards. We don't define an entailment regime
>> per-ontology.
>
> Right.
>
>> If you want the knowledge of the ontology to influence the
>> inference on a named graph, then you can put the ontology inside
>> the graph.
>
> Yes, that is one of the options I raised. And, conceptually, that can
> work, this is how the current SW works, after all: the ontology used
> for entailment of a particular graph is really 'paired' out of band.
>
> My only worry is that, in practice, if we have datasets with very
> different entailment regimes, this may become very complicated for a
> deployment, unless we provide some standard (though probably
> optional) ways to do it.

Yes, it's perhaps the reason why my initial proposal has a single 
entailment regime for the whole dataset. Nonetheless, SPARQL allows 
distinct entailment regimes to be implemented by the same endpoint. The 
SPARQL 1.1 Service Description vocabulary allows one to specify that a 
SPARQL endpoint uses different regimes for different named graphs. I 
find the work done in SPARQL really inspiring and I like when something 
in RDF aligns well with something from SPARQL.

With a per graph entailment regime declaration, we could dump the data 
of a SPARQL endpoint /together with/ the regimes the endpoint uses.
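
As a rough sketch of what such a dump could look like, the Python function
below writes TriG-like text in which each named graph is described with the
SPARQL 1.1 Service Description terms sd:namedGraph, sd:name and
sd:entailmentRegime, followed by the graph contents. The function name, the
input layout and the example IRIs are assumptions made for illustration;
this is not the vocabulary of the wiki proposal, and the exact TriG syntax
is not settled.

```python
SD = "http://www.w3.org/ns/sparql-service-description#"
ENT = "http://www.w3.org/ns/entailment/"


def dump_with_regimes(named_graphs):
    """Serialise graphs together with their entailment regimes.

    named_graphs maps a graph IRI to a pair (regime, triples), where regime
    is a local name under ENT (e.g. "RDFS") and triples is a list of
    already-serialised triple lines (hypothetical input format).
    """
    lines = [f"@prefix sd: <{SD}> .", f"@prefix ent: <{ENT}> .", ""]

    # One blank-node description per named graph, carrying its regime.
    descriptions = [
        f"[ sd:name <{iri}> ; sd:entailmentRegime ent:{regime} ]"
        for iri, (regime, _) in named_graphs.items()
    ]
    if descriptions:
        lines.append("[] a sd:Dataset ;")
        lines.append("   sd:namedGraph " + " ,\n      ".join(descriptions) + " .")
        lines.append("")

    # The graphs themselves, one block per name.
    for iri, (_, triples) in named_graphs.items():
        lines.append(f"<{iri}> {{")
        lines.extend("    " + t for t in triples)
        lines.append("}")
    return "\n".join(lines)


example = dump_with_regimes({
    "http://example.org/g1": ("RDFS", ["<http://example.org/a> a <http://example.org/C> ."]),
    "http://example.org/g2": ("Simple", []),
})
print(example)
```

A consumer that understands the sd: terms could then reload the dump and
apply to each graph the regime the endpoint was using.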


>> But I understand that it would be more convenient to be able to
>> refer to an external document (whether it is OWL or plain RDF).
>> Something like an "import". Truly useful but I doubt we should
>> take care of this in this WG. Let us leave some work to do for the
>> future generations :)
>>
>
> I would not close the door on that if this whole semantic line is
> going in a satisfactory direction.
>
>>
>>> For RIF, SPARQL/RIF introduced the rif:usedWithProfile. Maybe
>>> something similar, but also for OWL, should be added to those
>>> parameters that you defined.
>>
>> Maybe. Let us see first how people react to this proposal.
>>
>
> Sure. There may be huge semantic pitfalls that I certainly cannot
> foresee, I would never claim to have an expertise in the semantic
> side of things.
>
> One issue I see but I cannot properly formulate is the question
> whether, in an (ni,Gi) pair, we can really rely on 'ni' behaving
> like a URI which uniquely identifies a Graph. Isn't it possible
> that two different applications may reuse 'ni' to signify two
> different graphs? 'Cause if so, then the entailment statements
> (which are RDF statements somewhere in the wild) will,
> theoretically, lead to problems.

This problem isn't specific to datasets. Nothing in RDF prevents someone
from reusing a URI for a different thing than what it was originally
designed for.
But if we keep triples as the main data model for publishing data on the 
Web, and rely on datasets mainly for an exchange data model between data 
management systems, then the problem is somewhat weakened. Then it is 
the responsibility of the consumers to not consume datasets found in the 
wild.
For instance, I envision that semantic web browsers will not parse TriG
files. They may build a dataset in memory to keep RDF data from 
different sources separated, but this will happen in the background.


>
> But maybe this is one of those things we will have to live with.
>
>>
>>> 3. My other issue is a question. The No-semantics means that
>>> there are no semantic conditions on the graphs. However, looking
>>> at first table of semantic conditions in section 1.4 of the RDF
>>> Semantics document, although that section says 'denotation of
>>> ground graphs', the first four conditions are not dependent at all
>>> on blank nodes, they just define a bit what a graph structure is.
>>
>> Seems possible too. I guess it does not matter that much, as the
>> no-semantics has only one trivial entailment: all graphs entail
>> themselves and only themselves.
>>
>
> As I said, it is not very important, but the role of these semantic
> conditions, at least for me, is also to restrict what can be a model
> at all and, through that, describe some sort of constraints on how
> the data can be organized if it is truly RDF.

Ok. I'll work on it.
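
For reference, the trivial entailment of the no-semantics mentioned above
can be contrasted with simple entailment in a few lines of Python. This is
a sketch restricted to ground graphs (no blank nodes), where simple
entailment reduces to the subgraph relation; the function names are
illustrative only.

```python
def entails_no_semantics(g, h):
    # No-semantics: a graph entails itself and only itself.
    return frozenset(g) == frozenset(h)


def entails_simple_ground(g, h):
    # Simple entailment between ground graphs: h must be a subgraph of g.
    return frozenset(h) <= frozenset(g)


g = {("ex:a", "ex:p", "ex:b"), ("ex:b", "ex:p", "ex:c")}
h = {("ex:a", "ex:p", "ex:b")}

print(entails_simple_ground(g, h))   # True: h is a subgraph of g
print(entails_no_semantics(g, h))    # False: h is not g itself
print(entails_no_semantics(g, g))    # True
```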

>
> Cheers
>
> Ivan
>
>
>>> I believe that can be safely added to the No-semantics as a
>>> semantic condition. I *think* I understand why the other two are
>>> not added as conditions: we do not want to get into the blank
>>> node store at this point.
>>>
>>> Whether we expand the no-semantics with those or not is not
>>> *really* important, though; I just wanted to understand why those
>>> were left out.
>>>
>>> Thanks
>>>
>>> Ivan
>>>
>>>
>>> [1]
>>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Dataset-semantics-2.0
>>>
>>>
>>> [2] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Dataset-semantics
>>>
>>>
>>> On Aug 22, 2012, at 16:57 , Antoine Zimmermann wrote:
>>>
>>>> So, I made a new wiki page in an attempt to reconcile the
>>>> different semantics.
>>>>
>>>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Dataset-semantics-2.0
>>>>
>>>>
>>>>
>>>> Basically, I define a super-weak semantics of RDF graphs, which can
>>>> be used as an underlying entailment regime for the dataset
>>>> semantics of [1].
>>>>
>>>> Then, I put an example of a possible vocabulary to allow more
>>>> expressiveness. The vocabulary is mirroring some of the terms
>>>> of SPARQL 1.1 service descriptions [2].
>>>>
>>>> This truly makes the "base" semantics very very weak but allows
>>>> one to extend it to any variant on top of it.
>>>>
>>>>
>>>> [1]
>>>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Dataset-semantics
>>>> [2]
>>>> http://www.w3.org/TR/2009/WD-sparql11-service-description-20091022/
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> AZ
>>>>
>>>> On 22/08/2012 16:33, Ivan Herman wrote:
>>>>>
>>>>> On Aug 22, 2012, at 15:54 , Pat Hayes wrote:
>>>>>
>>>>>>
>>>>>> On Aug 22, 2012, at 2:04 AM, Ivan Herman wrote:
>>>>>>
>>>>>>>
>>>>>>> On Aug 21, 2012, at 21:48 , Pat Hayes wrote: [snip]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Antoine, I have the impression that we are actually
>>>>>>>>> in agreement. The document we have put forward has
>>>>>>>>> two essential points:
>>>>>>>>>
>>>>>>>>> - we would have a default semantics in the form of
>>>>>>>>> the quoting semantics
>>>>>>>>
>>>>>>>>
>>>>>>>> Whoa. I do not know what y'all mean by a "default
>>>>>>>> semantics". Is this a default that can be overridden?
>>>>>>>> If so, I know of NO semantic theory  anywhere in logic
>>>>>>>> or linguistics that can provide this. If y'all want
>>>>>>>> this, you are on your own.
>>>>>>>>
>>>>>>>> If not, what exactly is it supposed to mean?
>>>>>>>>
>>>>>>>> Pat
>>>>>>>
>>>>>>>
>>>>>>> What I meant is: this is the semantics that is
>>>>>>> standardized to be valid in the absence of any other
>>>>>>> indication. I did not say anything else.
>>>>>>
>>>>>> And Antoine agrees. OK, then a better term would be "weak"
>>>>>> or "minimal" semantics. "default" sounds like
>>>>>> nonmonotonicity (being overridable) to me.
>>>>>
>>>>> Agreed, sorry for my sloppiness. 'Minimal' sounds indeed good
>>>>> to me.
>>>>>
>>>>> Ivan
>>>>>
>>>>>
>>>>>>
>>>>>> Pat
>>>>>>
>>>>>>>
>>>>>>> Ivan
>>>>>>>
>>>>>>>
>>>>>>> ---- Ivan Herman, W3C Semantic Web Activity Lead Home:
>>>>>>> http://www.w3.org/People/Ivan/ mobile: +31-641044153
>>>>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------
>>>>>>
>>>>>>
>>>>>> IHMC (850)434 8903 or (650)494 3973 40 South Alcaniz St.
>>>>>> (850)202 4416   office Pensacola (850)202 4440   fax FL
>>>>>> 32502 (850)291 0667   mobile phayesAT-SIGNihmc.us
>>>>>> http://www.ihmc.us/users/phayes
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>> -- Antoine Zimmermann ISCOD / LSTI - Institut Henri Fayol
>>>> École Nationale Supérieure des Mines de Saint-Étienne 158 cours
>>>> Fauriel 42023 Saint-Étienne Cedex 2 France Tél:+33(0)4 77 42 66
>>>> 03 Fax:+33(0)4 77 42 66 66 http://zimmer.aprilfoolsreview.com/
>>>>
>>>
>>>
>>
>
>

-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Thursday, 23 August 2012 15:32:32 UTC