Re: Semantics for stateful resources from Richard Cyganiak on 2012-05-24 (public-rdf-wg@w3.org from May 2012)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Thu, 24 May 2012 01:45:36 +0100
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: Pat Hayes <phayes@ihmc.us>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <77B96159-B8F9-4AE7-BD5F-47B1FF68EFFF@cyganiak.de>

Hi Antoine,

On 24 May 2012, at 00:36, Antoine Zimmermann wrote:
> Quick reply:
> 
> if I have the following TriG file:
> 
> 
> { #default graph
>  :Employee  rdfs:subClassOf  :Person .
>  :Unemployed  rdfs:subClassOf  :Person .
> }
> :year2008 {
> :Joe  :worksFor  :AcmeCorp .
> :worksFor  rdfs:domain  :Employee .
> }
> :year2009 {
> :Joe  a  :Unemployed .
> :Unemployed  owl:disjointWith  :Employee .
> }
> 
> 
> 1. Would this be entailed:
> 
> :year2008 { :Joe  a  :Employee }
> 
> yes/no?

No.

But under the proposal, the graph names denote “stateful resources”, which have “state extensions”, which are sets of interpretations.

And (assuming RDFS-entailment) all the interpretations in :year2008 satisfy the triple { :Joe a :Employee }. This isn't quite the same as entailing the state pair you ask about, but it comes close.
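
A minimal sketch of that distinction, assuming triples are modelled as plain Python tuples and only the rdfs:domain rule is applied (the names `rdfs_domain_closure`, `dataset`, and `JOE_EMPLOYEE` are invented for illustration and are not part of any proposal text):

```python
# Triples are (subject, predicate, object) tuples; this is an illustration only.
YEAR2008 = frozenset({
    (":Joe", ":worksFor", ":AcmeCorp"),
    (":worksFor", "rdfs:domain", ":Employee"),
})
JOE_EMPLOYEE = (":Joe", "rdf:type", ":Employee")

def rdfs_domain_closure(graph):
    """Apply only the rdfs:domain rule until nothing new is derived."""
    closed = set(graph)
    changed = True
    while changed:
        changed = False
        domains = {(s, o) for (s, p, o) in closed if p == "rdfs:domain"}
        for (s, p, o) in list(closed):
            for (prop, cls) in domains:
                derived = (s, "rdf:type", cls)
                if p == prop and derived not in closed:
                    closed.add(derived)
                    changed = True
    return closed

# The dataset records exactly the graph that was asserted for :year2008.
dataset = {":year2008": YEAR2008}

# The state pair asked about pairs :year2008 with a *different* graph,
# so it is not one of the pairs in the dataset.
state_pair_present = dataset[":year2008"] == frozenset({JOE_EMPLOYEE})

# The graph paired with :year2008 does RDFS-entail the triple, though.
graph_entails_triple = JOE_EMPLOYEE in rdfs_domain_closure(dataset[":year2008"])
```

Here `state_pair_present` is False while `graph_entails_triple` is True, which is the "not quite the same, but close" situation described above.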

> 2. and this:
> 
> :year2008 { :Joe  a  :Person }

No. Some, but not all, interpretations in :year2008 satisfy the triple { :Joe a :Person }. So we cannot rule it out, but can't confirm it either.

The presence of additional triples in the default graph, like the subclass triples here, doesn't affect the state extension of a stateful resource.
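
As a rough sketch of why the answer to 2. is "no", assuming a tiny RDFS fragment (rdfs:domain and rdfs:subClassOf only) over tuple triples, with the hypothetical helper name `rdfs_closure`:

```python
DEFAULT = {
    (":Employee", "rdfs:subClassOf", ":Person"),
    (":Unemployed", "rdfs:subClassOf", ":Person"),
}
YEAR2008 = {
    (":Joe", ":worksFor", ":AcmeCorp"),
    (":worksFor", "rdfs:domain", ":Employee"),
}
JOE_PERSON = (":Joe", "rdf:type", ":Person")

def rdfs_closure(graph):
    """Close a graph under the rdfs:domain and rdfs:subClassOf instance rules."""
    closed = set(graph)
    while True:
        new = set()
        for (s, p, o) in closed:
            for (s2, p2, o2) in closed:
                # rdfs:domain: (s p o), (p rdfs:domain C) gives (s rdf:type C)
                if p2 == "rdfs:domain" and p == s2:
                    new.add((s, "rdf:type", o2))
                # rdfs:subClassOf: (s rdf:type C), (C rdfs:subClassOf D) gives (s rdf:type D)
                if p2 == "rdfs:subClassOf" and p == "rdf:type" and o == s2:
                    new.add((s, "rdf:type", o2))
        if new <= closed:
            return closed
        closed |= new

# The :year2008 graph on its own says nothing about :Person.
entailed_alone = JOE_PERSON in rdfs_closure(YEAR2008)

# Only if the default graph were merged in would the triple follow.
entailed_if_merged = JOE_PERSON in rdfs_closure(YEAR2008 | DEFAULT)
```

`entailed_alone` is False and `entailed_if_merged` is True: the triple follows only if the default graph is allowed to leak into the named graph, which the proposal does not do.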

> 3. Would it be inconsistent?
> 
> yes/no?

No. But (assuming OWL-entailment) there's no interpretation in :year2008 that satisfies the state of :year2009, and vice versa.
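
A simplified sketch of this, not a real OWL reasoner: it only derives types via rdfs:domain (assuming at most one domain per property) and looks for an individual typed with two classes declared owl:disjointWith. The helper names `types_of` and `has_clash` are made up for the example.

```python
YEAR2008 = {
    (":Joe", ":worksFor", ":AcmeCorp"),
    (":worksFor", "rdfs:domain", ":Employee"),
}
YEAR2009 = {
    (":Joe", "rdf:type", ":Unemployed"),
    (":Unemployed", "owl:disjointWith", ":Employee"),
}

def types_of(graph):
    """Asserted rdf:type pairs plus those derived via rdfs:domain."""
    domains = {s: o for (s, p, o) in graph if p == "rdfs:domain"}
    typed = {(s, o) for (s, p, o) in graph if p == "rdf:type"}
    typed |= {(s, domains[p]) for (s, p, o) in graph if p in domains}
    return typed

def has_clash(graph):
    """True if some individual belongs to two classes declared disjoint."""
    typed = types_of(graph)
    return any(
        (x, a) in typed and (x, b) in typed
        for (a, p, b) in graph if p == "owl:disjointWith"
        for (x, _) in typed
    )

# Each graph is fine on its own, so the dataset is not inconsistent.
clash_2008 = has_clash(YEAR2008)
clash_2009 = has_clash(YEAR2009)

# Nothing can satisfy both graphs at once.
clash_merged = has_clash(YEAR2008 | YEAR2009)
```

`clash_2008` and `clash_2009` are both False, while `clash_merged` is True. Keeping the graphs separate is what lets the dataset describe a change over time without becoming inconsistent.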

> Considering that "graph changes over time" is a *PRIORITY A* use case, and this in fact applies to all sorts of dimensions of context (including provenance, also in the high priorities---where are inferences coming from?), if these inferences are non-entailments, there will be extremely important use cases not addressed by the design.

Why?

Case 1) has nothing to do with change over time. It's about whether we want to record what someone *said*, or what we assume they *meant* under our entailment regime. And I'd argue that keeping track of provenance requires that we know *exactly* what someone said, and not what we inferred from what they said.

Case 2) has nothing to do with change over time either. It's about whether every named graph entails the default graph, if I understand you correctly. And I think it shouldn't, because adding a graph as a named graph to a dataset shouldn't change the meaning of that graph.

For case 3), are you saying that it *should* be inconsistent? I think that dealing well with changes over time requires that it is *not* inconsistent.

Is this the latest version of your proposal?
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal#Semantics

It seems to me that one inconsistent graph in this proposal makes the entire dataset inconsistent. Is this correct? Wouldn't this go against the use case of web crawling, where we have to assume the presence of broken data?

If I wanted a semantic extension that enforces web semantics (that is, the IRI-graph-pair <i,G> causes an inconsistency if dereferencing i doesn't yield G), how would I express this extension in your semantics? 

> Therefore, either there must be a semantics where you can make inferences inside a "named" graph and put back the inferred statement inside the graph (for instance, the semantics I proposed, but it's not the only way) or we'd better not normalise any semantics as pfps suggests to do. It is also possible to have 2 semantics, it's been done in OWL.

What I tried to address in the semantics is Pat's request that we must be explicit about what the IRIs in the IRI-graph-pairs denote. That's all the proposal below answers, really: they denote (indirectly, via an extension) the set of all interpretations of the graph.
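
As a rough illustration of that indirection, following the definition of a DS-interpretation in the quoted message below (a sketch only; the class and attribute names `DSInterpretation`, `denotes`, `isrel`, and `state_pair_true` are invented here):

```python
from dataclasses import dataclass, field

@dataclass
class DSInterpretation:
    # I: maps IRIs to the resources they denote.
    denotes: dict
    # ISREL: the state relationship, a set of (resource, graph) pairs,
    # where each graph is a frozenset of triples.
    isrel: set = field(default_factory=set)

    def state_pair_true(self, iri, graph):
        """A state pair <i,G> is true iff <I(i),G> is in ISREL."""
        return (self.denotes.get(iri), frozenset(graph)) in self.isrel

G2008 = frozenset({
    (":Joe", ":worksFor", ":AcmeCorp"),
    (":worksFor", "rdfs:domain", ":Employee"),
})

# The IRI denotes some resource (here just a placeholder string),
# and that resource, not the IRI, is related to the graph.
interp = DSInterpretation(
    denotes={":year2008": "resource-2008"},
    isrel={("resource-2008", G2008)},
)

asserted_pair = interp.state_pair_true(":year2008", G2008)
other_pair = interp.state_pair_true(":year2008", {(":Joe", "rdf:type", ":Employee")})
```

In this sketch `asserted_pair` is True and `other_pair` is False. The denoted resource stays distinct from the graph, so further conditions on ISREL can be layered on as semantic extensions.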

Best,
Richard




> 
> 
> Best,
> AZ
> 
> Le 24/05/2012 01:09, Richard Cyganiak a écrit :
>> Pat,
>> 
>> On 23 May 2012, at 19:41, Pat Hayes wrote:
>>> Richard, I am confused.
>> 
>> No, in this case, I am. I used the phrase “identifying subgraphs”
>> sloppily in my mail to Yves because I echoed the wording of the
>> original issue. I should have said “managing subgraphs” or something
>> like that.
>> 
>> But let's talk, this is interesting.
>> 
>>> Sometimes I get the sense that you want the graph names to refer
>>> not to graphs as such, but rather to 'stateful resources' (or
>>> whatever) which have a robust identity and emit graphs when poked,
>>> a REST-inspired kind of a thing. (Cf. your responses on other
>>> threads.)
>> 
>> Yes, this. Well, I'd weaken that a bit: I'd like graph names to
>> denote things with robust identity that somehow have a graph
>> associated. “Emit graph when poked” is one particular and probably
>> the most useful kind of association, but I'd like to be less specific
>> about the nature of the association. The less specific we are, the
>> closer we remain to SPARQL semantics, and the less we get in the way
>> of current practice. Tighter definitions can be done as semantic
>> extensions where required.
>> 
>>> At other times, however (as here) you seem to want the graph names
>>> to refer to an actual set of triples, a true Platonic RDF graph.
>> 
>> No, not in general.
>> 
>> Although “is same as” is just another kind of association, and that
>> can be sometimes useful too. If it works as a tortured edge case,
>> then that's a plus.
>> 
>>> It really does matter which we choose, and I don't see how we can
>>> choose both (or not without a lot of new machinery to make the
>>> distinction, that we have not even discussed yet) and I don't think
>>> it is viable to just be muddled or ambiguous about it, as that is
>>> the muddle we are in already and are trying to get straight.
>> 
>> I believe we can do both. I'll try to show how. I'm an amateur at
>> this stuff, so forgive me if it's a horrible mess, but it might be
>> enough to give you an idea where I'm trying to go with this “stateful
>> resource” and “state relationship” business.
>> 
>> A DS-interpretation is a simple interpretation plus a “state
>> relationship”, let's call it ISREL, that contains pairs of resources
>> and graphs.
>> 
>> We could say that if <x,G> is in ISREL, then x is an
>> rdfs:StatefulResource.
>> 
>> And a state pair <i,G> is true iff <I(i),G> is in ISREL.
>> 
>> Borrowing an idea from Antoine's proposed semantics, I think that
>> every rdfs:StatefulResource should have an associated set of
>> interpretations, let's call that the “state extension” of the
>> stateful resource, that contains exactly the interpretations that
>> satisfy the graph associated by the state relationship.
>> 
>> Something along these lines would probably be the maximum amount of
>> normative semantics I'd go along with for datasets.
>> 
>> But if this works as I hope, then this would give us a base from
>> which one could go further. If we want to rigidly denote graphs, for
>> example, then we can define a semantic extension that imposes an
>> additional semantic condition: For every <x,G> in ISREL, x = G.
>> Done!
>> 
>> Or, if we want to capture “web semantics”, so that a state pair is
>> true iff prodding x yields the graph G, then that's a different
>> semantic extension with a different additional condition: <I(i),G> is
>> in ISREL if and only if dereferencing i and parsing as RDF yields
>> graph G.
>> 
>> This keeps the semantics of different graphs in the dataset entirely
>> separate. As you know, I think this is a feature. However, I suppose
>> that again, additional semantic conditions could change this. I can
>> definitely see how it could be useful in the case of “web semantics”
>> to require that the names of stateful resources actually denote the
>> resource in all the interpretations in the state extension. I suppose
>> this could be imposed by requiring that all these interpretations in
>> the state extensions share at least the state relationship with the
>> “main” interpretation.
>> 
>>> For example, if the graph names refer to stateful resources, then
>>> there are two rather different ways to identify a subgraph or a
>>> larger graph. One is to speak of a subset (defined somehow) of the
>>> graph that is the current state of the stateful resource, the other
>>> is to have a relation between two resources such that one returns a
>>> subset of what the other returns, at any time. These behave
>>> differently and would need to be implemented differently.
>> 
>> The second approach sounds better to me because the relationship
>> between names and graphs is the same for both the larger graph and
>> the subgraph. As I said, I think that being noncommittal about the
>> actual nature of the relationship between resource and graph (is it
>> identity, dereference, or something else?) is a feature.
>> 
>>> I have no axe to grind here. I would be quite happy if we were to
>>> declare that graph names in datasets always refer to stateful
>>> resources.
>> 
>> Then let's go with that.
>> 
>>> I would also be happy if we decide they always refer to graphs.
>> 
>> I think *always* doing that is unacceptable. That's because of the
>> case where I want to fetch RDF from the web and stick it into a
>> dataset using the source URL as a graph name. The source URL denotes
>> something out there on the web (an RDF document probably); it
>> certainly doesn't denote a graph. So I'm contradicting the web.
>> 
>>> But I am not happy about it being ambiguous or undecided. I do feel
>>> that it is very important that we choose one story and stick to it.
>>> Which one do you want to pitch for?
>> 
>> I feel that the semantic model *needs* an indirection between the
>> denoted resource and the graph.
>> 
>> (What we call the class of denoted resources, and what we call the
>> relationship to the graph, then becomes a somewhat secondary
>> question. I'm currently trying to see whether “stateful resource” and
>> “state” will stick, but that's not actually so terribly important.)
>> 
>> Best, Richard
>> 
>> 
>> 
>>> 
>>> Pat
>>> 
>>> 
>>> 
>>> On May 23, 2012, at 1:12 PM, Richard Cyganiak wrote:
>>> 
>>>> Hi Yves,
>>>> 
>>>> I took an action to propose some informative wording regarding
>>>> the possibility of identifying subgraphs of a larger graph. See
>>>> below for a first attempt. I suppose this would go somewhere near
>>>> the definition of “RDF dataset” or whatever we end up calling
>>>> these things. The terminology (named graphs etc.) still may have
>>>> to change of course. Is this wording ok for you?
>>>> 
>>>> Best, Richard
>>>> 
>>>> 
>>>> [[ Note: Graphs in an RDF dataset may overlap. The same
>>>> underlying set of triples may be divided up into named graphs
>>>> along multiple dimensions (such as data owner or subject area) by
>>>> repeating each triple in multiple graphs. Whether such a setup
>>>> would be realized by storing each triple multiple times, or
>>>> through views of some sort, is up to the implementation. ]]
>>>> 
>>> 
>>> ------------------------------------------------------------
>>> IHMC                                (850)434 8903 or (650)494 3973
>>> 40 South Alcaniz St.                (850)202 4416   office
>>> Pensacola                           (850)202 4440   fax
>>> FL 32502                            (850)291 0667   mobile
>>> phayesAT-SIGNihmc.us
>>> http://www.ihmc.us/users/phayes
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
> 
> 
> -- 
> Antoine Zimmermann
> ISCOD / LSTI - Institut Henri Fayol
> École Nationale Supérieure des Mines de Saint-Étienne
> 158 cours Fauriel
> 42023 Saint-Étienne Cedex 2
> France
> Tél:+33(0)4 77 42 83 36
> Fax:+33(0)4 77 42 66 66
> http://zimmer.aprilfoolsreview.com/
>
Received on Thursday, 24 May 2012 00:46:08 UTC