Re: Minimal dataset semantics from Pat Hayes on 2012-08-27 (public-rdf-wg@w3.org from August 2012)

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 27 Aug 2012 12:29:48 -0500
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Sandro Hawke <sandro@w3.org>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <F19FA84C-EC21-41F2-A950-37FDFC06F63C@ihmc.us>
On Aug 27, 2012, at 5:56 AM, Richard Cyganiak wrote:

> Hi Pat,
> 
> On 25 Aug 2012, at 15:49, Pat Hayes wrote:
>>> A predicate IRI denotes a resource (known as a "property") that has a property extension that is a binary relation. A graph IRI denotes a resource (known as a ... g-box? RDF space? graph source? RDF data source?) that has a graph extension that entails the graph.
>> 
>> Ah, OK. I've got that clear now. But just to make sure I am on the same page: is this relationship between the resource and the graph exactly like (an instance of?) that between other kinds of resource and the representation that HTTP GET returns when you poke the resource? Or is it something else, unique to RDF?
> 
> Formally, I would simply say that a DS-interpretation has a mapping IGEXT from resources in IR to RDF graphs. I would not formally constrain it any further than that.
> 
> But for interoperability, one SHOULD assume that this mapping has the HTTP-GET-and-parse-as-RDF function as a subset.
> 
> So if one wanted to better formalize the architecture of the Semantic Web, then one could state this semantic condition: If dereferencing an IRI i yields a representation in an RDF format, and that representation can be parsed to an RDF graph G, and i denotes X (that is, I(i)=X), then <X,G> is in IGEXT. This is how IGEXT can be grounded in the web. I'm not suggesting that we go this far in this WG though.

I wish we would do something like this. I think it would be doing the world a huge favor. 
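
To make the grounding condition above concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the "web" is a dictionary, the "parser" reads one whitespace-separated triple per line, and every name in it (TOY_WEB, toy_dereference, toy_parse, grounded_igext, DENOTES) is made up for the example rather than taken from any spec or library.

```python
from typing import Callable, Dict, FrozenSet, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]

# Hypothetical stand-in for the web: IRI -> body returned by a GET.
TOY_WEB: Dict[str, str] = {
    "http://example.org/g1": "ex:a ex:p ex:b\nex:b ex:p ex:c",
    "http://example.org/photo": "not-rdf",
}


def toy_dereference(iri: str) -> Optional[str]:
    """Return the representation for an IRI, or None if nothing comes back."""
    return TOY_WEB.get(iri)


def toy_parse(body: str) -> Optional[Graph]:
    """Parse one whitespace-separated triple per line; None if not parseable."""
    triples = set()
    for line in body.splitlines():
        parts = line.split()
        if len(parts) != 3:
            return None
        triples.add((parts[0], parts[1], parts[2]))
    return frozenset(triples)


def grounded_igext(
    denotes: Dict[str, str],
    dereference: Callable[[str], Optional[str]],
    parse: Callable[[str], Optional[Graph]],
) -> Set[Tuple[str, Graph]]:
    """Collect the <X, G> pairs that the condition forces into IGEXT.

    If dereferencing IRI i yields a body that parses to a graph G,
    and I(i) = X, then <X, G> must be in IGEXT.
    """
    pairs = set()
    for iri, resource in denotes.items():
        body = dereference(iri)
        graph = parse(body) if body is not None else None
        if graph is not None:
            pairs.add((resource, graph))
    return pairs


# I: IRI -> denoted resource (assumed values, for illustration only).
DENOTES = {
    "http://example.org/g1": "resource-g1",
    "http://example.org/photo": "resource-photo",
    "http://example.org/missing": "resource-missing",
}

IGEXT = grounded_igext(DENOTES, toy_dereference, toy_parse)
```

Only the first IRI contributes a pair here: the second dereferences to something that does not parse as RDF, and the third does not dereference at all. The condition says nothing about such resources, so IGEXT is left unconstrained for them.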

> 
> Regarding the question of open-graph vs. closed-graph semantics discussed further down in this message, I'll have to mull this over a bit, and won't have time for this today. For now, I'd just like to understand how closed-graph semantics would work formally. If we take Antoine's formalization here as a starting point (ignoring the orthogonal question of IRI-GEXT vs RES-GEXT for now):
> http://lists.w3.org/Archives/Public/public-rdf-wg/2012Aug/0238.html
> 
> This one is open-graph because of this line:
> 
>  - for an IRI n and a graph G, I(n,G) is true iff there exists a graph G' such that (n,G') is in GEXT and G' E-entails G;
> 
> To make it closed-graph, would this simply be changed to this?
> 
>  - for an IRI n and a graph G, I(n,G) is true iff (n,G) is in GEXT;

Yes. Or we might want to allow a little slack, using the previous formulation but with "G' graph-equivalent to G" at the end. 
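
The two truth conditions can be put side by side in a small Python sketch. It rests on simplifying assumptions: graphs are ground (no blank nodes), and E-entailment is replaced by simple entailment, which for ground graphs is just the subgraph relation. The function names are invented for the illustration.

```python
from typing import Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def entails(premise: Graph, conclusion: Graph) -> bool:
    """Simple entailment between ground graphs: conclusion is a subgraph."""
    return conclusion <= premise


def open_graph_true(gext: Dict[str, Graph], n: str, g: Graph) -> bool:
    """I(n, G) is true iff (n, G') is in GEXT for some G' that entails G."""
    return n in gext and entails(gext[n], g)


def closed_graph_true(gext: Dict[str, Graph], n: str, g: Graph) -> bool:
    """I(n, G) is true iff (n, G) itself is in GEXT."""
    return n in gext and gext[n] == g


T1 = ("ex:a", "ex:p", "ex:b")
T2 = ("ex:b", "ex:p", "ex:c")
GEXT = {"ex:g1": frozenset({T1, T2})}

SUBGRAPH = frozenset({T1})
open_result = open_graph_true(GEXT, "ex:g1", SUBGRAPH)      # subgraph is accepted
closed_result = closed_graph_true(GEXT, "ex:g1", SUBGRAPH)  # subgraph is rejected
```

In this ground-only model, if "graph-equivalent" is read as equality up to blank node renaming, the slack variant coincides with the strict closed-graph check. The difference would only show up with blank nodes, which the sketch leaves out.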

> 
> Furthermore, just to be sure, we do assume that GEXT is a function, that is, if (n,G1) and (n,G2) are in GEXT then G1=G2, right?

Yes. We *could* relax this but it would wreak havoc with entailments. 
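
As a small illustration, the functional condition on GEXT can be checked mechanically when GEXT is given as a set of (name, graph) pairs. This is a sketch only; is_functional and the sample data are hypothetical names for the example.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def is_functional(pairs: Iterable[Tuple[str, Graph]]) -> bool:
    """True iff (n, G1) and (n, G2) in pairs implies G1 == G2."""
    seen: Dict[str, Graph] = {}
    for name, graph in pairs:
        if name in seen and seen[name] != graph:
            return False
        seen[name] = graph
    return True


G1 = frozenset({("ex:a", "ex:p", "ex:b")})
G2 = frozenset({("ex:a", "ex:p", "ex:c")})

functional_gext = [("ex:n", G1), ("ex:m", G2)]
relational_gext = [("ex:n", G1), ("ex:n", G2)]  # same name, two graphs
```

Representing GEXT as a dictionary from names to graphs builds the condition in by construction, since a key can only map to one value.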

> 
> And finally, is it correct to say that it would be highly desirable to define the “minimal” semantics in a way that semantic extensions would generally be monotonic, that is, all entailments that hold in the “minimal” semantics should still hold under the semantic extension?

Absolutely yes. 
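
The monotonicity requirement can be spot-checked on sample cases with a sketch like the following. Everything in it is an assumption for illustration: the "minimal" regime is ground simple entailment, symmetric_entails is a made-up extension that treats a hypothetical property ex:near as symmetric, and exact_entails is a deliberately non-monotonic regime used only as a counterexample.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]
Entails = Callable[[Graph, Graph], bool]


def base_entails(premise: Graph, conclusion: Graph) -> bool:
    """Stand-in 'minimal' regime: ground simple entailment (subgraph)."""
    return conclusion <= premise


def symmetric_entails(premise: Graph, conclusion: Graph) -> bool:
    """Hypothetical extension: additionally treats ex:near as symmetric."""
    closed = set(premise)
    closed.update((o, p, s) for (s, p, o) in premise if p == "ex:near")
    return conclusion <= closed


def exact_entails(premise: Graph, conclusion: Graph) -> bool:
    """Counterexample regime: only an identical graph is 'entailed'."""
    return premise == conclusion


def preserves_entailments(
    base: Entails, extension: Entails, cases: Iterable[Tuple[Graph, Graph]]
) -> bool:
    """True if every base entailment among the cases also holds in the extension."""
    return all(extension(p, c) for p, c in cases if base(p, c))


A = ("ex:x", "ex:near", "ex:y")
B = ("ex:y", "ex:type", "ex:Place")
CASES = [
    (frozenset({A, B}), frozenset({A})),     # holds in the base regime
    (frozenset({A, B}), frozenset({A, B})),  # holds in the base regime
    (frozenset({A}), frozenset({B})),        # does not hold in the base regime
]

symmetric_ok = preserves_entailments(base_entails, symmetric_entails, CASES)
exact_ok = preserves_entailments(base_entails, exact_entails, CASES)
```

A check over a finite list of cases is only a spot check, not a proof of monotonicity. It can still expose a regime that drops a base entailment, as the exact-match regime does on the first case.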

Pat

> 
> Best,
> Richard
> 
> 
> 
>> FWIW, my (vague, intuitive) understanding of the REST http story would say that the IRI denotes a resource which when http poked with GET returns a "representation" comprising some kind of byte stream which parses to the RDF graph. So there are three things involved here: the resource itself (denoted by the IRI), the representation of its immediate state which is returned as the payload of a GET response, and the (abstract) RDF graph that this byte stream parses to, roughly analogous to the DOM tree of an XML document. But along the lines of being relaxed about terminology, I am cool with calling any of these a "graph", provided we can qualify them as "graph resource" (?), "graph representation" (?) and "abstract RDF graph" (?) respectively. 
>> 
>>>>> Regarding the formal mechanism of associating IRIs with their respective graphs, I've started to like the idea that Alan mentioned the other day: Dataset interpretations contain a mapping from resources to graphs, called the graph extension. This mapping associates graphs with (some) resources. A name-graph-pair (a.k.a. abstract named graph) <i,G> satisfies a dataset interpretation I if the graph extension in I of I(i) entails G.
>>>> 
>>>> I am cool with that, although it does have some consequences that y'all might find odd when you try mixing it with owl:sameAs. Do you want this:
>>>> 
>>>> :a owl:sameAs :b
>>>> { :a { :this :is :graph}}
>>>> 
>>>> to entail 
>>>> 
>>>> {:b {:this :is :graph}}
>>> 
>>> Well, at the start of this WG I would have said that this would be an atrociously bad idea, but now it seems logical to me.
>> 
>> :-)  See, it's actually a kind of disease, and now you are infected. 
>> 
>>> If :a and :b denote the same thing, then of course this will hold.
>>> 
>>>>> If I'm not messing up, then this mechanism works the same as the property extension and class extension mechanisms that already exist in RDF Semantics.
>>>> 
>>>> It's in the same spirit, one might say, yes. The difference is that property and class extensions are kind of abstractions, so it is (maybe with a slight forcing, but one gets used to it) OK to say in those cases that the resource *is* the class or property; but in this case, I think that to identify the resource with the graph would be too much of a stretch (and would negate the intended use cases, in any case). So it's not *exactly* similar, in practice. 
>>> 
>>> Ok, yeah.
>>> 
>>>>> So, the graph IRI then *denotes* a resource (one that is in the domain of the class extension function). And in the abstract syntax, the graph IRI is "associated" or "paired" with a certain RDF graph.
>>>> 
>>>> The IRI or the resource is paired with it? I think it's the denoted resource, yes?
>>> 
>>> The abstract syntax of RDF datasets contains <iri,graph> pairs. SPARQL calls them "named graphs". So, trivially, it is correct to say that in the abstract syntax, *IRIs* are *paired* with *graphs*, because they appear in the same <iri,graph> tuple. That is all.
>> 
>> Ah, OK. Yes, you did say in the abstract syntax, sorry. (I was thinking of the distinction that Antoine described in another email.)
>> 
>>> (That does not make the IRI denote the graph.)
>>> 
>>> (s/class/graph/ in what I wrote above.)
>>> 
>>>>> I don't know if one can say that the IRI "refers to" the entailment closure of the graph.
>>>>> 
>>>>> Does this make any sense?
>>>> 
>>>> Yes, it makes good sense. I would avoid saying that the IRI denotes the graph, in this case.
>>> 
>>> Right. If I said that, I misspoke. The IRI denotes a ... g-box? RDF space? graph source? RDF data source? Information resource? Some kind of resource anyways.
>> 
>> I'm still slightly confused here. What is the relationship between this resource and the graph? Is there a semantic relationship, or an http/GET relationship, or is it just that they both reference the same IRI?
>> 
>>> 
>>>> It's rather that there is a relation between them, and we need a name for it; let's say that the IRI "indicates" the graph. That is, I indicates G when I denotes X associated with G.
>>> 
>>> Right. This relation is the one that holds between the graph IRI and the graph in a <iri,graph> pair in a SPARQL named graph. I guess the term you're looking for here, e.g., "indicates", would only show up in the formalization of the semantics. It would take some careful explaining that the relationship is not "naming" despite the <iri,graph> pair being called a "named graph" and the IRI often being called a "graph name" or "graph IRI".
>> 
>> We could call it "naming" but I think this would cause even more confusion unless we were very careful to explain all that stuff, and even then.
>> 
>> 
>>>> Why do you want to have entailment closures rather than graphs? Bear in mind that a simple bug in a graph can make the RDFS or OWL entailment closure be infinite. I think precision outweighs convenience here. 
>>> 
>>> Basically, because I'm interested in explaining various useful RDF operations as semantic extensions.
>> 
>> Well, OK, but one can always just talk about the entailment closure explicitly. Instead of saying 
>> 
>> X denotes the closure of the graph
>> T is in X
>> 
>> you can say
>> 
>> X denotes the graph
>> T is in the entailment closure of the graph
>> 
>> and then at least the graph is what you say it is :-)
>> 
>>> 
>>> Consider an entailment regime where we add a semantic condition that the "graph extension" mapping must be the dereference+parse function. Boom, we have formally described a dataset that contains the entire web of data. So we can explain follow-your-nose as a semantic extension, and we have a theoretical foundation for all sorts of work around provenance, access control, and the like.
>> 
>> That's an interesting idea, but it would work just as well with the other way of talking.
>>> 
>>> Or consider an entailment regime where I can say things like ":g1 ex:isUnionOf (:g2 :g3 :g4)" in the default graph. Can we define a semantic extension that makes it so that the graph indicated by :g1 is now equivalent to the union of those graphs?
>> 
>> Yes, but we can do this either way (the :g's can be closures or not)
>>> 
>>> Or formalize owl:import as a semantic extension? If :g1 owl:imports :g2, then an interpretation of the graph indicated by :g1 must also satisfy :g2 (or must satisfy the union or merge of :g1 and :g2).
>> 
>> That's already the right semantics for imports :-) 
>> 
>>> 
>>> Maybe I'm wrong, but these kinds of semantic extensions don't seem to be possible with closed-graph semantics, because we already have said that the triples explicitly included in the named graphs are all there is.
>> 
>> No, we have just said that they are the only ones *in the named graph*. We haven't said anything at all about what triples are in the entailment closures of the named graphs. 
>> 
>>> 
>>> With open-graph semantics on the other hand, it seems possible to "make additional triples show up" in the named graphs via entailment, and it seems possible to combine multiple of these semantic extensions without them stepping on each other's toes too much.
>>> 
>>> (All of the above may seem far-fetched without additional explanations. I'll try to expand on at least one of these later.)
>>> 
>>> Summary: My (possibly dead wrong) impression is that the closed-graph semantics provides no useful entailments and cannot be usefully extended; it provides nothing of interest that isn't already in the abstract syntax.
>> 
>> What it provides is that you know exactly what graph you are getting when you use the name to identify/refer to/denote the graph. For some purposes, this is centrally important (e.g., provenance). But this does not stop the graph from having other entailments. 
>> 
>>> The open-graph semantics on the other hand provides at least *some* interesting entailments, and seems to provide many opportunities for extensions that are interesting and useful.
>>> 
>>> Maybe my underlying assumption -- that open-graph semantics is more friendly to semantic extensions -- is wrong; in that case I'd have to re-evaluate my position of preferring open-graph semantics.
>> 
>> I think it is wrong, yes. 
>> 
>> Pat
>> 
>>> 
>>> Best,
>>> Richard
>>> 
>> 
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 or (650)494 3973   
>> 40 South Alcaniz St.           (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Monday, 27 August 2012 17:30:22 UTC