Re: [GRAPH] graph deadlock?

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Fri, 23 Dec 2011 13:23:11 -0500
Message-ID: <4EF4C70E.20805@openlinksw.com>
To: public-rdf-wg@w3.org
On 12/23/11 1:00 PM, Pat Hayes wrote:
> On Dec 21, 2011, at 5:18 AM, Andy Seaborne wrote:
>> On 21/12/11 08:53, Ivan Herman wrote:
> And there are responses to both of them inline below.
>>> On Dec 20, 2011, at 19:45 , Pat Hayes wrote:
>>>> On Dec 20, 2011, at 2:29 AM, Ivan Herman wrote:
>>>>> Pat,
>>>>> On Dec 20, 2011, at 05:45 , Pat Hayes wrote:
>>>>> [skip]
>>>>>> Now, consider the case where a URI  UUU is used as a graph
>>>>>> label in a dataset, and also occurs in the RDF inside a graph
>>>>>> in that same dataset, where it is interpreted as denoting, say,
>>>>>> a human being or a mailbox. OK so far. Now, however, add the
>>>>>> dataset some more RDF (perhaps in the default graph used to
>>>>>> express some metadata, for example) in which that same URI is
>>>>>> intended to be used to refer to the graph that it labels. There
>>>>>> are *no* RDF interpretations in which a single URIref can
>>>>>> denote two different things. So this dataset as a whole has no
>>>>>> satisfying interpretations. So it is formally inconsistent.
>>>>>> Moreover, the inconsistency arises directly, and obviously,
>>>>>> from this usage in which a URI is used to "name" something
>>>>>> other than what everyone agrees it is in fact interpreted to
>>>>>> mean (as, vividly, in Ivan's example using an email address).
>>>>>> And this is, surely, *obviously* at odds with the basic
>>>>>> assumption of the entire Web, that URIs, when considered as
>>>>>> names, identify *one* thing.
>>>>> is 'labeling' and 'identifying' the same?
>>>> Well, maybe not. But I suspect that if we try to say this, nobody
>>>> will take the slightest notice. They certainly sound like they
>>>> ought to be very closely related, so closely that only philosophers
>>>> could distinguish them, and then only when there is an R in the
>>>> month.
>>> :-)
>>>> And by the way, SPARQL talks about these URIs *naming* the graph,
>>>> which sounds even more like identifying.
>>> So we indeed have a naming (sic!) issue. Indeed, SPARQL uses the term
>>> 'naming' for what I referred to as datasets. That is a mess that,
>>> unfortunately, we have to live with :-(
>> Specifically, the SPARQL Query spec says about the FROM NAMED syntax
>> """
>> The FROM NAMED syntax suggests that the IRI identifies the corresponding
>> graph, but the relationship between an IRI and a graph in an RDF dataset
>> is indirect. The IRI identifies a resource, and the resource is
>> represented by a graph (or, more precisely: by a document that
>> serializes a graph). For further details see [WEBARCH].
>> """
> Sure sounds like it is saying that the IRI names a graph container. But I now think that this is in fact irrelevant to SPARQL and is a misleading paragraph. AFAIKS, all that SPARQL requires is that the IRI is paired with the graph in the dataset. It doesn't need to even mention any semantic relationship such as 'naming' between the IRI and the graph or graph container, nor does it require that the 'naming' IRI in this pair identify anything related to the graph, no matter how indirect this might be. (It might indeed have been better for everyone if SPARQL had simply shied away from using semantic terminology altogether.)


Therein lies the bug that ultimately bites anyone who tries to 
comprehend the above with RDF in mind.

>> The RDF dataset definition is more general.
>> """
>> Definition: RDF Dataset
>> An RDF dataset is a set:
>>    { G, (<u1>, G1), (<u2>, G2), ... (<un>, Gn) }
>> where G and each Gi are graphs, and each <ui> is an IRI.
>> """
>> It adds:
>> """
>> Each <ui> is distinct. G is called the default graph. (<ui>, Gi) are called named graphs.
> I would add that nowhere does it say that there is any relationship between Gi and <ui>, other than that they co-occur in a pair with a somewhat evocative name. It does not specify that <ui> denote or name or refer to G in any way, or indeed have any connection to it other than it is the same pair in this dataset. Which is exactly how people are using it, of course, as Richard and Antoine have been emphasizing.
> So (and perhaps this is what you, Ivan, have been advocating all along) we should distinguish actual referential naming of a graph (container) by an IRI, from the IRI/graph(container) relationship described or specified in a dataset, which is evidently not that of reference or naming (as the word is usually used) or what is usually called 'identification' of a resource by an IRI.

It's just a label in SPARQL, at least by default. Once statements are 
made about the Named Graph where triples coalesce around the Named Graph 
IRI, we have a Name, Referent, and a clear description expressed in RDF. 
This also sets the stage for statements to be made using vocabularies / 
ontologies that are oriented towards provenance and even fine-grained 
reification of each statement in the Named Graph. This ultimately 
enables one to sign graphs.
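
To make the label-versus-name distinction a bit more concrete, here is a 
minimal Python sketch. It is only an illustration under my own 
assumptions: the Dataset class, the statements_about helper, and the 
example.org IRIs are hypothetical and are not part of SPARQL or of any 
store's API. The dataset is just a default graph plus (IRI, graph) 
pairs; the IRI only starts to behave like a name once triples in the 
default graph use it as a subject.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]


class Dataset:
    """Hypothetical model of { G, (<u1>, G1), ..., (<un>, Gn) }."""

    def __init__(self) -> None:
        self.default: Graph = set()        # G, the default graph
        self.named: Dict[str, Graph] = {}  # <ui> -> Gi, a pure pairing

    def add_named(self, iri: str, graph: Graph) -> None:
        # The definition requires each <ui> to be distinct.
        if iri in self.named:
            raise ValueError(f"graph label already in use: {iri}")
        self.named[iri] = set(graph)

    def statements_about(self, iri: str) -> Graph:
        # Triples in the default graph that use the label as a subject.
        return {t for t in self.default if t[0] == iri}


ds = Dataset()
G1 = "http://example.org/g1"  # assumed example IRI
ds.add_named(G1, {
    ("http://example.org/alice",
     "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/bob"),
})

# At this point G1 is only a label: nothing is said about it.
label_only = ds.statements_about(G1)

# Once metadata coalesces around the IRI, it has a description.
ds.default.add((G1,
                "http://purl.org/dc/terms/creator",
                "http://example.org/alice"))
described = ds.statements_about(G1)
```

Nothing in the pairing itself says what G1 denotes; only the added 
metadata triple does.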

Personally, I feel the lack of industry use is what makes some of this 
stuff a little more confusing than it needs to be. For instance, a good 
use case for graph signing would unveil the context and virtues of reification.

A use case I have in mind is signing owl:sameAs relations in a specific 
Named Graph. This can be achieved by relations that connect a WebID (a 
cryptographically verifiable agent URI) to a specific statement 
associated with a specific Named Graph IRI. Doing this prevents the 
transitive nature of owl:sameAs from compromising the integrity of 
WebID's verification protocol (which is based on link traversal). 
Basically, this prevents WebID from being susceptible to the same fate 
that's befallen the CA network for conventional PKI.

> However we are still left with the issue of what these IRIs are supposed to refer to when they are used in an RDF triple, as opposed to the 4th field of a quad store or in a SPARQL-defined RDF dataset 'named graph' pair.

We (and I suspect Anzo) currently treat the 4th column as just another 
label that tells our RDF imports what Named Graph IRI to use. That's it.
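
As a rough sketch of that idea (not our actual import code; the 
load_quads function and the example.org IRIs are hypothetical), the 
4th field can be treated as nothing more than a grouping key, with 
quads lacking a 4th field going to the default graph:

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, Optional[str]]


def load_quads(quads: Iterable[Quad]) -> Tuple[Set[Triple], Dict[str, Set[Triple]]]:
    """Group quads by their 4th field, used purely as a label."""
    default: Set[Triple] = set()
    named: Dict[str, Set[Triple]] = defaultdict(set)
    for s, p, o, g in quads:
        if g is None:
            default.add((s, p, o))   # no label: default graph
        else:
            named[g].add((s, p, o))  # label picks the Named Graph IRI
    return default, dict(named)


quads = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name",
     "Alice", "http://example.org/g1"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/name",
     "Bob", "http://example.org/g1"),
    ("http://example.org/g1", "http://purl.org/dc/terms/created",
     "2011-12-23", None),
]
default, named = load_quads(quads)
```

The loader never asks what the label denotes; it only decides which 
graph each triple lands in.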

> And here we have the central (and it seems to me the only important) issue, which is how to reconcile the obvious need to use the IRI to refer to the graph (in RDF metadata, and as several of us have been doing in these email threads) and the fact that they may also denote something else altogether, and the fact that they can't do both of these at the same time.

In SPARQL I think the spec should be clearer (you point this out above). 
Basically, Named Graph IRIs are just labels. Anything beyond that is a 
vendor-specific handling of said labels.
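
For what it's worth, the conflict Pat describes can be sketched in a 
few lines of Python. This is only a toy model under my own 
assumptions (the Interpretation class and the UUU IRI are 
hypothetical): an interpretation maps each IRI to exactly one 
referent, so asking the same IRI to denote both a mailbox and the 
graph it labels leaves no consistent assignment.

```python
from typing import Dict


class Interpretation:
    """Toy model: each IRI denotes at most one thing."""

    def __init__(self) -> None:
        self.denotes: Dict[str, str] = {}

    def assign(self, iri: str, referent: str) -> None:
        existing = self.denotes.get(iri)
        if existing is not None and existing != referent:
            # A single IRI cannot denote two different things.
            raise ValueError(f"{iri} already denotes {existing!r}")
        self.denotes[iri] = referent


interp = Interpretation()
UUU = "http://example.org/UUU"  # assumed example IRI

interp.assign(UUU, "a mailbox")  # use inside a graph
try:
    interp.assign(UUU, "the graph labelled by UUU")  # use in metadata
    consistent = True
except ValueError:
    consistent = False
```

Treating the 4th-field use as a mere label, outside the 
interpretation, is what keeps the second assignment from ever being 
made.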

> The only way I can see around this (apart from choosing to ignore it, which I am presuming is a course of last resort, or making some aspect of it non-conformant and swallowing the resulting discomfort) is to allow IRIs in RDF to be treated as punning (AKA overloading), under some circumstances, with the syntactic context of use determining the resolution of the punning ambiguity. The simplest way would be to restrict this to this special case of metadata in the default graph of an RDF dataset, but I think we could try to come up with a more general framework that might be of wider utility. I will take up that task in another thread, maybe after Xmas.

After Xmas for sure, but I think Ivan's suggestion wraps up this matter 
if the SPARQL spec gets a little correction applied to the paragraph you 
identified higher up in this post.

Merry Xmas & Happy Holidays to everyone!




Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

Received on Friday, 23 December 2011 18:23:34 GMT
