Re: [GRAPH] graph deadlock? from Ivan Herman on 2011-12-24 (public-rdf-wg@w3.org from December 2011)

From: Ivan Herman <ivan@w3.org>
Date: Sat, 24 Dec 2011 09:39:33 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: Andy Seaborne <andy.seaborne@epimorphics.com>, public-rdf-wg@w3.org
Message-Id: <66969DB1-0026-425E-83F5-030930D5645F@w3.org>
On Dec 23, 2011, at 19:00 , Pat Hayes wrote:

> 
> On Dec 21, 2011, at 5:18 AM, Andy Seaborne wrote:
> 
>> On 21/12/11 08:53, Ivan Herman wrote:
> 
> And there are responses to both of them inline below. 

[snip]

>> 
>> 
>> Specifically, the SPARQL Query spec says about the FROM NAMED syntax
>> """
>> The FROM NAMED syntax suggests that the IRI identifies the corresponding
>> graph, but the relationship between an IRI and a graph in an RDF dataset
>> is indirect. The IRI identifies a resource, and the resource is
>> represented by a graph (or, more precisely: by a document that
>> serializes a graph). For further details see [WEBARCH].
>> """
> 

I must admit I did not remember this quote...

> Sure sounds like it is saying that the IRI names a graph container. But I now think that this is in fact irrelevant to SPARQL and is a misleading paragraph.

Misleading may be too harsh, but incomplete and, probably, irrelevant to SPARQL indeed.


> AFAIKS, all that SPARQL requires is that the IRI is paired with the graph in the dataset. It doesn't need to even mention any semantic relationship such as 'naming' between the IRI and the graph or graph container, nor does it require that the 'naming' IRI in this pair identify anything related to the graph, no matter how indirect this might be. (It might indeed have been better for everyone if SPARQL had simply shied away from using semantic terminology altogether.)

Pat, I guess I agree with you...

> 
>> 
>> The RDF dataset definition is more general.
>> 
>> """
>> Definition: RDF Dataset
>> 
>> An RDF dataset is a set:
>> 
>>  { G, (<u1>, G1), (<u2>, G2), ... (<un>, Gn) }
>> 
>> where G and each Gi are graphs, and each <ui> is an IRI.
>> """
>> 
>> It adds:
>> """
>> Each <ui> is distinct. G is called the default graph. (<ui>, Gi) are called named graphs.
> 
> I would add that nowhere does it say that there is any relationship between Gi and <ui> , other than that they co-occur in a pair with a somewhat evocative name. It does not specify that <ui> denote or name or refer to G in any way, or indeed have any connection to it other than it is the same pair in this dataset. Which is exactly how people are using it, of course, as Richard and Antoine have been emphasizing. 
> 
> So – and perhaps this is what you, Ivan, have been advocating all along – we should distinguish actual referential naming of a graph (container) by an IRI, from the IRI/graph(container) relationship described or specified in a dataset, which is evidently not that of reference or naming (as the word is usually used) or what is usually called 'identification' of a resource by an IRI.

Yes. I see these two as distinct usages out there and we cannot get around the fact that the genie is out of the bottle...
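
To make the "pairing only" reading concrete, here is a minimal sketch in Python. It is not taken from the SPARQL spec; make_dataset, the ex: terms and the example.org IRI are all hypothetical names chosen for illustration. It models the quoted definition as a default graph plus (IRI, graph) pairs, and shows that two datasets can pair the same IRI with different graphs, so the pairing by itself says nothing about what the IRI denotes.

```python
from typing import Dict, FrozenSet, Tuple

# Assumption for illustration: a triple is three strings, a graph is a set of triples.
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def make_dataset(default: Graph, pairs: Dict[str, Graph]) -> dict:
    """Bundle a default graph with IRI/graph pairs; purely structural."""
    return {"default": frozenset(default), "named": dict(pairs)}


g1: Graph = frozenset({("ex:alice", "ex:knows", "ex:bob")})
g2: Graph = frozenset({("ex:bob", "ex:knows", "ex:carol")})

label = "http://example.org/g"  # hypothetical IRI

# Two datasets pair the same IRI with different graphs: the pairing is
# local to each dataset and involves no denotation of the graph by the IRI.
ds_a = make_dataset(frozenset(), {label: g1})
ds_b = make_dataset(frozenset(), {label: g2})

same_label_different_graph = ds_a["named"][label] != ds_b["named"][label]
```

Nothing in this structure records whether the label, when it also appears inside a triple, refers to the paired graph; that is exactly the gap discussed in this thread.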

>  
> 
> However we are still left with the issue of what these IRIs are supposed to refer to when they are used in an RDF triple, as opposed to the 4th field of a quad store or in a SPARQL-defined RDF dataset 'named graph' pair. And here we have the central (and it seems to me the only important) issue, which is how to reconcile the obvious need to use the IRI to refer to the graph (in RDF metadata, and as several of us have been doing in these email threads) and the fact that they may also denote something else altogether, and the fact that they can't do both of these at the same time.
> 
> The only way I can see around this (apart from choosing to ignore it – which I am presuming is a course of last resort – or making some aspect of it non-conformant and swallowing the resulting discomfort) is to allow IRIs in RDF to be treated as punning (AKA overloading), under some circumstances, with the syntactic context of use determining the resolution of the punning ambiguity. The simplest way would be to restrict this to this special case of metadata in the default graph of an RDF dataset, but I think we could try to come up with a more general framework that might be of wider utility. I will take up that task in another thread, maybe after Xmas.

I am really curious to see you elaborate on this (thanks for the Xmas present). But what I was saying is that if an application stays on what I called 'labelled graph', then there is no relationship whatsoever, and we may simply ignore the issue. 

I realize that we have to say _something_ on labeling after all, eg, the fact that a label must be unique within a dataset. Ie, it is a little bit more than my simplistic view. But I would still try to keep the extra requirements to the absolute, strict minimum like that. The good practice that we should promote is that for any more complex applications what I called 'identified graphs' should be used, not 'labelled graphs'. And we should make the entry point to 'labelled graphs' as easy as possible.
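
As a rough illustration of how small that minimum could be, the sketch below (Python; build_labelled_graphs is a hypothetical helper, not an existing API) checks only the one constraint mentioned above: within a single dataset, two different graphs must not share a label. It assumes nothing else about the labels.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]


def build_labelled_graphs(
    pairs: Iterable[Tuple[str, Iterable[Triple]]]
) -> Dict[str, FrozenSet[Triple]]:
    """Collect (label, graph) pairs, rejecting one label on two different graphs."""
    labelled: Dict[str, FrozenSet[Triple]] = {}
    for label, graph in pairs:
        frozen = frozenset(graph)
        # A repeated identical pair is harmless (the dataset is a set);
        # the same label on a different graph violates uniqueness.
        if label in labelled and labelled[label] != frozen:
            raise ValueError(f"label used for two different graphs: {label}")
        labelled[label] = frozen
    return labelled
```

The check is deliberately silent about dereferencing or denotation; anything beyond uniqueness would belong to 'identified graphs'.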
> 
>> """
>> which I'd prefer, in hindsight, to drop or at least move out of the definition.
>> 
>> It adds nothing that affects the rest of the definition of SPARQL.  I'd even argue that it was "editorial", not "substantial", to the definition of SPARQL so does not invalidate the last call.
>> 
>>> For the sake of the discussion we may have to use different terms,
>>> and let us forget about SPARQL for a while.
>> 
>> Oops :-)
>> 
>>> Although I do not have
>>> Sandro's talent of finding nice terms, let us say that we speak
>>> about labelled graphs, i.e., datasets, and identified graphs. Let us
>>> not use the term named graphs for a while...
>>> 
>>> - Labelled graphs are the minimal level, ie, just using URI-s
>>> labeling graphs. We may have to add a restriction that, within a
>>> dataset, labels are unique, ie, two different graphs must have two
>>> different labels. No further assumptions are used. And, to refer to
>>> an earlier quote of yours up there, the labels used that way would
>>> take no part in any form of RDF interpretation whatsoever.
>>> 
>>> - Identified graphs are labelled graphs where there _is_ a relation,
>>> through HTTP GET, between the label and the graph.
> 
> Neither of these establishes a semantic naming relation between the IRI and the graph. Labelled graphs are just graphs which are paired with an IRI in an RDF dataset. That has nothing, prima facie, to do with the IRI denoting or naming the graph.

Correct.


> It certainly does not make the IRI into a name for the graph (or even a label for the graph, in any global sense: the 'labeling' is solely restricted to this dataset.)

Correct again. I used the term 'naming' because, well, SPARQL uses this already...


> And the need to distinguish between 'identifying' in the REST/HTTP-Web sense, and naming in the semantic sense, has been notorious now for almost a decade: this is exactly what the http-range-14 debates have all been about. They are not the same idea: if we want them to coincide, we have to say so. 

My (personal) favourite is to go through the REST/HTTP way, and stop there. I am not sure what it would take to say 'they coincide', though. I let you explain that...

[snip]

>>>> 
>>>> Fine. But when they also occur in the (say) 3rd position, do they
>>>> or do they not then mean the same as they meant when they occur in
>>>> the fourth position? (Or maybe: does what they mean in the 3rd
>>>> position have any relationship at all to their role while
>>>> being-there in the fourth position?)
>>> 
>>> If the quad store implements a labelled graph then the answer is no.
>>> More exactly, an application should have no assumption that they do.
> 
> OK, but others seem to disagree. I doubt we can ever get widespread agreement on this. People *will* assume that there is a relationship; and moreover, this assumption is extremely natural. 
> 

We have identified graphs for that. And maybe (hopefully!), eventually, all usage of the notorious URI-s for graphs will be within the framework of identified graphs; we can then all be happy. But the reality of today is different, and we should document that.



>>> 
>>>> The answer seems to be, sometimes they do (of course) and sometimes
>>>> they don't (of course), but nothing records which case is which.
>>> 
>>> Right.
> 
>>> A quad store, or an application thereof, might declare that it
>>> uses labelled or identified graphs. Well.. probably should/must and
>>> not might.
> 
> How can it declare this? If we say it SHOULD, or even that it MAY, do this, we have to provide a standard way to do it, that everyone can recognize. 
> 

Good question, I am not sure. Although... there is no way to determine today whether a graph store, say, implements RDFS entailment, except for the tool's documentation...

>>> 
>>> 
>>>> And I object to that situation, as it produces faux-RDF which is
>>>> designed to be systematically ambiguous in meaning.
>>> 
>>> And I hear your objection. Today an application or a quad store has
>>> no means to say which way it goes. Hence the mess... That is the
>>> absolute minimal step that I would like to make to try to clarify
>>> things.
> 
> Seems we agree on this :-)

:-)

Have a merry Xmas, Pat, if you read this before the festivities begin...

Ivan


> 
> Pat
> 
> ------------------------------------------------------------
> IHMC                                     (850)434 8903 or (650)494 3973   
> 40 South Alcaniz St.           (850)202 4416   office
> Pensacola                            (850)202 4440   fax
> FL 32502                              (850)291 0667   mobile
> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Saturday, 24 December 2011 08:39:32 UTC