Re: Blank Node Identifiers and RDF Dataset Normalization from Pat Hayes on 2013-02-26 (public-linked-json@w3.org from February 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 25 Feb 2013 22:18:09 -0600
To: Gregg Kellogg <gregg@kellogg-assoc.com>
Cc: Steve Harris <steve.harris@garlik.com>, Markus Lanthaler <markus.lanthaler@gmx.net>, "'William Waites'" <ww@styx.org>, <msporny@digitalbazaar.com>, <public-rdf-wg@w3.org>, <public-linked-json@w3.org>
Message-Id: <B1BAA868-EB15-47C1-BE77-C744B196462C@ihmc.us>
On Feb 25, 2013, at 5:52 PM, Gregg Kellogg wrote:

>> On Feb 25, 2013, at 9:45 AM, Steve Harris wrote:
>> 
>>> On 2013-02-25, at 13:00, Markus Lanthaler <markus.lanthaler@gmx.net> wrote:
>>> 
>>>>> For example:
>>>>> 
>>>>> SELECT * WHERE {
>>>>> ?g dc:date ?d .
>>>>> GRAPH ?g { ?x a foaf:Person }
>>>>> }
>>>> 
>>>> Given that it has been decided that graph labels do *not* denote the graph,
>>> 
>>> I believe it would be more correct to say that graph labels do not HAVE to denote the graph, they're allowed to if you want them to.
>> 
>> True, but we have no way to convey such a "want to" in RDF syntax. So whatever it is that the writer wanted, the reader has no way to know that. If the Web were telepathic, we would not need information transmission standards at all, as you could mind-project your desired meaning of all your byte streams. In the real world, however, we usually have to rely on specifications to provide us a clue as to how to interpret the things we read. According to our current specifications, when you read some RDF in a dataset which uses an IRI which is also used as a graph label, you have no way to know whether or not the first use of the IRI is supposed to be related in meaning to the second use.
> 
> Would it be reasonable to create a class, such as rdfs:NamedGraph, which could be asserted or inferred on an IRI to indicate that it denotes the graph which it names? This could have some inference rules, such that if the IRI is used in a subject or object of a graph, and is used to name a graph in the same dataset, that the IRI then denotes the graph it names? Consider something such as the following:
> 
> {?iri :p :o }
> ?iri { :s1 :p1 :o1 }
> 
> => { ?iri a rdfs:NamedGraph }
> 
> Same for predicate and object relations, and probably if the IRI is used within a Named Graph as well as the default graph.
> 
> Providing such a type, if not the inference rules, would seem the minimum we should do to allow an author to indicate that a graph name actually denotes the graph it names.
> 

Hi Gregg

If only it were this easy. Let me take some time to try to explain the problem here. It all turns on what in logical/philosophical circles is called the use/mention distinction. 

There is what might be called a 'normal' way to use names, to refer to the objects they name. So for example when we say "Barack Obama is the President of the USA." we are *using* the name "Barack Obama" to refer to the man. This kind of use is so ubiquitous throughout language that it is almost invisible, but it is important to keep in mind that when we use a name, we are not talking about the name, but about whatever it is that the name refers to. So what to make of sentences like this: "Barack Obama rhymes with Osama" ? Obviously this isn't about the man, it's about his name. We do say things like this, but when we write them down, the convention is to use quotation (or some other lexical device such as indentation and off-setting) to mark that the normal case of names being used to refer is here being over-ridden, and that the name is not being used at all, but rather is being held up as an object itself, so that the sentence is about it. It is being *mentioned* rather than used: " 'Barack Obama' rhymes with 'Osama'. " The quote marks signal that the standard use of the name is cancelled and the name is being mentioned rather than used. 

RDF is designed to formalize this normal case of language use, where names - IRIs and literals - are being used to refer to something, rather than mentioned to be talked about. And there is an underlying global assumption, which is normative for RDF and on which the coherent use of RDF depends, that IRI names are global, so that whatever an IRI refers to, it is supposed to refer to that same thing every place and every time it is used. (There may well be some edge cases here, such as using the same IRI to refer to, say, a store's website and to the particular state that website is in at the time of IRI use; but even in cases like this, the *recommended* practice is to either distinguish these with subtly different IRIs or else to say that the IRI names a dynamic state-dependent 'resource'. Either way, the IRI-referent relationship is made to be global and fixed.)

OK, all that said, now look at your proposal, above. You want 

?iri a rdfs:NamedGraph .

to mean that the IRI denotes the graph which it labels. But that's a statement about the IRI, and according to the rules of how IRIs work in RDF, the subject of that triple is not the IRI, but whatever the IRI refers to (= denotes = is the name of), because RDF is using the IRIs, not mentioning them. (It could be interpreted as: whatever ?iri denotes, that has to be a named graph. But that only establishes that it is *some* named graph, not the particular named graph that the IRI is used to label.) Your suggested rules all *mention* the IRI, rather than using it to refer to something else. And to mention anything in RDF, you need to use a name for the thing. So we would need some systematic way to refer to the IRI itself - something like what quoting does in written English - to give us a name for the IRI. You can't just use an IRI in RDF to refer to itself (unless you know that it is *supposed* to refer to itself, because of some condition external to RDF; but then you can't also use it to refer to something else, such as a graph). (A good test for this, by the way, is to ask yourself if your proposed rule still works when the IRI is replaced by ?someOtherIRI and you add the premise ?someOtherIRI owl:sameAs ?iri. It should, if it is proper RDF.)
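
The substitution test in the last parenthesis can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not anything taken from the RDF specs: triples are plain tuples, a dataset is a default graph plus a dict from graph label to graph, and the names EX_G, EX_H and named_graph_rule are hypothetical. It also assumes the substitution touches only the triples, leaving the graph label as it was, since the label is not a term in any triple.

```python
# Toy model of the proposed rule; every name here is hypothetical.
EX_G = "http://example.org/g"   # IRI that labels a graph in the dataset
EX_H = "http://example.org/h"   # a second IRI, asserted owl:sameAs EX_G


def named_graph_rule(default_graph, named_graphs):
    """Return the IRIs the proposed rule would type as named graphs.

    The rule fires when an IRI is the subject of a default-graph triple
    and the very same IRI string is also a graph label. The second
    condition inspects the IRI itself (a mention), not its referent.
    """
    subjects = {s for (s, p, o) in default_graph}
    return {iri for iri in subjects if iri in named_graphs}


# One named graph, labelled by EX_G.
named_graphs = {EX_G: {(":s1", ":p1", ":o1")}}

# Original default graph: EX_G is used as a subject.
original = {(EX_G, ":p", ":o")}

# Substituted default graph: EX_H replaces EX_G, plus the sameAs premise.
substituted = {(EX_H, ":p", ":o"), (EX_H, "owl:sameAs", EX_G)}

fires_original = named_graph_rule(original, named_graphs)
fires_substituted = named_graph_rule(substituted, named_graphs)

print(fires_original)
print(fires_substituted)
```

In this sketch the rule fires for the original data and not for the substituted data, even though both IRIs are asserted to denote the same thing. A rule that only used the IRIs could not tell the two cases apart, which is the sense in which the proposed rule depends on the label string rather than on what it denotes.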

This is why it's not feasible to state this stuff in RDF itself: we need some way to establish a semantic relationship between the IRI and something, rather than between what the IRI refers to and something. And that is a real modification (or exception) to the basic RDF semantic rules, the ones in the basic graph interpretation table. So either we modify RDF to allow for this (which has been proposed) or we invent a way to name IRIs (also has been proposed) or we do it outside RDF by some kind of constraint stated in the specs (ditto.) But none of these has been able to fly.

Thanks for trying, anyway :-)

Pat
Received on Tuesday, 26 February 2013 04:18:42 UTC