Contexts of use, a semantic idea from Pat Hayes on 2011-10-13 (public-rdf-wg@w3.org from October 2011)

From: Pat Hayes <phayes@ihmc.us>
Date: Thu, 13 Oct 2011 00:24:09 -0500
To: public-rdf-wg WG <public-rdf-wg@w3.org>
Message-Id: <594C1983-2581-4848-A872-E193B3FC8E6C@ihmc.us>

All this discussion today has been bubbling around in my head, and I will be absent most of tomorrow, so I am putting some thoughts down to see if they survive.

If we are going to conform to the RDF semantics, then IRI names are supposed to be 'global' in scope - mean the same everywhere - and also 'fixed' in meaning - they refer to one thing (though we might not know exactly what that thing is, there is an underlying assumption that each IRI names *something*.) And, these 'naming conditions' on IRIs are not a mere artifact or some tiresome mathematical side-effect of a formal model theory, but are based on fundamental intuitions about the role of IRIs on the Web. The model-theoretic semantics was set up that way because it was supposed to capture these intuitions.

Yet the SPARQL world has been using URIs differently, with multiple meanings, and not only does nothing (apparently) break, this is so handy that everyone is unwilling to give it up.

Right now we have "resolved" this by using word-games to pretend it's not happening, so that the use of a URI to name a graph is carefully called 'tagging' or 'association' in explicit contrast to 'naming' or 'reference' or 'denotation'. And there is also 'identifying', which is the TAG usage for 'being connected to by an HTTP GET operation'. So now we have three distinct ways for an IRI to be, um, related to something, all distinct and yet all of which could plausibly be called 'naming', and indeed all would be called 'naming' if we aren't all picking our words as carefully as Indiana Jones exploring a jungle cave.

There is something badly wrong with this picture, especially as the whole point of the 'denotes' usage in semantics is to be as neutral as possible about exactly what sense of naming one has in mind, not to introduce a special, artificial or "mathematical" sense of naming. So all of these are kinds of denoting, in fact, and the model theory should apply to them all, rather than being walled off from two of them.

All of this difficulty seems to stem from the fact that SPARQL (and/or quad stores and/or RDF datasets) want to use IRIs in two ways at once. Which can be viewed as a form of punning, in fact. Which works fine in OWL2, so why not here?

What makes punning possible is that at every token where an IRI occurs in the OWL syntax, it is straightforward and unambiguous which of the several possible punning interpretations is intended. So although the IRI is 'ambiguous' in meaning, every *occurrence* of every IRI token is unambiguous. If we can retain this token-by-token clarity of meaning, we can allow IRIs to play multiple roles.

So, try this for size. We introduce a notion of a 'context of use' (CoU) into the RDF concepts/semantics. Every IRI has a unique referent *in a given context of use*. It might have several referents at once, however, one in each CoU where it occurs. A CoU can be defined very broadly and can be user-defined, but it must satisfy some conditions.
1. It MUST be agreed within a community of use in such a way that every participant can determine the conditions defining the CoU.
2. Every CoU MUST specify precise conditions which locally, syntactically determine for every occurrence of every IRI token whether that occurrence is governed by the CoU.
3. No IRI occurrence can be in two CoUs simultaneously.
4. To resolve cases that would violate 3., one CoU can override another, so that any IRI token which satisfies the conditions for both CoUs is assigned to the first and not to the second. This may require agreement between the communities which use each CoU.
5. There is a default CoU, which is the entire Web. All other CoUs override the Web CoU. Any IRI token which is not in a more restricted CoU is in the Web CoU.
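
To make conditions 2 through 5 concrete, here is a minimal sketch in Python of how an IRI occurrence might be assigned to exactly one CoU. Everything named here (Occurrence, ContextOfUse, assign_cou, the "container" and "position" labels, and the example.org IRI) is a hypothetical illustration, not part of any RDF or SPARQL specification. It assumes a CoU can be modelled as a predicate over purely local, syntactic facts about an occurrence, and that overriding is just list order.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Occurrence:
    """One token of an IRI plus the local, syntactic facts about where it sits."""
    iri: str
    container: str  # hypothetical label, e.g. "quad" or "web-document"
    position: str   # hypothetical label, e.g. "subject", "object", "graph"


@dataclass(frozen=True)
class ContextOfUse:
    name: str
    # Condition 2: a syntactic test deciding whether an occurrence is governed.
    governs: Callable[[Occurrence], bool]


# Condition 5: the default CoU is the whole Web and governs everything.
WEB = ContextOfUse("web", lambda occ: True)

# Two illustrative CoUs for a quad store (assumed definitions).
GRAPH_LABEL = ContextOfUse(
    "graph-label",
    lambda occ: occ.container == "quad" and occ.position == "graph",
)
QUAD_TRIPLE = ContextOfUse("quad-triple", lambda occ: occ.container == "quad")

# Condition 4: earlier entries override later ones.
OVERRIDES = [GRAPH_LABEL, QUAD_TRIPLE]


def assign_cou(occ: Occurrence, overrides: Sequence[ContextOfUse]) -> ContextOfUse:
    """Return the single CoU governing this occurrence (condition 3)."""
    for cou in overrides:
        if cou.governs(occ):
            return cou
    # Nothing more restricted applies, so the occurrence is in the Web CoU.
    return WEB


if __name__ == "__main__":
    iri = "http://example.org/alice"
    for occ in (
        Occurrence(iri, "quad", "graph"),
        Occurrence(iri, "quad", "subject"),
        Occurrence(iri, "web-document", "object"),
    ):
        print(occ.position, "->", assign_cou(occ, OVERRIDES).name)
```

Returning the first match is what keeps an occurrence out of two CoUs at once: the fourth field of a quad satisfies both quad-store tests, but only the overriding one claims it.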

A semantic interpretation is defined just as before, but we allow each CoU to fix its own interpretations. Then a single IRI can refer to a person in the triples of a quad store but refer to the graph when used in the fourth quad field. Similarly for the use of an IRI 'associated' with an RDF graph. Both of these are quite crisply defined CoUs. And we can allow RDF which uses the IRI to refer to the graph, if we add in the object positions of triples whose properties use our graph-describing vocabulary.
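
As a rough illustration of the quad-store case, the following Python sketch gives each CoU its own interpretation mapping, so one IRI gets a different referent depending on which quad field the token sits in. The names (cou_of, INTERPRETATION, referent), the example.org IRI, and the string "referents" are all assumptions made for the sake of the example, and a real interpretation would map to things in a domain rather than to descriptive strings.

```python
from typing import Dict, Optional, Tuple

# A quad as (subject, predicate, object, graph label); plain strings for brevity.
Quad = Tuple[str, str, str, str]

ALICE = "http://example.org/alice"  # hypothetical IRI


def cou_of(position: int) -> str:
    """Pick the CoU from the field index alone: the fourth field labels the graph."""
    return "graph-label" if position == 3 else "triple"


# One interpretation per CoU; each maps an IRI to a single referent.
INTERPRETATION: Dict[str, Dict[str, str]] = {
    "triple": {ALICE: "the person Alice"},
    "graph-label": {ALICE: "the graph of statements about Alice"},
}


def referent(quad: Quad, position: int) -> Optional[str]:
    """Referent of the token at `position`, or None if that CoU does not fix one."""
    return INTERPRETATION[cou_of(position)].get(quad[position])


EXAMPLE_QUAD: Quad = (ALICE, "http://xmlns.com/foaf/0.1/name", '"Alice"', ALICE)

if __name__ == "__main__":
    print(referent(EXAMPLE_QUAD, 0))  # the subject token
    print(referent(EXAMPLE_QUAD, 3))  # the graph-label token
```

The IRI itself is 'ambiguous', but each token of it is not, because the field index decides the CoU before any lookup happens.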

This is my first stab at formalizing what seems to have actually happened here. A community of users (all you SPARQL hackers) has put IRIs to a "new" use within a restricted, but not entirely private, set of mechanisms and protocols. And this works just fine exactly because they - um, you - all agree to use these IRIs in these special ways in these particular places which are 'protected' from general Web use. The local change of meaning does not leak out and infect a wider Web use of the IRIs, but is restricted to the particular cases used within this community.

If we go with this idea (which has wider utility, I think) then we don't have to keep getting so anal about 'naming' versus 'association', a distinction which I think is going to be widely seen as very confusing and puzzling.

I realize this is a new idea and only a brief account, and it will need some tightening up, but I think it is worth spending some effort on as it will fix a lot of problems. And it just *seems* right.

Pat

------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32502 (850)291 0667 mobile
phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes

Received on Thursday, 13 October 2011 05:24:41 UTC