RDF as a context logic

The trouble with thinking hard is, you are sometimes obliged to change your mind. After sending that howl from the benches earlier today regarding treating RDF as a context logic and how terrible that idea was [1], I started to think about it, Antoine's elegant semantic trick, Andy's idea of islands (islands of coherence?) and the whole notion of importation as a way of linking ontologies, and I think I can see a kind of grand plan of how to put all this together. This might be overly ambitious, but I will try to sketch it out here as fully as I can to see how acceptable it might be. It turns out to be very like an old idea of Graham Klyne's from twelve (!!) years ago [2].

The idea is to say, right up front, that RDF is a context logic. That is, *every* RDF graph (triple?) is understood to be asserted 'in' some context or other (and so any RDF that is lying around loose, as it were, is not actually asserted at all, but is just kind of on display or held up for examination.) So these 'contexts' are the way that RDF stuff actually gets asserted. This logic has Antoine's semantics [7], so the truth of a triple is determined by the context it is in. (In case this sounds too revolutionary, read on: it is all back-compatible.) 

What is a context?

A context is an abstract 'thing' that represents an agreement on the meaning of some set of IRIs, so to assert something in a context is to publicly assent to (and agree to be bound by) this agreement about meanings of the IRIs belonging to the context. (For exactly how this agreement is displayed or recorded, see below.) A context is identified by an IRI, of course. A context is not a graph, in general (although a graph can be used as a context.)

Contexts are individuals which can be put into context classes, have properties, etc., and there is a special relation between them called 'acceptance' (name?) so that A rdf:accepts B means that context A includes all the vocabulary of B and all the assumptions about it that B makes, and maybe some more. This is very like (at least it is semantically) having a graph A include an owl:imports B statement, but it isn't intended to have the meaning of actually physically copying, just that to agree with A means you must also agree with B. You can treat this as a kind of inheritance (a class inheritance if you think of a context as being the set of names and assumptions made about the vocabulary, but no need to go there) and if you do, then it is multi-inheritance in general: a given context can accept a number of other contexts. So the global picture is of a bunch of contexts related by rdf:accepts forming a DAG rather than a tree. This rdf:accepts is what Cyc, with its inimitable grasp of human readability, calls '#$genlMt' [3], and as I say, it is very analogous to owl:imports, if we think of an OWL ontology as being a context.
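To make the DAG picture concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the ex: context names are hypothetical, and the acceptance relation is modelled as a plain dictionary. It computes everything a context is committed to by following rdf:accepts links, on the reading that agreeing with A obliges you to agree with whatever A accepts, and so on upward.

```python
from typing import Dict, Set

# Hypothetical acceptance table: context IRI -> contexts it directly accepts.
ACCEPTS: Dict[str, Set[str]] = {
    "ex:General": set(),
    "ex:Biology": {"ex:General"},
    "ex:Law": {"ex:General"},
    "ex:Forensics": {"ex:Biology", "ex:Law"},  # multi-inheritance
}


def accepted_closure(context: str, accepts: Dict[str, Set[str]]) -> Set[str]:
    """Return every context reachable from `context` via rdf:accepts."""
    seen: Set[str] = set()
    stack = [context]
    while stack:
        current = stack.pop()
        for parent in accepts.get(current, set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_acyclic(accepts: Dict[str, Set[str]]) -> bool:
    """True when no context reaches itself, i.e. the relation forms a DAG."""
    return all(c not in accepted_closure(c, accepts) for c in accepts)
```

Under these assumptions, accepted_closure("ex:Forensics", ACCEPTS) collects ex:Biology, ex:Law and ex:General, which is the sense in which this is a DAG rather than a tree.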

The idea would be that one can take a 'more general' context and add extra assumptions to it to give a more specialized context. The very meanings of the URIs can change as this happens, so that 'organism' in a general biological context might have its wide meaning but the same word/URI in a more specialized context further down the hierarchy might have a more specialized meaning, e.g. in an ontology of insects, say. Cyc has found this to be very useful as it can be used to avoid name-bloat (IRI-bloat for us) where new names have to be invented for lots of subcases, or to have distinct names meaning things like person-qua-legal-entity vs. person-qua-biological-organism. Re-using the general name in a local context saves both on vocabulary size and on reasoning effort. Of course, it's not compulsory to do this if you don't like that style. I would guess that once people get used to it, builders of complicated ontologies in for example biology and medicine will be using this a *lot*.
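One way to picture how the same IRI narrows in meaning further down the hierarchy is to treat each context as contributing assumptions about an IRI, with a specialized context inheriting everything from the contexts it accepts. The sketch below assumes this accumulation model; the context names, the IRI ex:organism and the strings standing in for axioms are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical hierarchy: ex:Insects accepts ex:Biology.
ACCEPTS: Dict[str, List[str]] = {
    "ex:Biology": [],
    "ex:Insects": ["ex:Biology"],
}

# Assumptions each context adds about an IRI (plain strings stand in for axioms).
ASSUMPTIONS: Dict[Tuple[str, str], Set[str]] = {
    ("ex:Biology", "ex:organism"): {"is a living thing"},
    ("ex:Insects", "ex:organism"): {"is an insect"},
}


def assumptions_about(iri: str, context: str) -> Set[str]:
    """Collect the assumptions made about `iri` in `context` and everything it accepts."""
    collected: Set[str] = set()
    seen: Set[str] = set()
    stack = [context]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        collected |= ASSUMPTIONS.get((current, iri), set())
        stack.extend(ACCEPTS.get(current, []))
    return collected
```

Here ex:organism carries only the wide assumption in ex:Biology, and both assumptions in ex:Insects: the name is re-used, and the local context does the narrowing.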

Backward compatibility

What about current 2004 contextless RDF? Well, we define a special context called the RDF context. This has the reserved RDF (and RDFS?) vocabulary and the RDF/S specs as its defining assumptions, and all contexts of graphs written in RDF must accept this, so this is the "top" of the RDF context inheritance DAG. Any context anyone creates using RDF *must* accept (via rdf:accepts) the RDF context, if the RDF is intended to be used according to the specs, at any rate. (This would kind of officially sanction using RDF in a way that did not conform to the RDF specs by deliberately accepting something other than the RDF context, but I'm not sure we want to go there.) There could be other contexts not underneath the RDF context, e.g. there could be a context which makes the http-range-14 convention official and which could be accepted in combination with the RDF context. Or not.

So now we can say retrospectively that all 'contextless' RDF is actually asserted in (at least) the RDF context, as a kind of default. We will also want to provide a way to assert that a given graph is actually being asserted in a more particular context, for example by including 

< > rdf:accepts ex:ISO24707 .

in the graph in question, where I'm assuming that there is some ISO context that determines the intended meanings of some IRIs. This re-uses the same property name: if the subject is a graph it means the graph presumes the truth of the context conventions, and when it is a context it means the same thing. rdf:accepts is transitive. (If the object were a graph, it would mean exactly what owl:imports means.) 

Notice that this is exactly like asserting the dataset

{ ex:ISO24707 { <the rest of the graph> }} 

which also asserts the graph in that context. So graphs with this new vocabulary, and datasets, are alternative syntaxes for this new contextual RDF. And we can allow simple graphs to have the < > rdf:accepts rdf:RDFContext . as implicitly included even if not stated explicitly (rather like :x rdf:type owl:Thing is in OWL.)
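The correspondence between the two syntaxes can be sketched as a small normalization step. This is only an illustration under stated assumptions: triples are modelled as tuples of strings, "<>" stands for the graph's own name, and the function names are hypothetical. A graph with no explicit rdf:accepts triple is treated as asserted in rdf:RDFContext by default.

```python
from typing import Dict, FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]

SELF = "<>"                     # stands for the graph's own name
ACCEPTS = "rdf:accepts"
RDF_CONTEXT = "rdf:RDFContext"  # the implicit default context


def split_contexts(graph: Set[Triple]) -> Tuple[Set[str], Set[Triple]]:
    """Separate the '<> rdf:accepts C' triples from the rest of the graph."""
    contexts = {o for (s, p, o) in graph if s == SELF and p == ACCEPTS}
    body = {t for t in graph if not (t[0] == SELF and t[1] == ACCEPTS)}
    return contexts, body


def as_dataset(graph: Set[Triple]) -> Dict[str, FrozenSet[Triple]]:
    """Rewrite a graph as a dataset mapping each context to the asserted triples."""
    contexts, body = split_contexts(graph)
    if not contexts:
        contexts = {RDF_CONTEXT}  # 'contextless' RDF defaults to the RDF context
    return {c: frozenset(body) for c in contexts}
```

So a graph containing < > rdf:accepts ex:ISO24707 . comes out as a dataset with ex:ISO24707 labelling the remaining triples, and a plain 2004-style graph comes out labelled with rdf:RDFContext.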

That < > masks an implicit convention (thanks, Sandro) by which the URI of a resource which emits a standard representation of a graph is actually the name of the graph. (Or, of a graph container whose current snapshot is the graph, and which may be rigid.) Personally I am happy with that, but if we aren't, then we certainly need *some* systematic way to refer to graphs. Note that we can't use the same URI both to refer to a graph being asserted in a context and also to the context that the graph is being asserted in (respectively cases 3 and 2 in my earlier email [4]) so we still have some work to do on how exactly to refer to graphs.

Islands 

An island is now a collection of graphs that all accept a given context (or contexts). Which means they can be used together with confidence based on a common set of assumptions regarding at least the meaning of the IRIs from that context, just as though the 2004 semantics was still working. Obviously there can be looser or tighter islands (archipelagos??) depending on how much of their common vocabulary they all agree on. 
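As a rough sketch of the island idea, suppose we already know, for each graph, the full set of contexts it accepts. The graph names, context names and table below are hypothetical. An island for a context is then just the set of graphs accepting it, and the common ground of a group of graphs is the intersection of their accepted contexts.

```python
from typing import Dict, Iterable, Set

# Hypothetical table: graph name -> all contexts that graph accepts.
GRAPH_CONTEXTS: Dict[str, Set[str]] = {
    "ex:g1": {"rdf:RDFContext", "ex:Biology"},
    "ex:g2": {"rdf:RDFContext", "ex:Biology", "ex:Insects"},
    "ex:g3": {"rdf:RDFContext"},
}


def island(context: str, graph_contexts: Dict[str, Set[str]]) -> Set[str]:
    """Graphs that all accept `context` and so can safely be used together under it."""
    return {g for g, ctxs in graph_contexts.items() if context in ctxs}


def common_contexts(graphs: Iterable[str],
                    graph_contexts: Dict[str, Set[str]]) -> Set[str]:
    """Contexts shared by every listed graph: the more there are, the tighter the island."""
    sets = [graph_contexts[g] for g in graphs]
    return set.intersection(*sets) if sets else set()
```

With this table, ex:g1 and ex:g2 form an island around ex:Biology, while all three graphs share only the RDF context, which is the loosest island of all.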

A SPARQL dataset can be regarded as a collection of graphs-in-contexts together with a special un-named graph, whose role is up for grabs. (More later on the default graph.) This fits perfectly with Antoine's semantics, by the way, which would under this proposal be incorporated into the RDF semantics (and would refer to the RDF context idea which would be explained in RDF Concepts.) So this would fall squarely under case 2 in [4], where graph labels are names of contexts, not names for the graphs (but they can still be used as graph labels in a pragmatic case-1-ish way, of course.)

The RDF notion of 'semantic extension' would be absorbed by the more general notion of the 'accepts' relationship between contexts, so that OWL/RDF accepts RDF and OWL-FULL/RDF accepts RDFS, and so on. Accepting a context with this kind of rigorous specification means one agrees to the validity of an entailment regime, but it does not oblige you to actually use all of that regime, only to not use something that violates it. However, people can now also define 'intermediate' contexts such as one that FOAF uses, which accepts the OWL semantics for sameAs and FunctionalProperty without committing to the entire OWL framework. And maybe we can even think up a way to have a context override a higher context, though that makes me nervous.

Defining contexts

None of this really says what a context "actually is", quite deliberately. We want to keep this notion as open as possible. There may be contexts which are simply identified by an IRI without any precise meaning or definition being supplied, where they would play the role simply of a kind of 'agreement marker' signifying the mutual acceptance of a vocabulary being used in common, a purely social role. (For example, imagine an RDF WIKI.) However, it is more likely that there will be a document or documents, perhaps even an ontology, itself perhaps written in RDF and hence being identifiable as an RDF graph, which is authoritative for the context in question. In a case like this, it would seem natural and in the spirit of http-range-14 to say that the context then is an information resource, and indeed to treat the graph or document as being the context, so that we can use the same URI for both of them. This suggests that we have the convention that when a URI context name returns a 200-coded response to a GET with payload that parses to RDF, then acceptance of that context amounts to the same ontological commitment as importing that RDF into the accepting graph. (This gives precise flesh to David Booth's notion of "URI declarations" [5] although it would be better to say 'vocabulary declarations'.) So we can use RDF - itself contextual, now - to define RDF contexts; and if all contexts were defined this way exclusively, then the whole picture would amount to little more than adding owl:imports to RDF under another name. But of course they aren't.

Part of this network of interlocking conventions, by the way, is one for naming graphs: you use a URI which resolves to a document which, when processed using the conventions specified by the normative RDF specs, parses to that graph. And then treating this also as a context, as we allowed earlier, can be viewed as a form of deferred reference [6]. Notice that a graph label in a SPARQL dataset is now *not* the name of the graph. (But see next section.)

A context MAY be defined by a document written in English or Chinese, or in some formalism not yet invented, or not by any document at all. Moreover, the defining document NEED NOT be accessible from the context-naming IRI itself. For example, maybe that IRI is linked to a document in some other piece of data somewhere. 

The default graph

What is the un-named graph in a SPARQL dataset supposed to be used for? As it seems to have many uses already, we shouldn't mandate a particular one, but we could say that there is a context for such graphs which, when the graph explicitly accepts it, specifies that the graph has a particular role in its containing dataset, such as (for example) being metadata for the other graphs in which (in this context, uniquely) the graph labels are being used to refer to the graph they label, rather than (in the rest of the dataset) to the context. Which illustrates how a context can change things, and also provides a handy way to add metadata using 'internal' labeling. So this might look like

{ < > rdf:accepts rdf:DatasetMeta .
:g1 rdf:type RDFGraph .
:g2 rdf:type RDFContainer .

:g1 { :bill rdf:type :Human . }
:g2 { :bill rdf:type :Employee . }
}

where everywhere except this default graph, :g2 refers to a time-interval context. 

Two extreme cases

One extreme is that there is only one context for all graphs (presumably the RDF context itself), and all graphs are asserted in this one context. That gives the "URIs are globally unique" vision that I learned with my mother's milk (well, from Tim B-L, in fact). Another extreme is the view that some of y'all were espousing early on in our WG discussions, that every graph is its own context, so that the meaning of a URI depends completely on which graph it is in. No wonder we were talking at cross purposes for a while. I think the truth has to be somewhere in between these, and Andy's "islands" idea was the clue for me. This framework gives a way to cruise between these extremes flexibly. 

I will try to get this into the wiki before this week's telecon.

Pat

PS. In case it's not obvious, this whole idea replaces my earlier suggestion in [4], and if adopted would render that moot.
PPS. Maybe this is what Richard and Antoine were suggesting all along? If so, I apologise for my slowness.


[1] http://lists.w3.org/Archives/Public/public-rdf-wg/2012Mar/0084.html
[2] http://www.ninebynine.org/RDFNotes/RDFContexts.html
[3] http://www.cyc.com/cycdoc/vocab/mt-expansion-vocab.html#genlMt
[4] http://lists.w3.org/Archives/Public/public-rdf-wg/2012Feb/0094.html
[5] http://dbooth.org/2007/uri-decl/
[6] http://en.wikipedia.org/wiki/Deferred_reference
[7] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal#Semantics
------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Tuesday, 13 March 2012 04:50:10 UTC