Re: defn of Named Graph from Pat Hayes on 2013-09-18 (www-archive@w3.org from September 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 17 Sep 2013 23:57:14 -0700
To: Sandro Hawke <sandro@w3.org>
Cc: Jeremy J Carroll <jjc@syapse.com>, www-archive <www-archive@w3.org>
Message-Id: <EAAF7B90-9D7A-447C-9AAA-D0335FDE0323@ihmc.us>
On Sep 17, 2013, at 2:51 PM, Sandro Hawke wrote:

> Following that epiphany I had at the end of my last email, here's what I'd love to see everyone agree on, more or less:

Yes, but I'd like to expound it slightly differently. But basically, I like this. 

> 
> == Named Graphs
> 
> An "RDF Named Graph" is similar to an "RDF Graph", but different in one important way.    Because RDF Graphs are defined as being mathematical sets of RDF Triples, any two RDF Graphs which happen to contain the same RDF Triples are, by definition, the same thing. This means that statements made about any RDF Graph, such as metadata about provenance and licenses, necessarily apply wherever the same set of RDF Triples occurs.   This is not always the desired intent, and Named Graphs provide an alternative.
> 
> Like an RDF Graph, an RDF Named Graph contains zero or more RDF Triples.  Unlike an RDF Graph, an RDF Named Graph has an identity distinct from those triples.

I don't think that is the best way to describe it. The named graph isn't a different kind of container from a set (but with the same triples in it). It's more like this: a named graph is a concrete thing which exemplifies the same abstract structure as an RDF graph. Or, put another way, the RDF graph is the syntactic structure of the actual object that is the named graph, just like the parse tree of a sentence written in a book.

Also, the terminology "named graph" is a bit confusing since these things need not be named, right? Why not call them "graph tokens" or some such? Or "concrete graphs"? (That is "concrete" in contrast to "abstract", not in the Portland cement sense.)

>  That is, two Named Graphs remain distinct and distinguishable entities even if they happen to contain exactly the same RDF Triples.

Right, although again I think "contain" is not quite the best phrasing. 
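
The type/token distinction discussed above can be sketched in Python. This is only an illustration under stated assumptions, not anything defined by the RDF specifications: `RDFGraph` and `GraphToken` are hypothetical names, an RDF graph is modelled as a frozenset of (subject, predicate, object) tuples, and a graph token is an ordinary object whose identity is independent of the graph it exemplifies.

```python
from dataclasses import dataclass

# Assumption: an abstract RDF graph is modelled as an immutable set of
# (subject, predicate, object) tuples, so equality is decided by the triples.
RDFGraph = frozenset


@dataclass(eq=False)  # eq=False keeps Python's default identity-based equality
class GraphToken:
    """Hypothetical 'named graph': a concrete thing exemplifying an RDF graph."""
    graph: frozenset


everest = ("ex:MtEverest", "ex:heightFeet", "29029")

# Two RDF graphs built separately from the same triples are the same graph.
g_a = RDFGraph([everest])
g_b = RDFGraph([everest])

# Two tokens of that graph remain distinct things.
token_1 = GraphToken(g_a)
token_2 = GraphToken(g_b)

same_abstract_graph = g_a == g_b                  # True: one type
same_token = token_1 == token_2                   # False: two tokens
same_structure = token_1.graph == token_2.graph   # True: both exemplify it
```

Metadata attached to `token_1` would therefore not automatically apply to `token_2`, whereas anything said about `g_a` is necessarily said about `g_b`.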

> The term "Named Graph" has historically caused some confusion, as some people have read the phrase to mean "an RDF Graph which happens to have a name".   This reading is not correct, since RDF Named Graphs are not RDF Graphs at all.  

They are awfully closely related to RDF graphs, though. We get the same looseness when talking about other token/type cases. For example, take the following two sentences. 

There are six letter Ts in this sentence.
The letter T, the 20th letter in the English alphabet, derives from the Phoenician letter Taw, written as a cross. 

Both sentences make perfect sense, but the first one is about the tokens (g-boxes, named graphs, surfaces) while the second is about the type (RDF graph). In normal English we call them both "letter T" without getting confused. If someone were to come along and say that letter Ts in this email were "not letter Ts at all", that would be a rather unhelpful exercise in pedantry. 

>  They might reasonably have been called "Identifiable Graphs", which contrasts them to "RDF Graphs" in the same way that a counterfeit dollar bill is not technically a dollar bill.

Ouch, that is a truly awful analogy. Why not go the whole hog and call them "fake graphs"? That will probably lead to a really quick uptake in usage :-).

>   As in the dollar bill analogy, RDF Named Graphs and RDF Graphs have a lot in common, but in some circumstances it is critical to distinguish between them.    Other names that have been suggested for Named Graphs include "surfaces" and "g-boxes", but "named graph" has been cemented by its use in the SPARQL syntax.

OK, good. 

> 
> Named Graphs also provide a useful semantics for RDF Datasets.  Some RDF Datasets, hereafter NG Datasets, have this intended meaning: each (_name_, _graph_) pair is a statement that _name_ is a Named Graph which contains exactly the triples in _graph_.   

I prefer the phrasing: ... which represents the RDF graph _graph_. Or 'exemplifies' or 'is a copy of' the RDF graph. 

>  The class rdf:NGDataset is defined for signalling these are the intended Dataset semantics.
> 
> The class rdf:NamedGraph is defined for use in declaring the domain and range of predicates which relate Named Graphs.   For example:
> 
>  <> a rdf:NGDataset
>  GRAPH :g1 { :MtEverest :heightFeet 29002 }
>  GRAPH :g2 { :MtEverest :heightFeet 29029 }
>  :g1 :claimedBy :BritishIndiaSurveyOffice.
>  :g2 :claimedBy :IndiaSurveyOffice.
> 
> Here, the domain of :claimedBy is rdf:NamedGraph, and it might be defined in English as "x :claimedBy y means that all the triples in the Named Graph x are claimed to be true by the social entity y."
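
Read this way, the example dataset can be sketched as a mapping from names to graphs, with :claimedBy relating the names rather than the triple sets. A minimal Python illustration, assuming plain tuples for triples; `dataset`, `claimed_by`, and `claims_of` are hypothetical names and not part of any RDF vocabulary or library.

```python
# Assumption: each triple is a (subject, predicate, object) tuple, with the
# values copied from the example above.
dataset = {
    ":g1": frozenset({(":MtEverest", ":heightFeet", 29002)}),
    ":g2": frozenset({(":MtEverest", ":heightFeet", 29029)}),
}

# :claimedBy relates the name of a Named Graph to a social entity.
claimed_by = {
    ":g1": ":BritishIndiaSurveyOffice",
    ":g2": ":IndiaSurveyOffice",
}


def claims_of(entity):
    """Collect every triple in the Named Graphs claimed by `entity`."""
    triples = set()
    for name, claimant in claimed_by.items():
        if claimant == entity:
            triples |= dataset[name]
    return triples


british_claims = claims_of(":BritishIndiaSurveyOffice")
india_claims = claims_of(":IndiaSurveyOffice")
```

The two offices end up with different height claims because the claim is attached to :g1 or :g2, not to a set of triples.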
> 

I don't like this next paragraph. Mostly because I think that the right way to talk about change is indeed to have a notion like a 'container' or a 'source' which contains the triples, but still the thing that it contains is a named graph rather than a graph. You can't put a set into a container any more than you can put a (Platonic, mathematical) number in an address register. The 

BTW, the 1.1 Concepts LC draft has this paragraph:

"We informally use the term RDF source to refer to a persistent yet mutable source or container of RDF graphs. An RDF source is a resource that may be said to have a state that can change over time. A snapshot of the state can be expressed as an RDF graph. For example, any web document that has an RDF-bearing representation may be considered an RDF source. Like all resources, RDF sources may be named with IRIs and therefore described in other RDF graphs."

so maybe we should follow this and use the term "RDF source" from now on? 
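
The draft's wording can be sketched roughly as follows. This is an illustration only: `RDFSource` and its methods are hypothetical names, the IRI is made up, and a snapshot is modelled as a frozenset of (subject, predicate, object) tuples.

```python
class RDFSource:
    """Hypothetical mutable source whose state can be snapshotted as a graph."""

    def __init__(self, iri):
        self.iri = iri          # like all resources, a source may have an IRI
        self._triples = set()   # the state that can change over time

    def add(self, triple):
        self._triples.add(triple)

    def remove(self, triple):
        self._triples.discard(triple)

    def snapshot(self):
        """Return the current state as an immutable set of triples."""
        return frozenset(self._triples)


source = RDFSource("http://example.org/everest")  # hypothetical IRI
source.add((":MtEverest", ":heightFeet", 29002))
before = source.snapshot()

# The source changes; the earlier snapshot does not.
source.remove((":MtEverest", ":heightFeet", 29002))
source.add((":MtEverest", ":heightFeet", 29029))
after = source.snapshot()
```

Here `source` is the one thing that persists and changes, while `before` and `after` are two different, unchanging graphs.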

> The greatest differences between RDF Graphs and RDF Named Graphs appear when one considers the possibility of them changing over time.    It is nonsensical to consider an RDF Graph changing over time, just like it makes no sense to talk about the value of some integer, say seven, changing over time.   In contrast, it makes perfect sense to consider Named Graphs changing: at one point in time the identifiable thing that is a certain Named Graph contains some triples and at another point in time it might contain different triples.  As of RDF 1.1, however, the formal specifications for RDF do not provide any specific support for handling changing data.
> 
> ==
> 
> That's simple and clear enough, isn't it?     ( ... he says, clinging to perhaps his last shred of hope. )

Yes, except that the basic idea can be explained better by the letter analogy than by the counterfeit dollar bill analogy, IMO. (Or the Moby Dick analogy, which might be better, because we can intuitively say that a named graph is a single copy of the graph, just as a particular book is a single copy of the actual novel.)

Pat


> 
>       -- Sandro
> 
> 
> 
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 home
40 South Alcaniz St.            (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile (preferred)
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Wednesday, 18 September 2013 06:57:47 UTC