Re: rdfs:Graph ? comment on http://www.w3.org/TR/rdf11-concepts/#section-dataset and issue 35 from Jeremy J Carroll on 2013-09-17 (www-archive@w3.org from September 2013)

From: Jeremy J Carroll <jjc@syapse.com>
Date: Tue, 17 Sep 2013 09:41:51 -0700
To: Sandro Hawke <sandro@w3.org>
Cc: Pat Hayes <phayes@ihmc.us>, www-archive <www-archive@w3.org>
Message-Id: <B554BA75-A12E-408F-BF76-86C9F13B050B@syapse.com>
Reading the message below, I think the analogies that work for you are not so good for me.

My analogy was an rdfs:Class as opposed to a mathematical set
Pat's seems to be the ink forming the letter A as opposed to the first letter of the alphabet
Yours seems to be the hard drive containing a file as opposed to the abstract document contained.

I don't think we need get lost in that … it does seem that in terms of intent and code we end up on the same page. The text that we would like to agree on is (pretty obviously) not going to contain either my analogy or your analogy or Pat's analogy - but be much shorter, and be one where you can read it your way, and I can read it my way, and despite retaining significant disagreements we can both write code that agrees with the consensus text and interoperates.

Jeremy J Carroll
Principal Architect
Syapse, Inc.



On Sep 17, 2013, at 6:54 AM, Sandro Hawke <sandro@w3.org> wrote:

> On 09/17/2013 02:33 AM, Pat Hayes wrote:
>> (Aside. If I just hit "reply to all" on these messages, it automatically includes  <public-rdf-comments@w3.org>, even though this is not listed as a recipient.  /Aside)
> 
> (It's not even listed as a CC?   That sounds like a serious mail client bug....)
> 
>> I think I understand what Jeremy is getting at. If I remember correctly, we had very much this discussion back when we were drafting the original "named graphs" paper. Let me have a stab at explaining it.
> 
> And you do a great job, thanks.   I wish I knew where to archive emails like this one (and your one a few days ago to David Booth) which are exactly what I'd want to see when I search for "named graph" (or, in the case of your email to David, "model theory").
> 
> Anyway, building on what you say, let me push back a bit, although I'm largely agreeing with you.
> 
> Some of your analogies are to digital things, not physical things, and those don't actually give us a handle to get out of our messy unwanted entailment.    If we serialize some RDF Graph as the turtle character string "<http://example.com/a> <http://example.com/b> <http://example.com/c>." those 70 characters (or the 70 bytes which represent them in UTF-8) are in a sense more concrete than the graph itself, but they are still abstract enough to have the property we're trying to get away from: anything you say about them relates to anything I say about them, because there is only one "them".
> 
> I think physical metaphors which appeal strongly to our spatial intuition allow us to straighten this out, since humans seem very clear that the same physical object can't be in two places at once. When that turtle string is on my hard drive and your hard drive at the same time, that must mean we have two copies (each of which can now have its own properties).     And my CPU has two copies, perhaps, at memory locations 0x0400 and 0x0500.    (In many programming languages, we need to think about this a lot.)     And of course the copy in a file /tmp/demo1 is not the same thing as the copy in /tmp/demo2.  The bytes are the same, the characters are the same, the string is the same, but they are in different "files", and the files have different properties.
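
A minimal Python sketch of the two-copies point above, offered as an illustration only. The helper names `write_copy` and `read_bytes` and the temporary directory are assumptions made for this example; the file names echo the /tmp/demo1 and /tmp/demo2 of the message.

```python
import os
import tempfile

# The Turtle string quoted in the message.
TURTLE = "<http://example.com/a> <http://example.com/b> <http://example.com/c>."


def write_copy(directory: str, name: str, text: str) -> str:
    """Write one copy of the text into the directory and return its path."""
    path = os.path.join(directory, name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


def read_bytes(path: str) -> bytes:
    """Return the raw bytes stored in the file."""
    with open(path, "rb") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as workdir:
    first = write_copy(workdir, "demo1", TURTLE)
    second = write_copy(workdir, "demo2", TURTLE)

    # Same bytes, hence the same characters and the same string ...
    same_content = read_bytes(first) == read_bytes(second)
    # ... but two distinct files, each with its own path and metadata.
    same_file = os.path.samefile(first, second)
```

Here `same_content` is true while `same_file` is false: anything said about the string applies to both copies, while anything said about one file (its path, owner, modification time) need not apply to the other.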
> 
> Which brings me back to your notion of "surfaces"...    It seems to me the word "surface" has very strong physical connotations, so it brings with it strong intuitions which we can use to reach consensus on certain logical properties.  If we inscribe the same shape (eg the upper case Roman "A") on two different surfaces, clearly there are properties of the shape itself, which are properties of it on every surface, and those are clearly different from the properties of the surfaces themselves.     We also get an interesting third notion: the properties of inscriptions -- the markings of the shape on one surface vs another.    I guess this is what you're getting at in the conclusion of your email.
> 
> So, I love the idea of thinking about each graph-label in a dataset as denoting a surface, and the triples in the graph associated with that label in that dataset as the shapes inscribed on that surface.   That's what you have in mind here, right?
> 
> Further, I think it's a very nice model of the web, to think of URLs as denoting surfaces.   When I do a GET on that URL, the Internet tells me what's written on that surface.
> 
> Are we in agreement on this?
> 
> When I picture surfaces in general, I picture a hard smooth chunk of material, perhaps 1-2 sq ft., maybe pottery.   Usually a fragment of a large sphere.   (Did you draw them curved in your 2009 ISWC keynote?)    When I picture web surfaces, I find they've turned into CRTs (which have the same kind of curve), because I know full well web pages sometimes change in the blink of an eye.    This change doesn't violate any of my deep intuitions about surfaces -- the surface of a CRT is still a surface -- it's just a complicated, sometimes-changing surface.   Similarly, some surfaces have privacy screens so you can only see them from some angles, and some even have complex privacy screens so they look different from different angles (like those toys where the horse appears to run as you change the viewing angle).
> 
> So, my "boxes" and your "surfaces" are very similar.   The difference suggests that perhaps you tend to think of information being written on pages and I think of it being stored in databases.   With boxes, it's more likely there's stuff hiding in the corner you're not going to see until you search for it or "dump" out everything in the box.   And surfaces can naturally be read-only (a page in a book) or read-write (a chalkboard); in contrast, it's a bit of a stretch to imagine a box that always has the same contents (what you called a 'fixed g-box' and I call a 'static g-box').    As a programmer, I'm quite comfortable with the idea that some memory locations are read-only (in fact, the first computer on which I did machine language programming, as a child, had RAM from 0x0000 to 0x9FFF and then ROM from 0xA000 to 0xFFFF, as I recall), but I'll admit that's not the mainstream notion of a "box".
> 
> So, I'll stop here and wait for feedback that we're on the same page about this, before thinking about what to do about it.
> 
>      -- Sandro
> 
> 
> 
>> There are (speaking intuitively) RDF graphs all over the internet, represented using RDF surface syntaxes. RDF/XML documents, TriG documents, quad stores, etc. But these things are not, strictly speaking, graphs, even if we ignore the fact that they can be modified. Let's assume that they are all cast in stone, so they are not g-boxes. Still, they aren't graphs, because two of them can be different and yet describe the very same graph. So what are they, exactly? They are things that bear the same relation to graphs that a token of the letter "A" bears to the first letter of the English alphabet. Or, they are things that bear the same relation to graphs that actual physical copies of Moby Dick bear to the novel written by Melville. Or, they are things that bear the same relation to graphs that RDF classes bear to the sets that are their extensions. To all intents and purposes, they are just like the more abstract things, but there can be many of them corresponding to each one of those. They are exemplars, tokens, concretions, intensions, representations, ... choose your favorite analogy... of graphs.
>> 
>> When we wrote the named graph paper, we wanted the names to name these things rather than "abstract" graphs, because these are the things that one can store, transmit, copy and generally do processing on. These are the actual RDF data, and the RDF graphs are a kind of abstraction of them, something like the parse tree of a sentence as opposed to a copy of the sentence in an actual document. So we needed a way to define these things, but it wasn't easy to do that in a philosophically elegant way. So we used the quick and slightly dirty construction of pairing the graph with its name to represent the particular thing that the name names. This way, one can have two or more named graphs which are distinct, each has its own name distinct from the other names, but they are all copies of (tokens of, intensions of) the same actual graph. And this simple trick avoids the consequence that we wanted to avoid, and which your example illustrates, which is that if we really were naming the graphs, then my name for my graph would also become a name for your graph if you happened to have a copy of my graph. Which does not sit well with the idea of having dereferenceable names like URLs.
>> 
>> So, to emphasize, this is not going all the way to g-boxes. The idea was not to give a name to a box which just kind of happens to have a graph in it, or something with a state which can be changed with time. It was to give a name to the actual graph-like things, but with the understanding that we are talking about something more like actual datastructures, or actual documents, than mathematical abstractions. Things that can be put on websites, stored in a file at an address, given copy protections and pointed at using IRIs. Those are the graph-things that the names of named graphs were supposed to be naming.
>> 
>> Given the g-box discussion, we could identify these things with 'fixed' g-boxes whose state is not allowed to change, but I am less happy with this convention, as the g-box idea introduces the whole business of temporality, state change and so on, which is a huge can of worms that really is not relevant to the "intensional" notion that Jeremy is talking about here. So to introduce all this, then to immediately cancel it by saying the box is 'fixed', is confusing, and conceptual overkill. Personally, I like the letter-A analogy, and would be very happy to have the notion of a token of a graph, being any datastructure or document which encodes or parses to the graph. But not something with a state, not labile or dynamic, just as fixed and eternal as any other RDF notion. And if we do this, then we have a three-way relationship between a name, the graph token it names, and the graph exemplified by the graph token, and we can run your account of datasets without mentioning boxes or implying anything about change and time. Just replace "g-box" with "graph token" (or whatever we decide to call it. It is, of course, a named graph using the conventions from the original paper.) And then your g1/g2 example entailment does not hold, as I think it should not.
>> 
>> Pat
>> 
>> On Sep 16, 2013, at 5:19 PM, Jeremy J Carroll wrote:
>> 
>>> 
>>> 
>>> On Sep 11, 2013, at 8:14 PM, Sandro Hawke <sandro@w3.org> wrote:
>>> 
>>>> On 09/11/2013 06:21 PM, Jeremy J Carroll wrote:
>>>>> This section defines a vocabulary item rdf:Graph in addition to those in [RDF-SCHEMA].
>>>>> This is the class of resources that are RDF graphs. If a resource in this class is identified by an IRI, and that IRI is used to name a graph in a dataset, then within that dataset the resource SHOULD correspond to the named graph.
>>>> Does it not follow from this definition that:
>>>> 
>>>>    PREFIX : <http://example.org/#>
>>>>    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>>>    :g1 :p 1.
>>>>    :g1 a rdf:Graph.
>>>>    :g2 a rdf:Graph.
>>>>    GRAPH :g1 { :a :b :c }
>>>>    GRAPH :g2 { :a :b :c }
>>>> entails:
>>>>    :g2 :p 1.
>>>> 
>>>> (assuming the "SHOULD" is taken as something we can count on) ?
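
A toy Python model of the question above, as a hedged sketch: it checks a single hand-built interpretation and is not a model-theoretic proof of entailment. The names `DATASET`, `ASSERTED`, `denotes_graph`, `denotes_token` and `holds` are assumptions introduced for this illustration. The "token" reading follows the construction, mentioned elsewhere in this thread, of pairing a graph with its name.

```python
from typing import Callable, Dict, FrozenSet, Hashable, List, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]

# The dataset from the example: two labels, identical triples.
DATASET: Dict[str, Graph] = {
    ":g1": frozenset({(":a", ":b", ":c")}),
    ":g2": frozenset({(":a", ":b", ":c")}),
}

# The only statement asserted about a graph label in the default graph.
ASSERTED: List[Tuple[str, str, object]] = [(":g1", ":p", 1)]


def denotes_graph(label: str) -> Hashable:
    """Reading 1: the label denotes the abstract graph (a set of triples)."""
    return DATASET[label]


def denotes_token(label: str) -> Hashable:
    """Reading 2: the label denotes a (name, graph) pair, a 'graph token'."""
    return (label, DATASET[label])


def holds(statement: Tuple[str, str, object],
          denote: Callable[[str], Hashable]) -> bool:
    """True if some asserted statement says the same thing about the same resource."""
    subject, prop, value = statement
    target = denote(subject)
    return any(
        denote(s) == target and p == prop and o == value
        for s, p, o in ASSERTED
    )


graph_reading = holds((":g2", ":p", 1), denotes_graph)
token_reading = holds((":g2", ":p", 1), denotes_token)
```

In this sketch `graph_reading` is true, because both labels denote the one set of triples, while `token_reading` is false, because the two (name, graph) pairs differ in their first component.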
>>> 
>>> Hi Sandro
>>> 
>>> this is an excellent question, and one that I take it motivates your discussion of box-model on the WG mailing list.
>>> 
>>> I am not very comfortable with a YES, but, given the text I suggested, a YES it would be.
>>> 
>>> In essence I think I want an intensional semantics rather than an extensional semantics, suggested text below; I start with philosophical discussion.
>>> 
>>> In maths, we typically refer to Sets with intensional semantics, in RDF we refer to classes with extensional semantics.
>> You have this exactly backwards, which is rather confusing :-)
>> 
>>> So if I have a class
>>> 
>>> jjc:Friends rdf:type rdfs:Class ;
>>>       rdfs:comment "Jeremy's friends" .
>>> 
>>> and also a class
>>> 
>>> jjc:SandrosFriends rdf:type rdfs:Class ;
>>>       rdfs:comment "Sandro's friends" .
>>> 
>>> in the unlikely event that we have exactly the same friends, RDF semantics does not confuse the intent.
>> Right. Classes in RDF might have the same members but still be distinct. So RDF classes are not mathematical sets. But what is your point here? Is this a problem? (Why?)
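
A short Python sketch of the point about classes and their extensions, under stated assumptions: the members `ex:alice` and `ex:bob` are hypothetical, `class_extension` is a helper invented for this example, and comparing names as strings is only a stand-in for asking whether two class resources are identical.

```python
from typing import FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]

JEREMYS = "jjc:Friends"
SANDROS = "jjc:SandrosFriends"

# Hypothetical data in which both classes happen to have the same members.
TRIPLES: Set[Triple] = {
    (JEREMYS, "rdf:type", "rdfs:Class"),
    (SANDROS, "rdf:type", "rdfs:Class"),
    ("ex:alice", "rdf:type", JEREMYS),
    ("ex:bob", "rdf:type", JEREMYS),
    ("ex:alice", "rdf:type", SANDROS),
    ("ex:bob", "rdf:type", SANDROS),
}


def class_extension(cls: str) -> FrozenSet[str]:
    """Collect every subject typed with the given class."""
    return frozenset(s for s, p, o in TRIPLES if p == "rdf:type" and o == cls)


# As mathematical sets, the two extensions are one and the same set.
same_members = class_extension(JEREMYS) == class_extension(SANDROS)
# The classes themselves stay distinct: nothing merges them because
# their members coincide.
same_class = JEREMYS == SANDROS
```

Here `same_members` is true and `same_class` is false, which is the sense in which the classes are not identified with their extensions.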
>> 
>>> A view would be that RDF Semantics achieves this by moving the semantic intent more to the property rdf:type …
>>> 
>>> So, we could scrub the idea of having a class, and instead define a property.
>>> 
>>> An alternative proposed modification, which clarifies my desired NO to your entailment
>>> 
>>> [[
>>> 3.7 The rdf:namesGraph property
>>> 
>>> This section defines a vocabulary item rdf:namesGraph in addition to those in [RDF-SCHEMA].
>>> 
>>> rdf:namesGraph is an instance of rdf:Property that is used to state that a resource is a name for a graph.
>>> 
>>> A triple of the form:
>>> 
>>> R rdf:namesGraph G
>>> 
>>> states that G is an RDF graph and R is a name for the graph G.
>>> If R is an IRI, and that IRI is used to name a graph in a dataset, then within that dataset the resource G SHOULD correspond to the named graph.
>>> 
>>> The rdfs:domain of rdf:namesGraph is rdfs:Resource. No rdfs:range is specified.
>>> ]]
>>> 
>>> 
>>> ===
>>> 
>>> With this my particular use case to add metadata about the graph as an intensional as opposed to an extensional object would be addressed as follows.
>>> 
>>>     PREFIX : <http://example.org/#>
>>>     PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>>          GRAPH :g1 { :g1 rdf:namesGraph _:g ; rdfs:comment "An example graph" }
>>>    
>>> 
>>> 
>>> Jeremy J Carroll
>>> Principal Architect
>>> Syapse, Inc.
>>> 
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 home
>> 40 South Alcaniz St.            (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile (preferred)
>> phayes@ihmc.us       http://www.ihmc.us/users/phayes
>> 
Received on Tuesday, 17 September 2013 16:42:25 UTC