Re: [GRAPHS] g-box, g-snap, and g-text from Pierre-Antoine Champin on 2011-02-25 (public-rdf-wg@w3.org from February 2011)

From: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
Date: Fri, 25 Feb 2011 16:21:39 +0100
To: public-rdf-wg <public-rdf-wg@w3.org>
Message-ID: <4D67C903.30705@liris.cnrs.fr>

Some more thoughts on Sandro's g-* terminology, and trying to cut the 
Gordian blank node (sorry, I couldn't resist making that pun!...).

For the moment, as I read it, the idea seems to be that a g-box contains 
a g-snap, which can be expressed by a g-text, graphically:

   g-box  -->  g-snap  -->  g-text

However, the g-snap is an abstract thing, which has no counterpart in 
the physical or digital world, while g-boxes and g-texts do. Another way 
to look at it is that the value (or state) of a g-box is a g-text -- it 
may not be in a serialization format, but it *is* essentially a sequence 
of bytes representing a g-snap. So the links would rather look like:

   g-box  -->  g-text  -->  g-snap

The impact is that parsing and serializing become mere *translations* 
from one g-text to another g-text representing an equivalent g-snap. 
Note that I write "equivalent", not "the same", because none of these 
processes is *required* to preserve blank node identity.

They *may* preserve it in some cases, but as they are not required to, 
it would be a mistake to rely on them doing so. This implies that, in 
general, blank nodes should be assumed to be scoped to a given g-text.
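
To make "equivalent, not the same" concrete, here is a minimal Python 
sketch. Everything in it is an assumption made for illustration: triples 
are plain tuples of strings, blank nodes are strings starting with "_:", 
and reserialize() is a hypothetical stand-in for any parse/serialize 
step that is free to relabel blank nodes. The equivalent() check is 
brute force and only suitable for tiny graphs.

```python
from itertools import permutations

def is_bnode(term):
    return term.startswith("_:")

def bnodes(graph):
    """Sorted list of the blank node labels used in a set of triples."""
    return sorted({t for triple in graph for t in triple if is_bnode(t)})

def reserialize(graph, prefix):
    """Hypothetical translation step: same triples, fresh blank node labels."""
    mapping = {b: f"_:{prefix}{i}" for i, b in enumerate(bnodes(graph))}
    return {tuple(mapping.get(t, t) for t in triple) for triple in graph}

def equivalent(g1, g2):
    """True if some bijection between blank node labels maps g1 onto g2."""
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    for perm in permutations(b2):
        mapping = dict(zip(b1, perm))
        if {tuple(mapping.get(t, t) for t in triple) for triple in g1} == g2:
            return True
    return False

original = {
    ("_:a", "rdf:type", "foaf:Person"),
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
}
translated = reserialize(original, "gen")

print(translated == original)            # False: the labels differ
print(equivalent(original, translated))  # True: same graph up to relabeling
```

The two g-texts compare unequal label by label, yet they describe the 
same g-snap. Code that keeps "_:a" around and expects to find it in the 
translated g-text is relying on something nobody promised.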

This is a change from what has been written so far, where blank nodes 
are scoped to g-boxes. But it better reflects my frustrating experience 
with querying a g-box containing blank nodes, described below.


Consider a g-box that is not changing, against which you run several 
queries in a row. For example:

   # QUERY1: retrieve all persons with a mailbox
   SELECT ?p ?b WHERE { ?p a foaf:Person ; foaf:mbox ?b . }

Now, for some of the persons that I retrieved, I would like to get more 
information by asking:

   # QUERY2: retrieve additional information for a person
   SELECT * WHERE { % foaf:name ?n }
   # where '%' above will be replaced by one of the nodes retrieved
   # in QUERY1

For all URI nodes I retrieved, the generic query above works fine. But 
for all the blank nodes, it is useless. Worse, it will return the 
foaf:names of *all* the resources in the store (because a blank node 
in a query pattern acts like a variable, and so matches anything).
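
The effect can be reproduced with a toy matcher in Python. This is only 
a sketch, not how any real SPARQL engine is implemented: match(), 
is_open() and the sample data are made up for illustration, and only a 
single triple pattern is handled.

```python
def is_open(term):
    # In a query pattern, both ?variables and blank nodes can bind to anything.
    return term.startswith("?") or term.startswith("_:")

def match(pattern, store):
    """Return the triples in `store` that match one triple pattern."""
    results = []
    for triple in store:
        if all(is_open(p) or p == t for p, t in zip(pattern, triple)):
            results.append(triple)
    return results

# Hypothetical store content: one URI node and two blank nodes.
store = [
    ("http://example.org/alice", "foaf:name", "Alice"),
    ("_:b0", "foaf:name", "Bob"),
    ("_:b1", "foaf:name", "Carol"),
]

# QUERY2 with a URI substituted for '%': one precise answer.
by_uri = match(("http://example.org/alice", "foaf:name", "?n"), store)

# QUERY2 with a blank node label substituted for '%': the label is just
# another open slot, so every foaf:name triple matches.
by_bnode = match(("_:b0", "foaf:name", "?n"), store)

print(len(by_uri))    # 1
print(len(by_bnode))  # 3
```

Substituting the label "_:b0" does not pick out "the" node the first 
query returned; it turns the subject position into a wildcard.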

This is frustrating when you consider blank nodes to be scoped to the 
store, because you may assume that, as the store has not changed, the 
blank nodes are still valid for subsequent queries.

There are two ways to explain this apparent paradox:

1/ (considering blank nodes are scoped to g-boxes)
When retrieving the SPARQL results, you are parsing them and storing 
them locally in your own g-box, hence changing the blank nodes into 
local ones. That is why you can no longer use them when querying the 
distant store.

2/ (considering blank nodes are scoped to g-texts)
The blank nodes conveyed in the SPARQL results are local to the 
serialization of the result, hence distinct from the internal 
representation of the store.
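
Explanation 2/ can be sketched in Python as follows. This is a toy model 
under stated assumptions, not a description of any actual store: the 
Store class, its select_subjects() method and the "_:rNnM" label scheme 
are all invented here. Internal blank nodes are opaque objects without 
labels, and labels are minted afresh each time a result is serialized.

```python
import itertools

class Store:
    """Toy g-box: internal blank nodes are opaque objects, never exposed."""

    def __init__(self):
        self._responses = itertools.count()
        bob = object()  # internal blank node: it has no label at all
        self._triples = [(bob, "foaf:mbox", "mailto:bob@example.org")]

    def select_subjects(self, predicate):
        """Serialize one result set; blank node labels are per response."""
        response_id = next(self._responses)
        labels = {}
        rows = []
        for s, p, o in self._triples:
            if p != predicate:
                continue
            if not isinstance(s, str):
                # Mint a label that is only meaningful inside this response.
                s = labels.setdefault(s, f"_:r{response_id}n{len(labels)}")
            rows.append(s)
        return rows

store = Store()
first = store.select_subjects("foaf:mbox")
second = store.select_subjects("foaf:mbox")

print(first, second)
print(first == second)  # False, although the store has not changed
```

The store is unchanged between the two calls, yet the labels differ, 
because each label belongs to the g-text of one response and not to the 
g-box itself.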


Both explanations work, so both points of view (g-box scope vs. g-text 
scope) can account for the paradox. However, I think the second one is 
more natural.

   pa

Received on Friday, 25 February 2011 15:22:08 UTC