Re: [GRAPHS] g-box, g-snap, and g-text from Nathan on 2011-02-25 (public-rdf-wg@w3.org from February 2011)

From: Nathan <nathan@webr3.org>
Date: Fri, 25 Feb 2011 18:37:17 +0000
To: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
CC: public-rdf-wg <public-rdf-wg@w3.org>
Message-ID: <4D67F6DD.2080207@webr3.org>

Hi Pierre,

Some notes:

Pierre-Antoine Champin wrote:
> Some more thoughts on Sandro's g-* terminology, and trying to cut the 
> gordian blank node (sorry, I couldn't resist making that pun!...).
> 
> For the moment, as I read it, the idea seems to be that a g-box contains 
> a g-snap, which can be expressed by a g-text, graphically:
> 
>   g-box  -->  g-snap  -->  g-text
> 
> However, the g-snap is an abstract thing, which has no counterpart in 
> the physical or digital world, while g-boxes and g-texts do. Another way 
> to look at it is that the value (or state) of g-box is a g-text -- it 

the value (or state) of a g-box can be represented by a g-text

> may not be in a serialization format, but it *is* essentially a sequence 
> of bytes representing a g-snap. So the links would rather look like
> 
>   g-box  -->  g-text  -->  g-snap
> 
> The impact is that parsing or serializing become mere *translations* 
> from one g-text to another g-text representing an equivalent g-snap.

Indeed, it's essentially representational state transfer: the state 
(g-snap) of a g-box can be viewed and manipulated via representations of 
the state (g-texts). Although we may map or expose this in various 
real-world protocols, there is an idealized abstract protocol at work 
here, very much in line with the model of a protocol in TimBL's Paper 
Trail design issue [1]: "The model is that a protocol P defines a status 
s_n as a function of a message m and a previous state s_(n-1), and the 
time t. s_n = P(m_n, s_(n-1), t)". But that could be digressing a bit!
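
As a rough illustration of the Paper Trail formula quoted above, the 
state of a g-box can be sketched as a pure function of a message, the 
previous state, and the time. The "add"/"remove" message shapes, the type 
names, and the `protocol` function are assumptions made for this sketch; 
none of them come from the design issue or from any specification.

```python
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]
GSnap = FrozenSet[Triple]      # an immutable set of triples
Message = Tuple[str, GSnap]    # hypothetical shape: ("add" | "remove", triples)


def protocol(message: Message, previous: GSnap, t: float) -> GSnap:
    """Sketch of s_n = P(m_n, s_(n-1), t); t is accepted but unused here."""
    op, triples = message
    if op == "add":
        return previous | triples   # the union yields a new g-snap
    if op == "remove":
        return previous - triples
    raise ValueError(f"unknown operation: {op}")


s0: GSnap = frozenset({("ex:alice", "foaf:name", '"Alice"')})
m1: Message = (
    "add",
    frozenset({("ex:alice", "foaf:mbox", "<mailto:alice@example.org>")}),
)
s1 = protocol(m1, s0, t=1.0)   # s0 itself is left untouched
```

Because each g-snap is an immutable value here, every message produces a 
new state rather than altering an old one.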

I'm not sure we can change the order here as you write. I'd rather 
consider the g-box as having one or more managers, which we can ask to 
tell us what's in the box, or to add or remove something from the box, 
by sending them a message (whether that's an HTTP message, a SPARQL 
query, or some line of code). That's why I mention protocol in the 
previous paragraph: the state of the g-box is a function of the messages 
we send to its manager(s).
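
To make the "manager" idea concrete, here is a minimal sketch in which 
the manager is the only way to look into or change the box. The class 
name `GBoxManager`, its methods, and the one-triple-per-line text format 
are all hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]


class GBoxManager:
    """Hypothetical manager: the only way to look into or change the box."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()   # mutable contents of the g-box

    def add(self, triples: Set[Triple]) -> None:
        self._triples |= triples

    def remove(self, triples: Set[Triple]) -> None:
        self._triples -= triples

    def snapshot(self) -> FrozenSet[Triple]:
        """Return the current state (g-snap) as an immutable value."""
        return frozenset(self._triples)

    def represent(self) -> str:
        """Return a g-text: one 'subject predicate object .' line per triple."""
        return "\n".join(f"{s} {p} {o} ." for s, p, o in sorted(self._triples))


manager = GBoxManager()
manager.add({("<http://example.org/a>", "<http://xmlns.com/foaf/0.1/name>", '"A"')})
before = manager.snapshot()
manager.add({("<http://example.org/a>", "<http://xmlns.com/foaf/0.1/nick>", '"a"')})
after = manager.snapshot()
```

The box changes over time, while `before` and `after` remain two distinct, 
fixed g-snaps, and `represent()` hands out a g-text of whatever the 
current state is.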

> Note that I write "equivalent", not "the same", because all those 
> processes will *not be required* to preserve blank node identity.
> 
> They *may* preserve it in some cases, but as they are not required to, 
> it would be a mistake to rely on them doing so. It implies that, in 
> general, blank nodes should be assumed to be scoped to a given g-text.

Hmm, blank nodes are definitely in the box. If Sn (the current state, or 
current g-snap) contains some blank nodes, and we send a message to add 
two new triples, then the two sets of triples are unioned to create a 
new state/g-snap in the box: a new state which comprises the same 
nodes/triples plus the two new triples.

Blank Node Identifier scoping is a tricky one, because the identifiers 
can either be scoped to the g-text level only (such that you can 
re-assemble or duplicate the state of the box when you transfer it), or 
be a property of the blank nodes within the box and persist over time. 
But then they wouldn't be "blank" nodes: they'd be named, and thus 
universally quantified within the universe of discourse rather than just 
being existentially quantified.
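
A small sketch of the first option, where identifiers exist only at the 
g-text level: inside the box a blank node is just an anonymous object, 
and a label is invented each time a g-text is written. The `BNode` class, 
the `serialize` function, and the `prefix` parameter are assumptions made 
for this example.

```python
import itertools
from typing import Dict, Iterable, Tuple, Union


class BNode:
    """An anonymous node: it has identity, but no name."""


Term = Union[str, BNode]
Triple = Tuple[Term, str, Term]


def serialize(triples: Iterable[Triple], prefix: str = "b") -> str:
    """Write a g-text, inventing blank node labels for this text only."""
    labels: Dict[BNode, str] = {}
    counter = itertools.count()

    def show(term: Term) -> str:
        if isinstance(term, BNode):
            if term not in labels:
                labels[term] = f"_:{prefix}{next(counter)}"
            return labels[term]
        return term

    return "\n".join(f"{show(s)} {p} {show(o)} ." for s, p, o in triples)


someone = BNode()
graph = [
    (someone, "rdf:type", "foaf:Person"),
    (someone, "foaf:name", '"Alice"'),
]
text_a = serialize(graph, prefix="b")       # labels the node _:b0
text_b = serialize(graph, prefix="genid")   # labels the same node _:genid0
```

The same node gets a different label in each g-text, and neither label is 
stored on the node itself.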

> This is a change from what has been written until now, where blank nodes 
> are scoped to g-boxes. But it better reflects my frustrating experience 
> with querying a g-box containing blank nodes, related below.
> 
> 
> Consider a g-box, which is not changing, to which you ask several 
> queries in a row. For example:
> 
>   # QUERY1: retrieve all persons with a mailbox
>   SELECT ?p ?b WHERE { ?p a foaf:Person ; foaf:mbox ?b . }
> 
> Now for some of the persons that I retrieved, I would like to get more 
> information, by asking
> 
>   # QUERY2: retrieve additional information for a person
>   SELECT * WHERE { % foaf:name ?n }
>   # where '%' above will be replaced by one of the nodes retrieved
>   # in QUERY1
> 
> For all URI nodes I retrieved, the generic query above works fine. But 
> for all the blank nodes, it is useless. Worse, it will return all the 
> foaf:names of *any* of the resources in the store (because a blank node 
> matches anything).

Is that a problem? You've asked for the foaf:name of anything which 
exists and has a foaf:name - it seems correct.
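
A minimal sketch of why this happens, assuming a toy in-memory store and a 
single-pattern matcher (the `STORE` data, `is_open`, and `match` are 
invented for illustration and are not any SPARQL engine's API). In a 
query pattern a blank node behaves like a variable, so it matches any 
term:

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical store contents.
STORE: List[Triple] = [
    ("<http://example.org/alice>", "foaf:name", '"Alice"'),
    ("<http://example.org/bob>", "foaf:name", '"Bob"'),
    ("<http://example.org/alice>", "foaf:mbox", "<mailto:alice@example.org>"),
]


def is_open(term: str) -> bool:
    """Variables (?x) and blank nodes (_:x) both match any term in this sketch."""
    return term.startswith("?") or term.startswith("_:")


def match(pattern: Triple, store: List[Triple]) -> List[Triple]:
    """Return the triples matching a single triple pattern."""
    return [
        triple
        for triple in store
        if all(is_open(p) or p == t for p, t in zip(pattern, triple))
    ]


# QUERY2 with a URI substituted for '%': only that resource's name.
by_uri = match(("<http://example.org/alice>", "foaf:name", "?n"), STORE)
# QUERY2 with a blank node label substituted for '%': every name in the store.
by_bnode = match(("_:b0", "foaf:name", "?n"), STORE)
```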

> This is frustrating when you consider blank nodes to be scoped to the 
> store, because you may assume that, as the store has not changed, the 
> blank nodes are still valid for subsequent queries.
>
> There are two ways to explain this apparent paradox:
> 
> 1/ (considering blank nodes are scoped to g-boxes)
> When retrieving the SPARQL results, you are parsing them and storing 
> them locally in your own g-box, hence changing the blank nodes into 
> local ones. That is why you can no longer use them when querying the 
> distant store.

If it's a CONSTRUCT, then the server is creating a new temporary g-box on 
your behalf, taking a g-snap of it, and serializing it as a g-text to 
pass to you. Any blank node identifiers are just tools of the g-text 
format which allow you to recompose the state of the g-box locally.

> 2/ (considering blank nodes are scoped to g-texts)
> The blank nodes conveyed in the SPARQL results are local to the 
> serialization of the result, hence distinct from the internal 
> representation of the store.

Yes, the Blank Node Identifiers themselves create the paradox when they 
are seen to be names for the blank node (a name for something unnamed). 
In reality, the identifier is a property of the g-text, not a property of 
the blank node.
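
One way to see this: two g-texts that differ only in their blank node 
labels describe equivalent g-snaps. The sketch below assumes a toy 
"subject predicate object ." line format and checks equivalence by brute 
force over label renamings, so it is only practical for very small 
graphs; `parse`, `bnodes`, and `equivalent` are hypothetical helpers.

```python
from itertools import permutations
from typing import FrozenSet, List, Tuple

Triple = Tuple[str, str, str]


def parse(g_text: str) -> FrozenSet[Triple]:
    """Parse the toy 'subject predicate object .' lines used in this sketch."""
    triples = set()
    for line in g_text.strip().splitlines():
        s, p, o = line.rstrip(" .").split(" ", 2)
        triples.add((s, p, o))
    return frozenset(triples)


def bnodes(graph: FrozenSet[Triple]) -> List[str]:
    """Collect the blank node labels used in subject or object position."""
    return sorted({t for s, _, o in graph for t in (s, o) if t.startswith("_:")})


def equivalent(a: FrozenSet[Triple], b: FrozenSet[Triple]) -> bool:
    """True if some one-to-one renaming of blank node labels turns a into b."""
    labels_a, labels_b = bnodes(a), bnodes(b)
    if len(a) != len(b) or len(labels_a) != len(labels_b):
        return False
    for candidate in permutations(labels_b):
        rename = dict(zip(labels_a, candidate))
        renamed = frozenset(
            (rename.get(s, s), p, rename.get(o, o)) for s, p, o in a
        )
        if renamed == b:
            return True
    return False


text_1 = "_:x rdf:type foaf:Person .\n_:x foaf:mbox <mailto:a@example.org> ."
text_2 = "_:genid7 rdf:type foaf:Person .\n_:genid7 foaf:mbox <mailto:a@example.org> ."
same_triples = parse(text_1) == parse(text_2)             # labels differ
same_graph = equivalent(parse(text_1), parse(text_2))     # structure agrees
```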

> Both explanations work, so both points of view (g-box scope vs. g-text 
> scope) stand up to the paradox. However, I think the 2nd one is more natural.

Likewise re the 2nd one. Quite enjoying these discussions :)

[1] http://www.w3.org/DesignIssues/PaperTrail

Best,

Nathan
Received on Friday, 25 February 2011 18:38:33 UTC