Re: [GRAPHS] g-box - abstraction or concrete? from Nathan on 2011-02-27 (public-rdf-wg@w3.org from February 2011)

From: Nathan <nathan@webr3.org>
Date: Sun, 27 Feb 2011 21:47:14 +0000
To: nathan@webr3.org
CC: Ivan Herman <ivan@w3.org>, public-rdf-wg WG <public-rdf-wg@w3.org>, Sandro Hawke <sandro@w3.org>, Pat Hayes <phayes@ihmc.us>, Manu Sporny <msporny@digitalbazaar.com>
Message-ID: <4D6AC662.60808@webr3.org>
top post, purely to save scrolling, for context read quoted mail if you 
haven't already..

actually, if we simply added "quoted graphs" to turtle, then the trig 
use case would be as simple as:

   <uri-a> x:graph { ... } .
   <uri-b> x:graph { ... } .

or you could account for time by doing:

   { ... } x:from "http://..." ; x:retrieved "2011-02-27T21:41:12" .

and so forth (or swap time for some version number), that way blank node 
scoping doesn't need to change, it could remain at snapshot level (quoted 
graph being a snapshot-realization of course).
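
A rough Python sketch of the idea above, in case it helps: a quoted graph is treated as an immutable set of triples, and the x:from / x:retrieved statements become a small record about it. The names QuotedGraph, Retrieval and quote, the example.org URL and the second timestamp are all made up for illustration, they are not part of Turtle, TriG or N3.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# A triple as plain strings; a real implementation would use proper RDF terms.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class QuotedGraph:
    """An immutable set of triples: a snapshot-realization."""
    triples: FrozenSet[Triple]


@dataclass(frozen=True)
class Retrieval:
    """Hypothetical record for: { ... } x:from <source> ; x:retrieved <when> ."""
    graph: QuotedGraph
    source: str
    retrieved: str


def quote(*triples: Triple) -> QuotedGraph:
    """Build a quoted graph from the given triples."""
    return QuotedGraph(frozenset(triples))


# Two snapshots taken from the same (placeholder) source at different times.
early = Retrieval(
    quote(("<a>", "<b>", "<c>")),
    "http://example.org/doc",
    "2011-02-27T21:41:12",
)
late = Retrieval(
    quote(("<a>", "<b>", "<c>"), ("<a>", "<b>", "<d>")),
    "http://example.org/doc",
    "2011-02-28T09:00:00",
)
```

Here the source is the same for both records but the quoted graphs differ, so nothing has to be said about a box changing over time; each record only describes one snapshot.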

This seems like the simplest way forward to me, add quoted graphs, 
variables and fix the literals as subjects thing, that should align 
turtle with SPARQL too afaict.. (boo hiss? literals as subjects I won't 
fight over, as one could simply swap media type to n3 and have them..)

Best,

Nathan

Nathan wrote:
> Ivan Herman wrote:
>> On Feb 25, 2011, at 21:48 , Nathan wrote:
>> [snip]
>>> a g-box is a container of statements which form a particular view of 
>>> a subset of the universe of discourse, the container is stateful such 
>>> that it (potentially) contains different statements over time, at any 
>>> one time the statements in the container form a set which can be 
>>> considered the current state of that container (g-snap) and they form 
>>> a current view of the particular subset of the universe of 
>>> discourse which they describe.
>>>
>>> A g-box is a stateful abstraction whose state is managed by an 
>>> abstract protocol, the abstract protocol is realized via various 
>>> machine protocols which manage the state of the g-box via messages 
>>> and pass full or partial representations of the current state 
>>> (g-snap) in various lexical forms (g-texts).
>>>
>>> A g-box can be given a name, and when a g-box is given a name the 
>>> name becomes a namespace since the g-box is a container, and this 
>>> namespace serves as the scope for all things within the g-box 
>>> (statements/names/nodes). Thus a named-g-box becomes an Aristotelian 
>>> abstraction where the current state of that named-g-box forms a 
>>> particular scoped view of a subset of the universe of discourse.
>>>
>>> Since a g-box is an abstraction, it cannot be duplicated or 
>>> replicated (I'm tempted to say a g-box is a Platonic abstraction and 
>>> a named g-box is an Aristotelian abstraction), however two g-boxes 
>>> can share the same name(s) and machine protocols can be used to try 
>>> and synchronize the current state of the g-boxes sharing the same 
>>> name such that they all offer the same view of the subset of the 
>>> universe of discourse which they describe. This process can be seen 
>>> as forking a g-box at its current state to create a new g-box with 
>>> the same current-state (g-snap), then pulling/pushing changes to the 
>>> state in order to keep them aligned and sharing the same view / 
>>> saying the same thing.
>>>
>>> make sense?
>>
>>
>> Hm. That is not exactly the way I understood things although it may 
>> not be so far off after all... I just try to extrapolate.
>>
>> Going back a bit to the root of the discussions, ie, Pat's mail:-): he 
>> talks about abstract graphs which, in my mind, is the same as Sandro's 
>> g-snaps. These are mathematical abstractions, ie, sets, which never 
>> exist in the real world. If you want to go back to the Greek world 
>> (and I am not ashamed to say that I may be wrong in my understanding 
>> of the Greek philosophers), in my mind a g-snap is an ideal, a 
>> Platonic abstraction. And a g-text is a textual description of a g-snap.
> 
> agree
> 
>> A g-box is a shadow of a g-snap in Plato's cave allegory; a concrete, 
>> tangible representation of a g-snap. Well, it is a little bit smarter 
>> because when poked, it can give you some sort of a representation of a 
>> g-snap (eg, in the form of a g-text). But no, for me a g-box is not an 
>> abstraction, it is a real thing somewhere. Because it is a real thing, 
>> it can have a name, and two g-boxes are different things even if they 
>> represent the same g-snaps.
> 
> agree to some level, although just as a g-snap is a platonic abstraction 
> which has a realization (g-text), so I think a g-box is an abstraction 
> which has a realization (usually in some form of computer memory) - the 
> distinction I make between a g-box and a g-snap is that a g-snap is a 
> snapshot of the contents of a g-box, its state at a particular time, 
> whereas a g-box is a container which spans time and has different 
> contents/states at different times. The key in your text above is "a 
> representation of a g-snap" (not "the representation of the g-snap"), so 
> the relation between g-snap and g-text is 1:N (many representations, 
> g-texts, of the same g-snap) and the relation between a g-box and g-snap 
> is also 1:N over time (one g-snap at a single time, many g-snaps over 
> time). As for poking, well it's the realization of the abstract g-box 
> which can be poked to get a g-text.
> 
> So, I conclude that previously we had one term "RDF Graph" and used it 
> in specs to refer to both an abstract graph (g-snap) and a 
> realization of it (g-text), Pat's original issue. Then I brought up that 
> it was also being used to refer to other things which we later 
> established to be g-boxes. And now I'm saying that we're doing the same 
> thing with g-boxes as we did with RDF Graphs, using the one term to 
> refer to both the abstraction and the realization of it.
> 
>> Trying to use the terminology... if I have something like
>>
>> { <a> <b> <c> }
>>
>> in SPARQL (or n3), what is it? Is it a particular g-text representing 
>> a g-snap? Probably...
>>
>> if I describe a rule (I use N3 syntax here because it is simpler than 
>> RIF would be, but that is just syntax):
>>
>> { ?a <b> <c> } => { <e> <f> ?a }
>>
>> what does it mean in our terminology? Both sides describe a pattern 
>> for a family of g-snaps. Would one say that
>>
>> "If a g-box's g-snap matches the rule's left hand side, then extend 
>> the g-box's g-snap to include the right hand side"? 
> 
> The thing on the left and the thing on the right are what I'd refer to 
> as quoted-graphs (or graph literals) in N3, each one being a g-text (a 
> realization) of a g-snap (an abstract graph). The thing in the middle is 
> of course a predicate/property using shorthand notation given by n3, it 
> represents a logical constant / named node which itself is being used as 
> a relation.
> 
> The full statement itself, is a g-text (a realization) of a g-snap (an 
> abstract graph) which contains only one statement, use case for this 
> particular statement is that it's going to be used as a rule.
> 
> To move forward with your use case in detail, I'm going to use four new 
> terms:
> 
> box - an abstract box which can contain statements, and whose contents 
> can vary over time
> 
> box-realization - a realization of a box, some process coupled to some 
> memory which can manage realizations of the box's state/contents and 
> change the state from one to another, change the contents of the box.
> 
> snapshot - an abstract snapshot of the state/contents of a box at time 
> t, a mathematical set of statements, a g-snap
> 
> snapshot-realization - a realization of a snapshot, a distinct immutable 
> collection of triples in memory, or some lexical representation of them, 
> a g-text
> 
>   Issue 1:
>   Snapshot-realizations are anonymous and there is no way to tell that
>   two snapshot-realizations realize snapshots of the state of the same
>   box, or to tell which state (Sn-1, Sn-5) they are snapshots of.
> 
>   Thus, in order to incorporate the concepts of box or box-realization
>   in to RDF, some form of box identification, and some form of state
>   identification would need to be added.
> 
>   If the two prior needs are not added, then the only notion of boxes
>   that can exist is that of a realization of the current state of
>   some anonymous box; which is the definition of a snapshot-realization,
>   thus pointless adding.
> 
> Okay, so you've given us a rule (R)
> 
>   { ?a <b> <c> } => { <e> <f> ?a }
> 
> Now, (thanks Ivan) you've given us a rule which has variable identifiers 
> in it, so we better clear up what variable identifiers are too, and 
> blank node identifiers whilst we're here (so as not to confuse the two).
> 
> A Blank Node Identifier is a temporary reference, bound to a blank node at 
> a particular time - since it's an identifier it belongs in the 
> realization space, and since it's a temporary identifier it belongs in 
> the snapshot space, thus blank node identifiers are scoped to 
> snapshot-realizations. Blank Nodes are therefore only existentially 
> quantified within snapshots.
> 
>   Issue 2:
>   Blank Nodes are only existentially quantified within snapshots, which
>   means they aren't quantified at box level, which means they can't
>   exist at box level.
> 
>   Thus, in order to incorporate the concepts of box or box-realizations
>   in to RDF, the semantics of blank node identifiers and their scope
>   of existential quantification would need to be changed. B.C. break.
> 
> Is it worth continuing this line of thought? boxes clearly do exist in 
> semantic web land (sparql update of "named graphs" for example, and the 
> need for "graph changes over time"), but they don't currently exist in 
> RDF, and the two issues listed above are far from minor. Even if we 
> incorporate boxes, both sparql and the web don't provide for any notion 
> of time, and even if we did work out a way to have the concept of states 
> over time in there, we'd need to change to a temporal logic.
> 
> My take away on this, is that if people want "named graphs" we can only 
> accommodate "named snapshot realizations", which means that if you find 
> at some point two different snapshot-realizations bearing the same name, 
> well frankly you're up the creek without a paddle! We could provide for 
> "quoted graphs" which would allow people to describe what a 
> snapshot-realization is (retrieved from here at date x etc) but then 
> we're moving more towards N3 (a good thing imho).
> 
> Another choice is to formalize what's required for the presence of 
> boxes, such that boxes exist and can be given names, but you can only 
> ever "get" the current state of the box (thus negating the need for 
> mentioning state, state changes or moving to temporal logic), this would 
> make room for other specs to piece together layers of the cake such as 
> some dataset synchronization method, or say adding versioning meta data 
> to http responses in order to cater for this need. The only thing that 
> would need to be addressed for this would be the scope of blank node 
> quantification.
> 
> Personally, I'd say let's go for adding quoted graphs, variables, add 
> the concept of box but only ever account for the current state, and 
> scope blank node identifiers to being at box level. This would allow for 
> the community to cover all use cases either in or out of RDF and layer 
> on other bits to the sem web stack where needed. Practically this could 
> be quoted graphs added to turtle, and some trig like format which could 
> refer to a named-box and show the snapshot realization of the current 
> state of that box.
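
To make the "box, but only ever the current state" option a bit more concrete, here is a minimal Python sketch. It assumes a box is just a named mutable container and a snapshot is an immutable copy of its contents; the Box class and its add / remove / current methods are hypothetical names for illustration, not anything defined by RDF or SPARQL.

```python
from typing import FrozenSet, Set, Tuple

# A triple as plain strings; a real implementation would use proper RDF terms.
Triple = Tuple[str, str, str]
Snapshot = FrozenSet[Triple]


class Box:
    """A named, stateful container; only its current state can be read."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._triples: Set[Triple] = set()

    def add(self, triple: Triple) -> None:
        """Change the state of the box by adding a statement."""
        self._triples.add(triple)

    def remove(self, triple: Triple) -> None:
        """Change the state of the box by removing a statement, if present."""
        self._triples.discard(triple)

    def current(self) -> Snapshot:
        """Return an immutable copy of the current contents."""
        return frozenset(self._triples)


box = Box("<uri-a>")
box.add(("<a>", "<b>", "<c>"))
first = box.current()

box.add(("<a>", "<b>", "<d>"))
second = box.current()
```

The box keeps no history: first and second are separate snapshots, and once the box has changed there is no way to ask it for first again. Anything like versioning or synchronization would have to be layered on by some other spec, as suggested above.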
Received on Sunday, 27 February 2011 21:48:59 UTC