- From: Seaborne, Andy <Andy_Seaborne@hplb.hpl.hp.com>
- Date: Thu, 6 Jun 2002 15:58:02 +0100
- To: "'Graham Klyne'" <GK@NineByNine.org>
- Cc: "'RDF Interest'" <www-rdf-interest@w3.org>
OK - good idea - concrete example.

Application A is querying a (local) graph. It is, in some sense, within the graph because it can get nodes, traverse arcs, etc. Query is returning statements and resources, including bNodes. The application can use the results from one query or graph access to drive another graph access by having the bNodes flow from the results of one query into the next. Usual sort of RDF API stuff.

Now, suppose application A wants to do the same thing with an RDF graph that is on another machine. It is a large remote graph - ideally application A wants to interface to the graph via programmatic access, not read the whole large graph over to be used locally. It would be nice if there were some way to construct an infrastructure that can do this over the web. The information that has to cross the net is such that the application sees the same ability to query, traverse, and manipulate the graph as it did locally.

To do this, the on-the-wire form has to contain information to pass parts of the graph over the wire, including round-tripping things like bNodes. Local isomorphism at bNodes of graphs isn't enough, especially when it comes to update.

It seems to me that a serialization of RDF that could capture the graph (shared bNodes and all) would be useful here. MIDs would help.

Andy

-----Original Message-----
From: Graham Klyne [mailto:GK@NineByNine.org]
Sent: 6 June 2002 14:52
To: Andy Seaborne
Cc: 'RDF Interest'
Subject: RE: N3 and N-Triples (was: RDF in HTML: Approaches)

At 01:18 PM 6/6/02 +0100, Andy Seaborne wrote:
> > the model theory is quite clear that
> > bnodes are not identified with anything outside the graph in which
> > they appear
>
>This is the key to me. My understanding of the scope of the graph is
>limited.

If you mean that the graph defines a limited scope for a bnode, yes I agree. (I'm not sure what it means to say the scope of a graph.)
>If I have an in-memory graph, and I write it to a serialized form then
>read it in again (same machine or different machine, same process,
>different process), why do I get a different set of bNodes? I guess
>this is asking whether the graph-in-the-file is the same
>graph-in-the-memory. Never did understand this.

I guess that is the question: are they truly the same graph, or are they two graphs that happen (at some time) to be isomorphic? A couple of ways of thinking about this occur to me:

(a) model theory: can the different presentations be simultaneously contemplated under different interpretations? If so, I'd suggest they are different.

(b) mutation: if one of the presentations is updated, does that update propagate to the other presentation? If not, they are different.

>It is a real nuisance when using an RDF graph over the network where the
>application is using a graph on a different machine. They are talking
>about the same graph. But they can't. Unless I use language dependent
>RPC!

Well, a graph is just syntax - a description of some presumed reality. Do they describe the same reality? Do they mutually entail? That's what ultimately matters, I think. Having different bnodes in different graph instances in no way weakens mutual entailment between two graphs: if the graphs are otherwise identical, the interpretations that satisfy one are exactly the interpretations that satisfy the other.

> > if you start introducing identifiers that describe bNodes from
> > "outside", you (a) need to have a way of scoping them to a
> > particular graph instance, or (b) be very sure that they are unique.
>
>Both (a) and (b) could be done as backwards compatible RDF syntax but
>it is a change to the syntax. e.g. for (b) a syntax that is
>"bnode@<uuid>". This is not pretending to be a URI - the space of URIs
>and this space are disjoint. It is just a syntactic labelling of
>variables for the purposes of serialization.

Sure...
I wasn't suggesting this be done, just trying to explain why introducing such external identifiers was problematic.

> > "minimal identifying description" (MID)
>
>Seems fine but let's go the whole way and have the URI for a node as a
>property as a MID :-)

This takes us into the whole Skolemization debate, which others explain far better than I. E.g. discussions of Skolemization in http://www.w3.org/TR/rdf-mt/. Note how carefully the Skolemization lemma has to be stated to be logically valid.

>When processing the RDF I find that strictly I need to handle bNodes
>with isomorphism tests in the absence of such MIDs. Labelled nodes
>have an MID called their URI.

The problem here is that two isomorphic graphs containing bnodes just do not (in general) contain the information that corresponding bnodes denote the same value. Adding URIs to the bnodes in a graph imposes such a result.

I don't think we're going to make a lot more progress on this debate without being more specific about exactly what it is that you want to achieve.

#g
--

PS: one possible "out" occurs to me: if graphs themselves are considered resources that can be labelled with URIs (e.g. like formulae in N3), then we could assert that two graph presentations with the same URI were indeed the very same graph. Then, the graphs must be isomorphic, or we have a nonsense (any graph must be isomorphic with itself, right?). And then it is reasonable to say that the corresponding bnodes under graph isomorphism are indeed the same node.

The prime difficulty with this that I see is how to account for two graph presentations with the same URI that are not isomorphic: reject it as nonsense (unsatisfiable)? Introduce a more subtle account of how presentations relate to the underlying graph (but how then to determine isomorphism)? I think there could be a rathole here.
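To make the "same graph, or two isomorphic graphs?" question above concrete, here is a minimal Python sketch. It is not taken from any RDF toolkit: `BNode`, `read_graph`, `isomorphic`, the `ex:` terms and the whitespace-separated triple format are all assumptions invented for illustration. It shows two reads of the same text producing distinct bnodes, and a brute-force search for a bnode bijection that still finds the two graphs isomorphic.

```python
from itertools import permutations


class BNode:
    """Blank node: its identity is the object itself, never a global name."""

    def __init__(self, label):
        self.label = label  # file-local label, kept only for debugging


def read_graph(lines):
    """Parse 'subject predicate object' lines (hypothetical toy format).

    Labels of the form '_:x' are scoped to this single call, so every
    read mints fresh bnodes.
    """
    local = {}  # label -> BNode, valid for this read only

    def term(token):
        if token.startswith("_:"):
            return local.setdefault(token, BNode(token))
        return token

    return {tuple(term(t) for t in line.split()) for line in lines}


def bnodes(graph):
    """All distinct bnodes mentioned anywhere in the graph."""
    return [x for x in {t for triple in graph for t in triple}
            if isinstance(x, BNode)]


def isomorphic(g1, g2):
    """Brute-force search for a bnode bijection mapping g1 onto g2.

    Factorial in the number of bnodes, so suitable for tiny graphs only.
    """
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    for perm in permutations(b2):
        mapping = dict(zip(b1, perm))
        if {tuple(mapping.get(x, x) for x in t) for t in g1} == g2:
            return True
    return False


DOC = [
    "_:a ex:name 'Andy'",
    "_:a ex:knows _:b",
    "_:b ex:name 'Graham'",
]

g1 = read_graph(DOC)
g2 = read_graph(DOC)
same_triples = g1 == g2            # False: the two reads share no bnodes
same_shape = isomorphic(g1, g2)    # True: a bnode bijection exists
```

Note that the bijection found by `isomorphic` is a property of the two presentations taken together; neither graph on its own records which bnode in the other it "corresponds" to, which is the point made in the reply above.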
>-----Original Message-----
>From: Graham Klyne [mailto:GK@NineByNine.org]
>Sent: 6 June 2002 12:04
>To: Seaborne, Andy
>Cc: 'RDF Interest'
>Subject: RE: N3 and N-Triples (was: RDF in HTML: Approaches)
>
>
>At 10:51 AM 6/6/02 +0100, Seaborne, Andy wrote:
> >If an RDF processor reads in the same file twice, are the bNodes the
> >same or different?
>
>I'd say "different".
>
> >For compatibility with current RDF syntax, implicit bNodes in the
> >current syntax yield different bnodes in the graph created. But
> >there is a choice as to whether an explicit bNode (one labeled in the
> >syntax) is scoped to the file read operation (and hence creates
> >different bNodes) or whether they get unique labels in the disjoint
> >space.
> >
> >If RDF is to be exchanged between systems across a network using a
> >serialization then the latter is desirable. It means part of the
> >system (an RDF application) on one machine can talk about the bNodes
> >on another machine (the source of the graph).
>
>That sounds rather dodgy to me -- the model theory is quite clear that
>bnodes are not identified with anything outside the graph in which they
>appear -- if you start introducing identifiers that describe bNodes
>from "outside", you (a) need to have a way of scoping them to a
>particular graph instance, or (b) be very sure that they are unique.
>
>Because of the way that bNode semantics are defined (essentially, as
>existential variables), I don't think it really matters if you have
>different bnodes in different places as long as the associated
>statements about them are "isomorphic" -- there's some recent
>discussion in the DAML list about "minimal identifying description"
>(MID) between Richard Fikes and Peter Patel-Schneider that might have
>some bearing.
>I don't know where the web archive is, but look for messages starting
>about:
>
>[[
>Date: Fri, 24 May 2002 15:39:41 -0700
>From: Richard Fikes <fikes@ksl.stanford.edu>
>To: Joint Committee <joint-committee@daml.org>
>Subject: New DQL Specification
>Content-Type: multipart/mixed;
> boundary="------------C8A05097584B9E8F59A89C7A"
>]]
>
>#g
>
>
>-------------------
>Graham Klyne
><GK@NineByNine.org>

-------------------
Graham Klyne
<GK@NineByNine.org>
Received on Thursday, 6 June 2002 10:58:17 UTC
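The thread's second option, giving bnodes unique labels in a space disjoint from URIs (the "bnode@<uuid>" syntax), can be sketched in Python as follows. Everything here is an assumption made for illustration: `RemoteGraph`, its `match` method and the `ex:` terms are hypothetical, and the sketch ignores the semantic objections raised in the replies. It only shows how such labels would let a bnode returned by one query flow into the next when the graph lives on another machine.

```python
import uuid


class BNode:
    """Blank node; its identity is the Python object, not a name."""


class RemoteGraph:
    """Hypothetical graph holder that exposes bnodes under wire labels."""

    def __init__(self, triples):
        self.triples = set(triples)
        self._label = {}  # BNode -> wire label
        self._node = {}   # wire label -> BNode

    def _out(self, term):
        """Replace a bnode by a stable 'bnode@<uuid>' label on the way out."""
        if isinstance(term, BNode):
            if term not in self._label:
                label = "bnode@" + str(uuid.uuid4())
                self._label[term] = label
                self._node[label] = term
            return self._label[term]
        return term

    def _in(self, term):
        """Map a known wire label back to its bnode; pass other terms through."""
        return self._node.get(term, term)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern, in wire form (None = wildcard)."""
        want = tuple(None if t is None else self._in(t) for t in (s, p, o))
        return [tuple(self._out(x) for x in triple)
                for triple in self.triples
                if all(w is None or w == x for w, x in zip(want, triple))]


person, friend = BNode(), BNode()
graph = RemoteGraph({
    (person, "ex:name", "Andy"),
    (person, "ex:knows", friend),
    (friend, "ex:name", "Graham"),
})

# The first query returns a labelled bnode ...
[(subj, _, _)] = graph.match(p="ex:name", o="Andy")
# ... and that label drives the next two queries.
[(_, _, known)] = graph.match(s=subj, p="ex:knows")
names = [o for (_, _, o) in graph.match(s=known, p="ex:name")]
```

In this sketch the labels are meaningful only to the `RemoteGraph` instance that minted them, which corresponds to scoping them to a particular graph instance; relying on UUID uniqueness alone would be the other alternative mentioned in the thread.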