Re: N3 and N-Triples (was: RDF in HTML: Approaches) from Patrick Stickler on 2002-06-07 (www-rdf-interest@w3.org from June 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Fri, 07 Jun 2002 10:27:36 +0300
To: ext Andy Seaborne <andy.seaborne@hp.com>, "'Graham Klyne'" <GK@ninebynine.org>
CC: RDF Interest <www-rdf-interest@w3.org>
Message-ID: <B9263B18.164A9%patrick.stickler@nokia.com>
On 2002-06-06 15:18, "ext Andy Seaborne" <andy.seaborne@hp.com> wrote:

> 
>> the model theory is quite clear that
>> bnodes are not identified with anything outside the graph in which
>> they appear
> 
> This is the key to me.  My understanding of the scope of the graph is
> limited.
> 
> If I have an in-memory graph, and I write it to a serialized form then
> read it in again (same machine or different machine, same process,
> different process), why do I get a different set of bNodes?  I guess
> this is asking whether the graph-in-the-file is the same
> graph-in-the-memory.  Never did understand this.
> 
> It is a real nuisance when using RDF graph over the network where the
> application is using a graph on a different machine.  They are talking
> about the same graph.  But they can't.  Unless I use language dependent
> RPC!
>
>> if you start introducing identifiers that describe bNodes from
>> "outside", you (a) need to have a way of scoping them to a particular
>> graph instance, or (b) be very sure that they are unique.
> 
> Both (a) and (b) could be done as backwards compatible RDF syntax but it
> is a change to the syntax.  e.g. for (b) a syntax that is "bnode@<uuid>".
> This is not pretending to be a URI - the space of URIs and this space
> are disjoint.  It is just a syntactic labelling of variables for the
> purposes of serialization.
> 

A while back I was thinking about this, and what came to mind
was this same idea of using UUIDs as a standard identifier for bnodes
in N-Triples, which would also work with the rdf:name convention.

For blank nodes that have no specified name, you can automatically
insert a UUID-based name when serializing the graph. Then, if you
re-parse that graph, you get the same nodes.
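
As a rough sketch of that round trip (Python, with every name below invented
for illustration): the BNode class, the "uuid" + hex label shape, and the
`known` lookup table are all assumptions, not part of N-Triples or of any
existing RDF toolkit. The point is only that a label written at serialization
time lets a later parse resolve to the same node.

```python
import uuid


class BNode:
    """A blank node. `label` stays None until a label is assigned."""

    def __init__(self, label=None):
        self.label = label


def term(t):
    """Render one term; URIs and literals are passed through as strings."""
    if isinstance(t, BNode):
        if t.label is None:
            # Assumed label shape: "uuid" + 32 hex digits, which fits the
            # alphanumeric bnode names that N-Triples allows.
            t.label = "uuid" + uuid.uuid4().hex
        return "_:" + t.label
    return t


def serialize(triples):
    """Write (subject, predicate, object) tuples as N-Triples-style lines."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


def parse(text, known):
    """Read the lines back. `known` maps labels to BNode objects, so a label
    that was seen before resolves to the same node instead of a fresh one."""

    def resolve(token):
        if not token.startswith("_:"):
            return token
        label = token[2:]
        if label not in known:
            known[label] = BNode(label)
        return known[label]

    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        body = line[:-1].rstrip()  # drop the terminating "."
        s, p, o = body.split(" ", 2)  # the object may itself contain spaces
        triples.append((resolve(s), resolve(p), resolve(o)))
    return triples
```

A system that keeps its own label table gets its original node objects back;
a different system starting from an empty table at least gets nodes carrying
the same labels, which is what lets two machines refer to the same bnode.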

I was a little uncomfortable with this idea because such an
approach, while allowing a kind of round-tripping of graphs with
bnodes out from and back into a system, would mean that UUIDs
essentially serve the same function as URIs, and the nodes would
no longer be "blank", so to speak.

There is, though, an important distinction: because the UUID
"local" names and URIs are syntactically disjoint, it is possible
to strip all non-essential UUIDs from the serialization if one
wants a "clean" export.
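
One possible reading of such a "clean" export, sketched in Python: since a
bnode that appears in more than one triple still needs some label to tie the
triples together, the UUID labels are swapped for short document-local ones
rather than removed outright. The "_:uuid" + 32 hex digits pattern and the
clean_export name are assumptions made for this example, and the regular
expression does not try to skip matches inside literals.

```python
import re

# Assumed shape of a UUID-based bnode label. URIs are written inside <...>
# and never start with "_:", so the two cannot be confused syntactically.
UUID_BNODE = re.compile(r"_:uuid[0-9a-f]{32}\b")


def clean_export(ntriples_text):
    """Replace UUID-based bnode labels with short local ones (_:b1, _:b2, ...)."""
    local = {}

    def relabel(match):
        label = match.group(0)
        if label not in local:
            local[label] = f"_:b{len(local) + 1}"
        return local[label]

    return UUID_BNODE.sub(relabel, ntriples_text)
```

Each distinct UUID label maps to one local label, so the exported graph keeps
its shape while losing the identifiers that were only there for round-tripping.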

This may warrant further consideration...

Cheers,

Patrick



> "minimal identifying description" (MID)
> 
> Seems fine but let's go the whole way and have the URI for a node as a
> property as a MID :-)
> 
> When processing the RDF I find that strictly I need to handle bNodes
> with isomorphism tests in the absence of such MIDs.  Labelled nodes
> have an MID called their URI.
> 
> Andy
> 
> -----Original Message-----
> From: Graham Klyne [mailto:GK@NineByNine.org]
> Sent: 6 June 2002 12:04
> To: Seaborne, Andy
> Cc: 'RDF Interest'
> Subject: RE: N3 and N-Triples (was: RDF in HTML: Approaches)
> 
> 
> At 10:51 AM 6/6/02 +0100, Seaborne, Andy wrote:
>> If an RDF processor reads in the same file twice, are the bNodes the
>> same or different?
> 
> I'd say "different".
> 
>> For compatibility with current RDF syntax, implicit bNodes in the
>> current syntax yield different bnodes in the graph created.  But there
>> is a choice as to whether an explicit bNode (one labeled in the syntax)
>> is scoped to the file read operation (and hence creates different
>> bNodes) or whether they get unique labels in the disjoint space.
>> 
>> If RDF is to be exchanged between systems across a network using a
>> serialization then the latter is desirable.  It means part of the
>> system (an RDF application) on one machine can talk about the bNodes on
>> another machine (the source of the graph).
> 
> That sounds rather dodgy to me -- the model theory is quite clear that
> bnodes are not identified with anything outside the graph in which they
> appear -- if you start introducing identifiers that describe bNodes from
> "outside", you (a) need to have a way of scoping them to a particular
> graph instance, or (b) be very sure that they are unique.
> 
> Because of the way that bNode semantics are defined (essentially, as
> existential variables), I don't think it really matters if you have
> different bnodes in different places as long as the associated
> statements about them are "isomorphic" -- there's some recent discussion
> in the DAML list about "minimal identifying description" (MID) between
> Richard Fikes and Peter Patel-Schneider that might have some bearing.
> I don't know where the web archive is, but look for messages starting
> about:
> 
> [[
> Date: Fri, 24 May 2002 15:39:41 -0700
> From: Richard Fikes <fikes@ksl.stanford.edu>
> To: Joint Committee <joint-committee@daml.org>
> Subject: New DQL Specification
> Content-Type: multipart/mixed;
> boundary="------------C8A05097584B9E8F59A89C7A"
> ]]
> 
> #g
> 
> 
> -------------------
> Graham Klyne
> <GK@NineByNine.org>
> 
> 

--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Friday, 7 June 2002 03:23:33 UTC