Re: Pragmatics of Blank Nodes Re: Toward easier RDF: a proposal from David Booth on 2018-12-04 (semantic-web@w3.org from December 2018)

From: David Booth <david@dbooth.org>
Date: Tue, 4 Dec 2018 15:30:53 -0500
To: semantic-web <semantic-web@w3.org>, Aiden Hogan <aidhog@gmail.com>
Cc: Henry Story <henry.story@bblfish.net>
Message-ID: <407257c3-08cd-dd82-966e-925fe4c8a1b7@dbooth.org>

On 12/3/18 8:29 AM, Henry Story wrote:
 > . . .  So what are the advantages of blank nodes
 > pragmatically? They make a description local to the graph
 > in which they appear and this locality is maintained
 > across merges. The meaning of URI referenced resources can
 > be completed by external information of course but the
 > description ensures that no further links need to be taken
 > into account when understanding the bnode's meaning. So it
 > looks like it's ideal for things that need to be entirely
 > defined by description.

Interesting point!  That means that blank nodes enjoy a
form of closed world assumption (CWA), in that no other
triples *can* be asserted (directly) about a blank node
beyond the ones already in the document/graph/dataset
at hand.  (Inference could add some, though.)

Of course, if we are dealing with implicit blank nodes -- the ones 
generated by [] or () notation in Turtle -- then it's even more obvious 
that the only property connections to/from that blank node are the ones 
provided right there.

This brings me to an interesting question.  To rephrase, the "identity" 
of a blank node object is determined entirely by the identities of its 
connected nodes, because it is guaranteed not to have any other
connections.  Therefore, a blank node labeling algorithm (or standard 
Skolemization algorithm) only needs to take into account the subgraph of 
that blank node's tightly connected neighbors.  By "tightly connected" I 
mean the subgraph that is connected only through consecutive blank 
nodes.  (I think this may be slightly different from the Concise Bounded 
Description (CBD), because the CBD starts only with the *subject* of a 
triple.)
https://www.w3.org/Submission/CBD/
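
To make "tightly connected" concrete, here is a minimal Python sketch of
how I would extract that subgraph.  It is only an illustration under
stated assumptions: triples are plain (s, p, o) tuples of strings, blank
nodes are the strings that start with "_:", and the names is_bnode and
bnode_component are made up for this example, not taken from any library.

```python
from collections import deque

def is_bnode(term):
    # Assumption for this sketch: blank nodes are strings starting with "_:".
    return isinstance(term, str) and term.startswith("_:")

def bnode_component(triples, start):
    """Return every triple reachable from `start` by walking only through
    blank nodes, in either direction (subject or object position)."""
    seen = {start}
    queue = deque([start])
    component = set()
    while queue:
        node = queue.popleft()
        for s, p, o in triples:
            if s == node or o == node:
                component.add((s, p, o))
                # Keep walking only through blank nodes; IRIs and
                # literals are the boundary of the subgraph.
                for other in (s, o):
                    if is_bnode(other) and other not in seen:
                        seen.add(other)
                        queue.append(other)
    return component

graph = {
    ("ex:alice", "ex:address", "_:a"),
    ("_:a", "ex:city", "ex:Boston"),
    ("_:a", "ex:geo", "_:g"),
    ("_:g", "ex:lat", "42.36"),
    ("ex:Boston", "ex:country", "ex:USA"),
    ("ex:bob", "ex:knows", "ex:alice"),
}
component = bnode_component(graph, "_:a")
```

Here the result holds the four triples that touch _:a or _:g.  It
includes the incoming ex:address triple, in which _:a is the object, and
it stops at ex:Boston and ex:alice, so the ex:country and ex:knows
triples are left out.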

Aiden (or someone else), is this correct?  If so, this would be very 
beneficial, because the labeling algorithm could then be guaranteed to 
generate the *same* label (or Skolem URI) for the blank nodes in that 
subgraph, regardless of any larger graph in which that subgraph appears.
This is very pertinent to n-ary relations, because it means that blank
nodes for the same n-ary relation, appearing in different RDF graphs, 
could be automatically given the *same* label (or Skolem URI) -- even 
without knowing a key for that object.  Aiden, is this what such 
canonicalization algorithms already do?
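
As a rough illustration of what I have in mind, here is a Python sketch
for the simplest case only: an n-ary relation that is a single blank node
whose neighbors are all IRIs or literals.  Everything in it is an
assumption made for the example: triples are (s, p, o) tuples of strings,
blank nodes start with "_:", simple_skolem_label is a made-up name, and
the urn:example:skolem: prefix is hypothetical.  It does not attempt the
hard part, which is several interconnected blank nodes.

```python
import hashlib

def is_bnode(term):
    # Assumption for this sketch: blank nodes are strings starting with "_:".
    return isinstance(term, str) and term.startswith("_:")

def simple_skolem_label(triples, bnode):
    """Derive a label for `bnode` from the triples that touch it.
    Handles only a blank node with no blank-node neighbors."""
    parts = []
    for s, p, o in triples:
        if s == bnode or o == bnode:
            other = o if s == bnode else s
            if is_bnode(other):
                raise ValueError("sketch handles only an isolated blank node")
            direction = "out" if s == bnode else "in"
            parts.append((direction, p, other))
    # Sorting makes the result independent of triple order and of the
    # blank node's local label, which never enters the hash.
    digest = hashlib.sha256(repr(sorted(parts)).encode("utf-8")).hexdigest()
    return "urn:example:skolem:" + digest[:16]

graph1 = {
    ("_:m", "ex:patient", "ex:p1"),
    ("_:m", "ex:drug", "ex:aspirin"),
    ("_:m", "ex:dose", "100mg"),
}
graph2 = {
    ("_:x", "ex:dose", "100mg"),
    ("_:x", "ex:drug", "ex:aspirin"),
    ("_:x", "ex:patient", "ex:p1"),
    ("ex:p1", "ex:name", "Pat"),      # unrelated to the blank node
    ("ex:aspirin", "ex:class", "ex:NSAID"),
}
label1 = simple_skolem_label(graph1, "_:m")
label2 = simple_skolem_label(graph2, "_:x")
```

The two labels come out identical even though the blank nodes have
different local names and graph2 carries extra triples, because only the
triples touching the blank node feed the hash.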

David Booth

Received on Tuesday, 4 December 2018 20:31:16 UTC