Re: Pragmatics of Blank Nodes Re: Toward easier RDF: a proposal from David Booth on 2018-12-04 (semantic-web@w3.org from December 2018)

From: David Booth <david@dbooth.org>
Date: Tue, 4 Dec 2018 15:52:19 -0500
To: semantic-web <semantic-web@w3.org>, Aiden Hogan <aidhog@gmail.com>
Cc: Henry Story <henry.story@bblfish.net>
Message-ID: <33f7d370-84a8-deed-ecb4-8a293357acd6@dbooth.org>

On 12/4/18 3:30 PM, David Booth wrote:
> On 12/3/18 8:29 AM, Henry Story wrote:
>  > . . .  So what are the advantages of blank nodes
>  > pragmatically? They make a description local to the graph
>  > in which they appear and this locality is maintained
>  > across merges. The meaning of URI referenced resources can
>  > be completed by external information of course but the
>  > description ensures that no further links need to be taken
>  > into account when understanding the bnode's meaning. So it
>  > looks like it's ideal for things that need to be entirely
>  > defined by description.
> 
> Interesting point!   That means that blank nodes enjoy a
> form of closed world assumption (CWA), in that there *cannot*
> be any other triples asserted (directly) about a blank node,
> other than the ones already in the document/graph/dataset
> at hand.  (Inference could add some though.)
> 
> Of course, if we are dealing with implicit blank nodes -- the ones 
> generated by [] or () notation in Turtle -- then it's even more obvious 
> that the only property connections to/from that blank node are the ones 
> provided right there.
> 
> This brings me to an interesting question.  To rephrase, the "identity"
> of a blank node object is determined entirely by the identities of the
> nodes it is connected to, because it is guaranteed not to have any other
> connections.  Therefore, a blank node labeling algorithm (or standard
> Skolemization algorithm) only needs to take into account the subgraph of
> that blank node's tightly connected neighbors.  By "tightly connected" I
> mean the subgraph that is connected only through consecutive blank
> nodes.  (I think this may be slightly different from the Concise Bounded
> Description (CBD), because the CBD starts only with the *subject* of a
> triple.)
> https://www.w3.org/Submission/CBD/
> 
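To make "tightly connected" concrete, here is a minimal sketch in Python.
It is only an illustration under my own assumptions: triples are plain
3-tuples of strings, blank nodes are the terms starting with "_:", and the
names is_bnode and tightly_connected and the ex: terms are made up for
the example.

```python
from collections import deque

def is_bnode(term):
    # Assumption for this sketch: blank nodes are written "_:label".
    return term.startswith("_:")

def tightly_connected(triples, start):
    """Return the triples reachable from blank node `start` when the
    walk may only continue through blank nodes (in either direction)."""
    seen = {start}
    queue = deque([start])
    result = set()
    while queue:
        node = queue.popleft()
        for s, p, o in triples:
            if node in (s, o):
                result.add((s, p, o))
                # Only blank nodes extend the walk; IRIs and literals
                # are kept as boundary terms but not expanded.
                for other in (s, o):
                    if is_bnode(other) and other not in seen:
                        seen.add(other)
                        queue.append(other)
    return result

graph = {
    ("ex:alice", "ex:address", "_:a"),
    ("_:a", "ex:city", "ex:Boston"),
    ("_:a", "ex:geo", "_:g"),
    ("_:g", "ex:lat", '"42.36"'),
    ("ex:Boston", "ex:country", "ex:USA"),
}
sub = tightly_connected(graph, "_:g")
```

Starting from _:g, the walk picks up the four triples that touch _:g or
_:a and stops at ex:alice and ex:Boston, so the ex:country triple is left
out.  A CBD starting from _:g would follow only triples that have _:g as
subject, which is the difference mentioned above.
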
> Aiden (or someone else), is this correct?  If so, this would be very 
> beneficial, because the labeling algorithm could then be guaranteed to 
> generate the *same* label (or Skolem URI) for the blank nodes in that 
> subgraph, regardless of any larger graph in which that subgraph appears.
> This is very pertinent to n-ary relations, because it means that blank
> nodes for the same n-ary relation, appearing in different RDF graphs, 
> could be automatically given the *same* label (or Skolem URI) -- even 
> without knowing a key for that object.  Aiden, is this what such 
> canonicalization algorithms already do?
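
As a rough illustration of the property being asked about, the sketch
below derives a label for a blank node from its tightly connected
subgraph alone.  It is not the algorithm of any existing canonicalizer:
it is a simple hash refinement with no tie-breaking, so blank nodes in
symmetric positions would receive the same label, and a real algorithm
needs extra steps for that case.  The tuple representation, the "_:"
prefix test, the function names and the ex: terms are all assumptions
made for the example.

```python
import hashlib

def is_bnode(term):
    # Assumption for this sketch: blank nodes are written "_:label".
    return term.startswith("_:")

def component(triples, start):
    """Blank nodes reachable from `start` through blank nodes only,
    plus every triple that touches one of them."""
    bnodes, frontier, sub = {start}, [start], set()
    while frontier:
        node = frontier.pop()
        for t in triples:
            s, _, o = t
            if node in (s, o):
                sub.add(t)
                for other in (s, o):
                    if is_bnode(other) and other not in bnodes:
                        bnodes.add(other)
                        frontier.append(other)
    return bnodes, sub

def stable_label(triples, start):
    """Hash-based label that depends only on the component of `start`."""
    bnodes, sub = component(triples, start)
    colour = {b: "" for b in bnodes}  # every blank node starts out equal

    def ref(term):
        # Blank nodes contribute their current hash, other terms themselves.
        return "b:" + colour[term] if term in colour else "t:" + term

    # One refinement round per blank node in the component.
    for _ in range(len(bnodes)):
        new = {}
        for b in bnodes:
            edges = []
            for s, p, o in sub:
                if s == b:
                    edges.append("out|" + p + "|" + ref(o))
                if o == b:
                    edges.append("in|" + p + "|" + ref(s))
            text = "\n".join(sorted(edges))
            new[b] = hashlib.sha256(text.encode("utf-8")).hexdigest()
        colour = new
    return "_:h" + colour[start][:16]

# The same n-ary relation in two graphs, with different local labels
# and different surrounding triples.
g1 = {
    ("ex:alice", "ex:purchase", "_:x"),
    ("_:x", "ex:item", "ex:book42"),
    ("_:x", "ex:seller", "ex:shop1"),
}
g2 = {
    ("ex:alice", "ex:purchase", "_:b7"),
    ("_:b7", "ex:item", "ex:book42"),
    ("_:b7", "ex:seller", "ex:shop1"),
    ("ex:shop1", "ex:locatedIn", "ex:Boston"),
    ("ex:bob", "ex:knows", "_:z"),
    ("_:z", "ex:name", '"Zed"'),
}
same = stable_label(g1, "_:x") == stable_label(g2, "_:b7")
```

Here `same` comes out true, because the extra triples in g2 never touch
the blank node's component.  Note that the incoming ex:purchase triple is
part of the component, so a graph that omitted it would yield a different
label.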

P.S. This would also be very beneficial for the "diff" use case of RDF
canonicalization, because it would help localize graph labeling
differences to the subgraphs that actually changed.

David Booth



Received on Tuesday, 4 December 2018 20:52:41 UTC