Re: a rant on blank nodes

Fred Zemke wrote:
> 
> I think that our presentation of SPARQL will be simplified
> if we recognize that a blank node identifier such as _:x in
> a SPARQL query is not a blank node, it is a token of a query language.

Exactly.  _:x is merely the serialization.  Whenever it is parsed, the parser 
has to generate a fresh blank node for all occurrences of _:x in a BGP unless 
it knows it is the same as another blank node.  And there is no way of 
knowing that at parsing time.
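
A minimal sketch of that parsing rule, in the same C-family style as the 
snippet further down.  The names BlankNode, BlankNodeAllocator and 
BgpLabelScope are hypothetical and not taken from any SPARQL implementation; 
the only behaviour assumed is that a label maps to one node within a BGP and 
that the allocator never hands out the same node twice.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a blank node: its identity is just a number.
struct BlankNode {
    std::size_t id;
    bool operator==(const BlankNode& other) const { return id == other.id; }
    bool operator!=(const BlankNode& other) const { return id != other.id; }
};

// Hypothetical allocator shared by one parse; each call yields a new id.
class BlankNodeAllocator {
public:
    BlankNode fresh() { return BlankNode{next_++}; }

private:
    std::size_t next_ = 0;
};

// Label table for a single basic graph pattern (BGP).
class BgpLabelScope {
public:
    explicit BgpLabelScope(BlankNodeAllocator& alloc) : alloc_(alloc) {}

    // The first use of a label mints a fresh node; later uses of the same
    // label inside this BGP return that same node.
    BlankNode resolve(const std::string& label) {
        auto it = labels_.find(label);
        if (it != labels_.end()) {
            return it->second;
        }
        BlankNode node = alloc_.fresh();
        labels_.emplace(label, node);
        return node;
    }

private:
    BlankNodeAllocator& alloc_;
    std::unordered_map<std::string, BlankNode> labels_;
};
```

With one BgpLabelScope per BGP and a single shared allocator, resolve("x") 
gives the same node twice within a scope and a different node in any other 
scope.  Nothing here can tell whether the label was meant to name a node 
that already exists in the graph.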

> For example, in most high level languages, I can write 12
> and it represents the number twelve.  However, I never believe that the 
> characters
> 12 are twelve.  So why then should I believe that the characters _:x
> are a blank node?  Such tokens may represent blank nodes, but
> they are not blank nodes.

12 has a universal nature: it denotes the same number wherever it is written.  
A variable named "x" has no such global scope:

int x = 5 ;

if ( true )
  { int x = 3 ; }

Are they the same "x"? No, they aren't.


> 
> If then _:x in a query merely represents a blank node, then
> which blank node shall it represent?  As language designers,
> this is our option to specify.  I think we should say that it
> represents a blank node that is distinct from all blank nodes
> in the dataset.
> 
> With this convention, I think we avoid all the circumlocutions
> about renaming blank nodes.  For example, let B be the mapping
> of blank node tokens to blank nodes and let S be a mapping
> of variables to terms.  Then the current definition of pattern matching
> using entailment simplifies to
> 
> S is a solution of BGP in graph G
> iff
> G entails (G union S(B(BGP)))
> 
> and there is no worry about colliding blank nodes because by hypothesis
> B does not map to any blank node in G.

True - it's only when the parser "picks" a blank node that it might clash.  If 
we assume the parser mints a fresh one each time, it can never be the same as 
any blank node in the graph.

The language we have is a nod to "told blank nodes" where the blank node in 
the query is exactly the same as one in the graph.

Several systems are using <_:abc> for such told blank nodes.

Which looks remarkably like a condition on the skolemization procedure in 
option 3.  Or a condition on the bnode map b() in option 2.
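
Going back to the told blank nodes above, here is a rough sketch of how a 
system might treat the <_:abc> form differently from a plain _:abc label.  
It assumes the store keeps a table from its own labels to the nodes already 
in the graph; GraphLabels and resolveToldBlankNode are hypothetical names, 
and real systems may well do this differently.  It needs C++17 for 
std::optional.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical store-side table: label -> id of a blank node in the graph.
using GraphLabels = std::unordered_map<std::string, std::size_t>;

// Returns the graph's own node id when `token` is written as <_:label> and
// the store knows that label.  Returns std::nullopt for anything else,
// including a plain _:label, which would get a fresh node instead.
std::optional<std::size_t> resolveToldBlankNode(const std::string& token,
                                                const GraphLabels& graph) {
    const std::string prefix = "<_:";
    // The shortest valid token is "<_:a>": prefix, one character, then '>'.
    if (token.size() <= prefix.size() + 1) {
        return std::nullopt;
    }
    if (token.compare(0, prefix.size(), prefix) != 0 || token.back() != '>') {
        return std::nullopt;
    }
    const std::string label =
        token.substr(prefix.size(), token.size() - prefix.size() - 1);
    auto it = graph.find(label);
    if (it == graph.end()) {
        return std::nullopt;
    }
    return it->second;
}
```

The point of the separate syntax is that the query author opts in to "this 
exact node in the graph", while _:abc keeps its meaning as a label local to 
the query.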

> Fred
> 

	Andy

Received on Tuesday, 14 November 2006 14:54:53 UTC