a rant on blank nodes

I think that our presentation of SPARQL will be simplified
if we recognize that a blank node identifier such as _:x in
a SPARQL query is not a blank node; it is a token of a query language.
For example, in most high-level languages, I can write 12
and it represents the number twelve.  However, I never believe that the
characters 12 are twelve.  So why then should I believe that the characters _:x
are a blank node?  Such tokens may represent blank nodes, but
they are not blank nodes.

If _:x in a query merely represents a blank node, then
which blank node shall it represent?  As language designers,
that choice is ours to specify.  I think we should say that it
represents a blank node that is distinct from all blank nodes
in the dataset.

With this convention, I think we avoid all the circumlocutions
about renaming blank nodes.  For example, let B be the mapping
of blank node tokens to blank nodes and let S be a mapping
of variables to terms.  Then the current definition of pattern matching
using entailment simplifies to

S is a solution of BGP in graph G
iff
G entails (G union S(B(BGP)))

and there is no worry about colliding blank nodes because, by hypothesis,
B does not map any token to a blank node in G.
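
To make the definition concrete, here is a rough Python sketch of it.
Everything in it is an assumption made for illustration: triples are
plain tuples of strings, a leading "_:" marks a blank node label, a
leading "?" marks a variable, and the names fresh_bnode_map,
substitute, simply_entails and is_solution are made up.  Entailment is
taken to be simple entailment, tested by brute force (some instance of
the right-hand graph must be a subgraph of G), so it is only usable on
toy graphs.

```python
from itertools import product


def is_bnode(term):
    # Assumption: a leading "_:" marks a blank node label.
    return term.startswith("_:")


def fresh_bnode_map(bgp, graph):
    """B: map each blank node token in the pattern to a label unused in graph."""
    used = {t for triple in graph for t in triple if is_bnode(t)}
    mapping, counter = {}, 0
    for triple in sorted(bgp):
        for t in triple:
            if is_bnode(t) and t not in mapping:
                # Skip any label that already names a blank node in the graph.
                while f"_:fresh{counter}" in used:
                    counter += 1
                mapping[t] = f"_:fresh{counter}"
                counter += 1
    return mapping


def substitute(triples, mapping):
    """Apply a term mapping to every position of every triple."""
    return {tuple(mapping.get(t, t) for t in triple) for triple in triples}


def simply_entails(g, h):
    """True if some instance of h is a subgraph of g (brute force)."""
    bnodes = sorted({t for triple in h for t in triple if is_bnode(t)})
    terms = sorted({t for triple in g for t in triple})
    # Try every assignment of h's blank nodes to terms occurring in g.
    for choice in product(terms, repeat=len(bnodes)):
        if substitute(h, dict(zip(bnodes, choice))) <= g:
            return True
    return False


def is_solution(s, bgp, g):
    """S is a solution of BGP in G iff G entails (G union S(B(BGP)))."""
    b = fresh_bnode_map(bgp, g)
    return simply_entails(g, g | substitute(substitute(bgp, b), s))


G = {("_:x", "p", "c"), ("d", "q", "e")}
BGP = {("?s", "p", "_:x")}
print(fresh_bnode_map(BGP, G))
print(is_solution({"?s": "_:x"}, BGP, G))
print(is_solution({"?s": "d"}, BGP, G))
```

Note that the token _:x in the pattern and the label _:x in the graph
happen to be spelled the same, yet B sends the token to a different
blank node, so no renaming step is needed before matching.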

Fred

Received on Monday, 13 November 2006 22:43:41 UTC