Blank nodes must DIE! [ was Re: Blank nodes semantics - existential variables?] from David Booth on 2020-06-30 (semantic-web@w3.org from June 2020)

From: David Booth <david@dbooth.org>
Date: Tue, 30 Jun 2020 10:40:56 -0400
To: semantic-web@w3.org
Message-ID: <fd9015d4-accc-e6df-a927-949d5d650ea8@dbooth.org>

On 6/29/20 7:33 PM, Aidan Hogan wrote:
> For what it is worth, we started working on the topic of blank nodes 
> some time ago similarly convinced of the fact that the RDF semantics of 
> blank nodes was unintuitive, and that a better semantics could be found. 
> A couple of papers and several years later, I was/am more or less 
> convinced that the semantics of blank nodes is as it should be in RDF.

While I appreciate the very thorough technical analysis that Aidan has 
done, and I don't exactly disagree with his technical conclusion, after 
years of consideration I've come to look at the problem differently and 
have reached a different conclusion: we should not be dealing with blank 
nodes AT ALL.  Blank nodes should be ELIMINATED from the user 
experience.  We need to move to a higher-level representation that does 
not have blank node labels, so that users never need to think about them 
or be baffled at the semantic subtleties that have dogged these 
discussions for so long.  Blank nodes should exist ONLY in the 
underlying machinery that users NEVER need to touch or see.

In practical terms, this means adopting a new, higher level RDF-based 
syntax that allows RDF tooling to be reused as much as possible.

A minimal candidate would be Turtle/TriG without blank node labels, but 
if we are contemplating a new syntax then I personally think it would be 
worth making a few more changes at the same time, to make it even higher 
level and easier to use.  A number of ideas have been collected here, 
though somewhat haphazardly:
https://github.com/w3c/EasierRDF/issues
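
To make the "Turtle without blank node labels" idea concrete, here is a 
rough sketch (not a proposal, and not a real Turtle parser).  It shows the 
same data written with a blank node label and with the label-free [ ] 
syntax, plus a naive check that a tool could use to reject labeled blank 
nodes.  The ex: vocabulary, the function names, and the regex-based 
approach are all assumptions made for illustration only.

```python
import re

# Same data written two ways. The ex: vocabulary is hypothetical.
WITH_LABELS = """
@prefix ex: <http://example.org/> .
ex:alice ex:address _:b1 .
_:b1 ex:city "Boston" .
"""

WITHOUT_LABELS = """
@prefix ex: <http://example.org/> .
ex:alice ex:address [ ex:city "Boston" ] .
"""

# Naive patterns: this is a sketch, not a conforming Turtle tokenizer.
# It ignores comments and can be fooled by unusual long-string content.
STRING = re.compile(r'"(?:[^"\\]|\\.)*"')
IRI = re.compile(r'<[^<>\s]*>')
BNODE_LABEL = re.compile(r'(?<![\w:])_:[A-Za-z0-9_]')


def strip_literals_and_iris(turtle: str) -> str:
    """Blank out quoted strings and <IRIs> so "_:" inside them is ignored."""
    without_strings = STRING.sub(" ", turtle)
    return IRI.sub(" ", without_strings)


def uses_blank_node_labels(turtle: str) -> bool:
    """Return True if the text appears to contain a labeled blank node."""
    return BNODE_LABEL.search(strip_literals_and_iris(turtle)) is not None


if __name__ == "__main__":
    print(uses_blank_node_labels(WITH_LABELS))
    print(uses_blank_node_labels(WITHOUT_LABELS))
```

Note that the label-free form cannot express everything the labeled form 
can: without a label, a blank node cannot be the object of two different 
triples, so a restriction like this changes what graphs users can write, 
not just how they write them.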

But note that a new RDF-based syntax is only one part of the entire tool 
chain.  A SPARQL successor would also be needed, to support the new 
features and restrictions, and libraries would have to support them also.

I REALLY wish that some PhD students would take on this challenge: to 
design a higher-level successor to RDF, with a top-line goal of making 
it easy enough for AVERAGE developers (middle 33% of skill), who are new 
to it, to be consistently successful.  Note to such PhD students/researchers: 
pay particular attention to Sean Palmer's insightful comments also:
https://github.com/w3c/EasierRDF/issues/68

IMO blank nodes have been a significant factor in pushing RDF over the 
cognitive complexity threshold that average developers are willing to 
tolerate.  Given how rapidly other easier-to-use graph databases have 
become popular and have far overtaken RDF in market share, I think it is 
URGENT that we address the problem of making RDF easier for AVERAGE 
developers:
https://db-engines.com/en/ranking/graph+dbms

David Booth

Received on Tuesday, 30 June 2020 14:41:09 UTC