Re: Blank Nodes Re: Toward easier RDF: a proposal from Wouter Beek on 2018-11-23 (semantic-web@w3.org from November 2018)

From: Wouter Beek <wouter@triply.cc>
Date: Fri, 23 Nov 2018 09:22:18 +0100
To: tpassin@tompassin.net
Cc: SW-forum Web <semantic-web@w3.org>
Message-ID: <CAEh2WcO0zr2=LD=YsW39VS3a_PNfkOK9WC0EYipmt2p6HiC1dw@mail.gmail.com>

Hi,

Blank nodes add significant complexity to the Semantic Web ecosystem.
Some concrete examples:

  - Whenever two or more RDF sources are combined, blank nodes must be
standardized apart.  Since Linked Data is all about combining data
from different sources, this means that most Linked Data operations
become more complex when blank nodes are taken into account.  It does
not help that standardizing apart is computationally expensive.

  - Blank node labels in SPARQL result sets are scoped to a single
result document, and triple stores enforce result set limits (e.g.,
10K rows).  Results that have to be retrieved in several pages
therefore cannot be guaranteed to be correct.  E.g., it is unclear
whether or not the subject term in row 10,000 is identical to the
subject term in row 10,001:

    10,000 _:x a foaf:Person.
    --------------------
    10,001 _:x foaf:name "John".

  - Blank nodes make it more difficult to determine whether two RDF
documents are the same, which complicates versioning and caching.

  - Blank nodes make Semantic Web standards -- and implementations
thereof -- more complex.  The parts of standards that deal with blank
nodes are often the hardest to understand.  E.g., in RDF 1.1
Semantics, the distinction between graph merge and graph union and the
operation of graph leaning would not exist if there were no blank
nodes.  In SPARQL 1.1, the RDF instance mapping would not be needed.
Etc.
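
To make the first point concrete, here is a minimal Python sketch of
the difference between a naive union and a merge that standardizes
blank nodes apart.  The representation (triples as plain tuples of
strings, blank node labels starting with "_:") and the helper names
is_bnode, standardize_apart, union and merge are assumptions made for
illustration, not the API of any RDF library.

```python
from itertools import count

# Assumption for illustration: a graph is a set of (s, p, o) string tuples,
# and any term starting with "_:" is a blank node label.
_fresh = count()


def is_bnode(term):
    return term.startswith("_:")


def standardize_apart(graph):
    """Return a copy of `graph` in which every blank node label is fresh."""
    mapping = {}

    def rename(term):
        if not is_bnode(term):
            return term
        if term not in mapping:
            mapping[term] = f"_:b{next(_fresh)}"
        return mapping[term]

    return {(rename(s), p, rename(o)) for (s, p, o) in graph}


def union(g1, g2):
    # Naive set union: equal labels from different sources collapse.
    return g1 | g2


def merge(g1, g2):
    # Rename the blank nodes of both sources before combining them.
    return standardize_apart(g1) | standardize_apart(g2)


g1 = {("_:x", "rdf:type", "foaf:Person")}
g2 = {("_:x", "foaf:name", '"John"')}

naive = union(g1, g2)
merged = merge(g1, g2)
```

In the naive union the two `_:x` labels collapse into one node, so the
combined data would claim that the person from the first source is
named John.  After standardizing apart, the two nodes stay distinct.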

Would it not be possible to keep the benefits of abbreviated N3
notation while at the same time doing away with blank nodes?  E.g., by
automatically introducing well-known IRIs instead.
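
RDF 1.1 Concepts already describes one such replacement: Skolem IRIs
minted under the "/.well-known/genid/" path.  A rough sketch of the
idea, assuming triples are plain tuples of strings with blank node
labels starting with "_:"; the example.org authority and the skolemize
helper are hypothetical:

```python
import uuid

# Hypothetical authority; a real publisher would use a domain it controls.
GENID_BASE = "https://example.org/.well-known/genid/"


def skolemize(graph, base=GENID_BASE):
    """Replace each blank node label in `graph` with a freshly minted IRI."""
    mapping = {}

    def replace(term):
        if not term.startswith("_:"):
            return term
        if term not in mapping:
            # One IRI per label, so co-references inside the graph survive.
            mapping[term] = f"<{base}{uuid.uuid4().hex}>"
        return mapping[term]

    return {(replace(s), p, replace(o)) for (s, p, o) in graph}


doc = {
    ("ex:wouter", "ex:address", "_:addr"),
    ("_:addr", "ex:city", '"Amsterdam"'),
}
skolemized = skolemize(doc)
```

The resulting graph contains only IRIs and literals, so it can be
combined with other sources by plain set union.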

---
Best regards,
Wouter Beek.

Email: w.g.j.beek@vu.nl
WWW: https://wouterbeek.org
Tel: +31647674624

On Fri, Nov 23, 2018 at 4:22 AM Thomas Passin <tpassin@tompassin.net> wrote:
>
> On 11/22/2018 6:49 PM, David Booth wrote:
> > Uh . . . I don't think that is quite correct.  As I understand, a blank
> > node does *not* represent *a* thing.  Rather, it asserts that there
> > *exists* a thing, as explained in the RDF Semantics:
> > https://www.w3.org/TR/rdf11-mt/#blank-nodes
> > In contrast, an IRI represents *a* thing.  I'm sorry to be pedantic
> > here, but I mention it because it underscores my point: the semantics of
> > blank nodes really *are* subtle -- at least to *average* developers.
>
> Again, blank nodes are exactly analogous to a table with no primary key.
> You can identify the thing by the union of its properties ... until
> there is another thing with the same set of properties.  Then you would
> need to have another property to distinguish the two, which property you
> might or might not know.  You can't have a foreign key, but you can
> still have a WHERE statement that specifies all the properties that
> could distinguish the data object.
>
> And just as with relational databases, the no-primary-key model can only
> get you so far.  But it can be an easy way to get a data set going...
>
>

Received on Friday, 23 November 2018 08:23:17 UTC