reconsidering: blank nodes as named-graph labels from Sandro Hawke on 2013-05-10 (public-rdf-wg@w3.org from May 2013)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 10 May 2013 16:49:27 -0400
To: W3C RDF WG <public-rdf-wg@w3.org>
Message-ID: <518D5D57.1090601@w3.org>
When we decided to disallow blank nodes as graph labels in datasets, I 
don't think we considered all of the problems this would cause. I've 
come across one that's serious enough to make me think we should 
probably reverse this decision.   (And of course JSON-LD had another, 
which I confess I never quite understood.)

Details:

One of the deliverables in the LDP Working Group charter is a 
specification for a format to use with the HTTP PATCH 
(modify-in-place) operation on resources whose state is an RDF graph.   
Something like Talis Changesets or a SPARQL 1.1 UPDATE instruction.

At the last LDP F2F we talked about it and the group was overwhelmingly 
in favor of a dataset-based design.  They're very happy with the idea of 
patches that look something like this:

    prefix ldp: <http://www.w3.org/ns/ldp#>
    # ... application data prefixes ...

    [] a ldp:Patch;
        ldp:delete <#d1>;
        ldp:insert <#i1>.

    <#d1> { ... triples to delete ... }
    <#i1> {  ... triples to add ... }


So I've been working out the details for how to do that, and mostly I 
think it'll work great.

One stumbling block, however, is the use of relative URIs, as in the 
example above.  In fact you have to generate a UUID to make a patch, 
because there's no base.  Or if there is a base, it's a base that's 
shared with all the other clients, so to avoid a collision, you need to 
make a UUID.

So it's really:


    prefix ldp: <http://www.w3.org/ns/ldp#>
    # ... application data prefixes ...
    # MUST BE A NEW UUID FOR EACH PATCH
    prefix my: <urn:uuid:bdd0bf66-b9ad-11e2-8a86-00216a3e966a:>

    [] a ldp:Patch;
        ldp:delete my:d1;
        ldp:insert my:i1.

    my:d1 { ... triples to delete ... }
    my:i1 { ... triples to add ... }

I suppose one could live with that, but it's pretty painful, especially 
when we don't make users do that for anything else they're labeling in 
their RDF.  Do we ask people to make a UUID any time they want an n-ary 
relation?    (No.)
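
As a rough illustration of the extra step this puts on every client, 
here is a minimal Python sketch that mints a fresh prefix per patch and 
renders the patch text.  The function name make_patch, the use of a 
random (version 4) UUID, and the plain-string triples are assumptions 
made for the example, not anything LDP has specified.

```python
import uuid


def make_patch(delete_triples, insert_triples):
    """Render a patch whose graph labels sit under a fresh UUID prefix.

    delete_triples / insert_triples: lists of triples already written
    as TriG text (hypothetical input shape, chosen for brevity).
    """
    # Assumption: a random version 4 UUID is good enough to avoid
    # collisions between clients that share the same base.
    prefix_iri = f"urn:uuid:{uuid.uuid4()}:"
    lines = [
        "prefix ldp: <http://www.w3.org/ns/ldp#>",
        f"prefix my: <{prefix_iri}>",
        "",
        "[] a ldp:Patch;",
        "    ldp:delete my:d1;",
        "    ldp:insert my:i1.",
        "",
        "my:d1 { " + " ".join(delete_triples) + " }",
        "my:i1 { " + " ".join(insert_triples) + " }",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(make_patch(["<a> <b> <c>."], ["<a> <b> <d>."]))
```

Every call yields a different prefix, which is exactly the bookkeeping 
that blank-node labels would make unnecessary.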

(Note that these patches are often sent from things that can't make good 
URIs, like browsers.  It's possible to make a mostly-passable UUID in a 
browser, but it's not pretty.)

It seems so much more elegant to just allow:

    [] a ldp:Patch;
        ldp:delete _:d1;
        ldp:insert _:i1.

    _:d1 { ... triples to delete ... }
    _:i1 { ... triples to add ... }

Right...?

And just to go one step too far  ;-)  I'll suggest we should also offer 
some syntactic sugar for this in TriG.    Just like we allow

    a b _:x
    _:x c d

to be written as:   a b [ c d ]

... we could do the same with named-graph blank nodes and allow the 
patch to be written

    [] a ldp:Patch;
        ldp:delete { ... triples to delete ... };
        ldp:insert { ... triples to add ... }.


I know that looks kind of like N3, but it's really just very simple 
syntactic sugar that could go into TriG.
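
To show how mechanical the expansion is, here is a small Python sketch 
that turns the sugared form into the blank-node-labelled form.  The 
function name desugar, the generated labels _:g1, _:g2, ... and the 
string-based input are all assumptions for illustration; this is not a 
TriG parser, just the rewriting step.

```python
from itertools import count


def desugar(subject_line, graph_properties):
    """Expand inline graphs into blank-node-labelled named graphs.

    subject_line: the leading statement, e.g. "[] a ldp:Patch".
    graph_properties: non-empty list of (predicate, graph_body) pairs,
    where graph_body is the text that appeared between the braces.
    """
    labels = count(1)
    prop_lines, graph_lines = [], []
    for predicate, body in graph_properties:
        # One fresh blank node label per inline graph (hypothetical
        # naming scheme; any label unused in the document would do).
        label = f"_:g{next(labels)}"
        prop_lines.append(f"    {predicate} {label}")
        graph_lines.append(f"{label} {{ {body} }}")
    head = subject_line + ";\n" + ";\n".join(prop_lines) + "."
    return head + "\n\n" + "\n".join(graph_lines)


if __name__ == "__main__":
    print(desugar("[] a ldp:Patch", [
        ("ldp:delete", "<a> <b> <c>."),
        ("ldp:insert", "<a> <b> <d>."),
    ]))
```

The rewrite only needs a counter for fresh labels, much as the 
existing [ c d ] shorthand does for ordinary blank nodes.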

Thinking about why we decided against blank nodes, I believe the main 
reason was that the SPARQL spec says graph labels in datasets are IRIs.   
I think it's not a huge problem to live with two different kinds of 
datasets like this.   It would mean some compliant SPARQL systems can 
only handle SPARQL 1.1 datasets, not full RDF Datasets.    People who 
wanted to use blank node graph names in SPARQL 1.1 would have to either 
lobby to get that extension put into their favorite SPARQL system (some 
have it already), or they'd have to make do with Skolemization.   That's 
a bit painful, but the alternative is to require every client who wants 
this functionality (even non-SPARQL LDP ones) to Skolemize or 
pseudo-Skolemize with a UUID; that seems even more painful.
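
For comparison, here is a minimal Python sketch of the 
pseudo-Skolemization step that a SPARQL 1.1 system (or a bridge in 
front of one) might apply.  The tuple representation of quads, the 
function name skolemize_graph_labels, the use of None for the default 
graph, and the choice of urn:uuid: IRIs are assumptions for the 
example.

```python
import uuid


def skolemize_graph_labels(quads):
    """Replace blank nodes used as graph labels with fresh IRIs.

    quads: iterable of (subject, predicate, object, graph) tuples of
    strings; blank nodes are written "_:name" and the default graph
    is represented by None (hypothetical encoding).
    """
    quads = list(quads)
    # Only blank nodes that actually label a graph are replaced.
    labels = {g for (_, _, _, g) in quads
              if g is not None and g.startswith("_:")}
    mapping = {label: f"<urn:uuid:{uuid.uuid4()}>" for label in labels}

    def swap(term):
        return mapping.get(term, term)

    # Rewrite subject, object and graph positions so that references
    # such as "ldp:delete _:d1" still point at the renamed graph.
    return [(swap(s), p, swap(o), swap(g)) for (s, p, o, g) in quads]


if __name__ == "__main__":
    for quad in skolemize_graph_labels([
        ("_:p", "rdf:type", "ldp:Patch", None),
        ("_:p", "ldp:delete", "_:d1", None),
        ("<a>", "<b>", "<c>", "_:d1"),
    ]):
        print(quad)
```

The code itself is short; the cost is that every producer or gateway 
has to run something like it.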

Thoughts?

        -- Sandro
Received on Friday, 10 May 2013 20:49:41 UTC