Re: reconsidering: blank nodes as named-graph labels from Sandro Hawke on 2013-05-12 (public-rdf-wg@w3.org from May 2013)

From: Sandro Hawke <sandro@w3.org>
Date: Sun, 12 May 2013 13:51:49 -0400
To: Andy Seaborne <andy.seaborne@epimorphics.com>
CC: public-rdf-wg@w3.org
Message-ID: <518FD6B5.6080507@w3.org>
On 05/12/2013 10:23 AM, Andy Seaborne wrote:
> Sandro is exploring what it would take to use a form of TriG for LDP 
> PATCH.  Blank nodes for graphs is not the last, locking, issue on the 
> design.
>

Indeed.   It's possible LDP-WG won't use this kind of design at all. When we 
discussed it, though, Eric argued for a SPARQL-Subset version (like my 
TurtlePatch Proposal [0]) and I did my best to explain your TriG-style 
proposal at the whiteboard, and the recorded vote was 11 in favor of 
going in the direction of your proposal, no objections, no abstentions 
[1].  So I think it's likely the group will use a TriG style approach, 
*if* it ends up doing patch.  It may well decide to leave patch to 
someone else -- there are some tricky details.

Anyway, regardless of what LDP does, in thinking about it, I realized it 
introduced a consideration that we hadn't weighed earlier and hence 
seems to me to be reasonable cause to re-open the issue.   Am I right in 
assuming you'd prefer blank nodes be disallowed as graph names?   Is 
your sense that SPARQL engines should disallow them?   (I understand 
that SPARQL 1.1 neither requires nor disallows them.)   Do you think 
there are SPARQL systems that would never support them, even if users 
started to ask for them, because, e.g., it causes some serious 
implementation problem?
>
> Sandro - I agree that the danger of name clash is real in using TriG. 
> Here is a different take on the problem.
>
> The base target of an update is considered to be a derived resource 
> and not the LDP-R itself.
>
> c.f. POST to an LDP-C where the base is the new resource from the 
> container.
>
> The etag could be included in the notional base -- this even helps the 
> system to refer to the update that caused a change and an update can 
> refer to itself with <>.
>
> e.g. http://host/resource;etag=ABCD
>

Hm.   I don't think that helps, since there could be multiple patches from 
different clients all on the same revision of the resource (and hence 
the same etag).     There's a good chance the server will have to reject 
all but the first, but I wouldn't want there to be confusion among the 
different parts of the patches.
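
A minimal sketch of that clash, in Python, assuming (hypothetically) that 
each patch names its graphs with relative IRIs such as <#d1> and that the 
base is the etag-qualified IRI from the example above. The helper names and 
the UUID-based workaround are illustrative only, not anything LDP has 
specified.

```python
from urllib.parse import urljoin
import uuid

# Assumption: the notional base is the resource IRI qualified by its etag.
BASE = "http://host/resource;etag=ABCD"

def graph_name(base: str, relative_label: str) -> str:
    """Resolve a relative graph label (e.g. '#d1') against the patch base."""
    return urljoin(base, relative_label)

def unique_graph_name(base: str) -> str:
    """Hypothetical workaround: mint a per-patch label from a random UUID."""
    return urljoin(base, "#" + uuid.uuid4().hex)

# Two clients patch the same revision, so they share the same base.
client_a = graph_name(BASE, "#d1")
client_b = graph_name(BASE, "#d1")
clash = client_a == client_b  # both patches end up using the same graph name

# Labels minted per patch do not collide, at the cost of extra client work.
distinct = unique_graph_name(BASE) != unique_graph_name(BASE)
```

Blank node labels would avoid the clash without the minting step, since 
their scope is local to the document that carries them.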

       -- Sandro

[0] http://www.w3.org/2001/sw/wiki/TurtlePatch
[1] http://www.w3.org/2012/ldp/meeting/2013-03-15
>     Andy
>
> On 11/05/13 18:59, Pat Hayes wrote:
>> Well, I can't speak to all the SPARQL complications, but as far as 
>> the semantics are concerned, it is easy to fix. The use of a bnodeID 
>> as a graph label has the semantic specification that that bnode 
>> always denotes the graph it labels, in any satisfying interpretation. 
>> Which means in practice that these bnodes are more like IRIs than 
>> like other bnodes, semantically speaking, except that (of course) 
>> their scope would be local to the dataset. They are Local Resource 
>> Identifiers, LRIs rather than IRIs, if you like.
>>
>> Other comments inline below.
>>
>> Pat
>>
>> On May 11, 2013, at 10:51 AM, Andy Seaborne wrote:
>>
>>> LDP has not committed to use a TriG-ish format.  It's one 
>>> possibility and this particular variant has some issues, raised 
>>> before, that this proposal ignores.
>>>
>>> Why not use a restricted SPARQL update?
>>>
>>> DELETE DATA { .... }
>>> INSERT DATA { .... }
>>>
>>> A big practical advantage is that a mix of DELETE/INSERT can be done 
>>> and has the obvious meaning of applying in order given.
>>>
>>> The TriG design requires that all changes are known before the TriG 
>>> is written (client) or processed (server).  At scale, this can be a 
>>> large burden.
>>>
>>> A (DELETE|INSERT)* style can be created by recording changes as they 
>>> happen - so it scales both at the client and at the server.  The 
>>> declarative nature of the TriG design is a practical disadvantage here.
>>>
>>> Restricted SPARQL update opens to more of SPARQL if the LDP-WG or an 
>>> LDP-engine so chooses.  Useful ones being CLEAR and the shorthand 
>>> form DELETE WHERE { }.
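
The in-order semantics described above can be sketched in Python. This is 
an illustration only: triples are modelled as plain tuples, the operation 
names mirror DELETE DATA / INSERT DATA, and `apply_patch` is a hypothetical 
helper, not part of any SPARQL engine.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Op = Tuple[str, Set[Triple]]  # ("delete" | "insert", triples)

def apply_patch(graph: Set[Triple], ops: Iterable[Op]) -> Set[Triple]:
    """Apply DELETE DATA / INSERT DATA style operations in the order given."""
    result = set(graph)
    for kind, triples in ops:  # ops may be a stream; no lookahead is needed
        if kind == "delete":
            result -= triples
        elif kind == "insert":
            result |= triples
        else:
            raise ValueError(f"unknown operation: {kind}")
    return result

t = ("ex:s", "ex:p", "ex:o")
# Order matters: insert-then-delete removes t, delete-then-insert keeps it.
gone = apply_patch(set(), [("insert", {t}), ("delete", {t})])
kept = apply_patch(set(), [("delete", {t}), ("insert", {t})])
```

Because each operation is applied as it arrives, a client can emit 
operations as changes happen and a server can process them one at a time.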
>>>
>>> On 11/05/13 04:08, Pat Hayes wrote:
>>>> I entirely agree.  I also note that SPARQL engines can surely just
>>>> treat the bnodeIDs as if they were skolemized IRIs, and nothing would
>>>> break. All that matters in either case is the ability to check
>>>> identity of identifiers.
>>>
>>> Won't such systems have to skolemize all bNodes - a bnode can 
>>> be used in the graph data as well as be used for a graph label.
>>
>> Yes, but my point is that there is no need to actually generate the 
>> skolem IRIs. (Seriously: why would a processor care whether an 
>> identifier starts with "_" or not? )
>>
>>>
>>> And have to do it all the time because an incoming document could 
>>> have the manifest last (seems really quite sensible after all the 
>>> data is known about):
>>>
>>> -----------------------------------------------------
>>> @prefix ldp: <http://www.w3.org/ns/ldp#> .
>>>
>>> <#i1> {  ... triples to add ... uses _:b0 ... }
>>>
>>> _:b0 { ... triples to delete ... }
>>>
>>> <#i1> {  ... more triples to add ... }
>>>
>>> _:b0 { ... more triples to delete ... but before any inserts ... }
>>>
>>> {
>>> [] a ldp:Patch;
>>>    ldp:delete _:b0;
>>>    ldp:insert <#i1>.
>>> }
>>> -----------------------------------------------------
>>>
>>> so it does not know which are deletes and which are inserts until 
>>> the end nor whether skolemization is necessary so it has to do it 
>>> all the time.
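
To make the buffering concern concrete, here is a rough Python sketch. It 
assumes the TriG document has already been parsed into a sequence of 
(label, triples) blocks, with `None` marking the default graph that holds 
the manifest. The `ldp:delete` / `ldp:insert` property names are taken from 
the example above, and `resolve_patch` is a hypothetical helper. Labels are 
compared only as opaque strings, so `_:b0` and `<#i1>` are handled 
identically.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Block = Tuple[Optional[str], Set[Triple]]  # label None = default graph

def resolve_patch(blocks: List[Block]) -> Tuple[Set[Triple], Set[Triple]]:
    """Return (deletes, inserts); roles are known only once the manifest is read."""
    by_label: Dict[Optional[str], Set[Triple]] = defaultdict(set)
    for label, triples in blocks:
        # Every block must be buffered: its role is still unknown here.
        by_label[label] |= triples
    deletes: Set[Triple] = set()
    inserts: Set[Triple] = set()
    for _subject, predicate, obj in by_label[None]:
        if predicate == "ldp:delete":
            deletes |= by_label[obj]
        elif predicate == "ldp:insert":
            inserts |= by_label[obj]
    return deletes, inserts

# Blocks in the same order as the example: same-label blocks interleaved,
# manifest last.
blocks: List[Block] = [
    ("<#i1>", {("ex:a", "ex:p", "1")}),
    ("_:b0", {("ex:a", "ex:p", "0")}),
    ("<#i1>", {("ex:b", "ex:p", "2")}),
    (None, {("_:patch", "rdf:type", "ldp:Patch"),
            ("_:patch", "ldp:delete", "_:b0"),
            ("_:patch", "ldp:insert", "<#i1>")}),
]
deletes, inserts = resolve_patch(blocks)
```

Nothing can be applied until the last block has been read, which is the 
scaling cost being described; the choice of blank node versus IRI labels 
does not change that.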
>>
>> I don't really follow this, but I can't see why having bnodes would 
>> be any different than doing the same thing using IRI labels. (Except 
>> of course that there would actually be a semantics behind it, 
>> whereas with our current WG decisions, using an IRI inside one graph 
>> and also as a graph label means that the two uses are semantically 
>> unrelated and have nothing to do with one another.)
>>
>>> If there are additional syntactic restrictions on the TriG (e.g. 
>>> exactly two graph blocks, manifest first) then it's not helpful to 
>>> use TriG.
>>>
>>>>> At the last LDP F2F we talked about it and the group was
>>>>> overwhelmingly in favor of a dataset-based design. They're very
>>>>> happy with the idea of patches that look something like this:
>>>>>
>>>>> prefix ldp: <http://www.w3.org/ns/ldp#>
>>>>> # ... application data prefixes ...
>>>>>
>>>>> [] a ldp:Patch
>>>>>     ldp:delete <#d1>;
>>>>>     ldp:insert <#i1>.
>>>
>>> This is not valid TriG.
>>>
>>>>>
>>>>> <#d1> { ... triples to delete ... }
>>>>> <#i1> {  ... triples to add ... }
>>>>>
>>>>> So I've been working out the details for how to do that, and mostly
>>>>> I think it'll work great.
>>>
>>>
>>>>> Thinking about why we decided against blank nodes, the main thing I
>>>>> believe was the SPARQL spec says that in datasets the labels are
>>>>> IRIs.   I think it's not a huge problem to live with two different
>>>>> kinds of datasets like this.
>>>>> It would mean some compliant SPARQL
>>>>> systems can only handle SPARQL 1.1 datasets, not full RDF
>>>>> Datasets.    People who wanted to use blank node graph names in
>>>>> SPARQL 1.1 would have to either lobby to get that extension put into
>>>>> their favorite SPARQL system (some have it already),
>>>
>>> Which ones?
>>>
>>>>> or they'd have
>>>>> to make do with Skolemization.   That's a bit painful, but the
>>>>> alternative is to require every client who wants this functionality
>>>>> (even non-SPARQL LDP ones) to Skolemize or pseudo-Skolemize with a
>>>>> UUID; that seems even more painful.
>>>
>>> As has been pointed out, some systems do specific optimisations 
>>> knowing that a position can only be a URI (Jena does not; 4Store & 
>>> 5Store were mentioned).
>>
>> The actual syntactic difference between IRIs and bnodeIDs seems 
>> almost trivial. How hard would it be to adapt code used for the 
>> former to include both?
>>
>> But in any case, this kind of argument is a dead hand on any changes. 
>> It's bad enough when someone says, our implementation requires that 
>> you don't change X, but when they say, our *optimization* requires 
>> that you don't change X, it's time to push back.
>>
>>> You seem to have skipped that bit and other concerns.
>>>
>>> As I recall it, not allowing bNodes also means we don't have to face 
>>> impossible (future) formal semantics, and it's that area that means 
>>> the safer, restricted choice.
>>
>> There are no formal semantic problems. Allowing bnodeIDs as graph 
>> labels does not change the current semantics of RDF, it just adds one 
>> new constraint. It's a semantic extension, just like RDFS or 
>> D-entailment.
>>
>> Pat
>>
>>>
>>>     Andy
>>>
>>>
>>
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 or (650)494 3973
>> 40 South Alcaniz St.           (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>>
>>
>>
>>
>>
>>
>
>
Received on Sunday, 12 May 2013 17:52:00 UTC