Re: Fwd: Re: deterministic naming of blank nodes from Andy Seaborne on 2015-05-21 (public-sparql-dev@w3.org from April to June 2015)

From: Andy Seaborne <andy@apache.org>
Date: Thu, 21 May 2015 09:38:38 +0100
To: public-sparql-dev@w3.org
Message-ID: <555D998E.6040104@apache.org>
Hi David,

Something like that can be made to work.  A few things need sorting out ... :-)

I view the cases of (), and to some extent [], as quite important.
Dealing with bNodes is pragmatically unavoidable for lists.

"Original" labels (if that means the text the client app writes) don't
quite work - 2 files loaded into the same dataset both using "_:a" need
to be different bnodes.  The server has to sort that out, but it can, and
it has to for () and [] anyway.

I prefer the term "system id" for the internal identifier for the bnode. 
Using "label" for this is part of why it gets confusing. "label in doc 
scope" and "system id".

Every system has some such concept, though it can be a bit tricky if
that is a programming language pointer.  This is at the RDF abstract
syntax level - for update, it's the syntax that's being manipulated - so
the behaviour of bnodes in entailment is not relevant here.

RDF 1.1 says "the set of possible blank nodes is arbitrary" so each time 
a new one is needed, give it a system id. (AKA it's not infinite.)  Or a 
touch more formally, there is a 1-1 correspondence between some system 
id scheme and bnodes in use.
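
A minimal sketch of that bookkeeping, in Python purely for illustration.
The class and method names (BNodeTable, new_system_id, load_document) and
the "b1", "b2", ... id format are assumptions made for this example, not
how any particular server does it.  It shows document-scoped labels being
mapped to fresh system ids, so that "_:a" in two different files ends up
as two different bnodes:

```python
import itertools


class BNodeTable:
    """Hypothetical server-side bookkeeping: one system id per blank node."""

    def __init__(self):
        # A simple counter stands in for whatever id scheme a real system uses.
        self._next_id = itertools.count(1)

    def new_system_id(self):
        # Every new blank node gets a fresh id, keeping ids and bnodes 1-1.
        return "b{}".format(next(self._next_id))

    def load_document(self, labels):
        # Labels are scoped to a single document, so each load starts with
        # an empty label -> system id mapping.
        scope = {}
        for label in labels:
            if label not in scope:
                scope[label] = self.new_system_id()
        return scope


table = BNodeTable()
file1 = table.load_document(["_:a", "_:b", "_:a"])
file2 = table.load_document(["_:a"])
print(file1["_:a"], file2["_:a"])  # prints "b1 b3"
```

Bnodes written as () or [] have no label at all; in this sketch they
would simply be given an id by calling new_system_id() directly.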

Now a client app needs to find the system id - it needs this for () and 
[] regardless so it has to be solved - and there needs to be a way to 
get the ref back in again.  It also needs to be able to use it later and 
just using the bNode in SPARQL syntax does not work (SPARQL patterns 
make _:xyz a variable).  And do it without results encoding losing the
details (using _:a does not work for that reason).

And you'll want a standards(-ish) compliant way no doubt.

RDF 1.1 skolemization helps.  The SPARQL function URI(...) is undefined 
for a bnode so a legal extension is to return a skolem URI for URI(bnode).

Any node in the graph can be exported with:

(IF(isLiteral(?x), ?x, URI(?x)) AS ?export)

This IRI helps with the last part.  On input, spot the special IRIs
and convert back to the internal system id for updates and for pattern
matching.
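
The round trip can be sketched as follows, again in Python for
illustration only.  Terms are modelled as (kind, value) pairs, where a
bnode's value is its system id; GENID_PREFIX uses a made-up example.org
authority with the /.well-known/genid/ path that RDF 1.1 suggests for
skolem IRIs; export_term and import_term are hypothetical names.  The
export side mirrors the IF(isLiteral(?x), ?x, URI(?x)) expression, with
URI(bnode) extended to return a skolem IRI:

```python
# Hypothetical skolem IRI prefix; a real server would use its own authority.
GENID_PREFIX = "http://example.org/.well-known/genid/"


def export_term(kind, value):
    """Export a term: literals and IRIs pass through, bnodes become skolem IRIs."""
    if kind == "bnode":
        # value is the internal system id of the blank node.
        return ("iri", GENID_PREFIX + value)
    return (kind, value)


def import_term(kind, value):
    """On input, spot skolem IRIs and convert them back to the system id."""
    if kind == "iri" and value.startswith(GENID_PREFIX):
        return ("bnode", value[len(GENID_PREFIX):])
    return (kind, value)


exported = export_term("bnode", "b3")
print(exported)                # ('iri', 'http://example.org/.well-known/genid/b3')
print(import_term(*exported))  # ('bnode', 'b3')
```

Ordinary IRIs and literals are untouched in both directions, so only
the reserved prefix needs special handling on the server.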

Jena uses <_:...> as the URI scheme (yes - it's illegal URI syntax;
RDF 1.1 skolemization is better in that respect) and overloads URI(...)
for this.

So the change is "original label" to "find it again" but that is needed 
for () and if client A inserts something that client B wants to deal with.

 Andy

On 20/05/15 22:59, David Booth wrote:
> What do people think about this potential approach for supporting
> followup queries and PATCH operations involving blank nodes?
>
> David Booth
>
> -------- Forwarded Message --------
> Subject: Re: deterministic naming of blank nodes
> Date: Wed, 20 May 2015 17:55:30 -0400
> From: David Booth <david@dbooth.org>
> To: Gregg Kellogg <gregg@greggkellogg.net>, henry.story@bblfish.net
> <henry.story@bblfish.net>
> CC: ahogan@dcc.uchile.cl, semantic-web@w3.org
>
> Hi Gregg,
>
> On 05/20/2015 05:07 PM, Gregg Kellogg wrote:
> [ . . . ]
>> Using a PATCH, you can’t reference existing BNodes. You _might_ be
>> able to in an UPDATE, as I indicated. A PATCH is typically described
>> as a series of deletes and adds (depending on the particular format);
>> for most cases SPARQL Update probably covers the use cases better.
>>
>> My interpretation is that the deletes of a PATCH can’t really
>> reference BNodes in any way, unless the entire graph is removed.
>> Updates can, but they’ll always create new BNodes and can’t match
>> against existing ones.
>
> If both client and server are working from the same known base-point,
> and the PATCH is viewed as operating on the canonical *serialization* of
> an RDF graph, such as in N-Triples, then it should work fine, just as
> with any other text file.
>
> However, currently there would be a problem in applying that PATCH to
> RDF that is stored in a SPARQL server, because currently there is no
> standard way to directly refer to a bnode from a separate SPARQL
> operation.  This is a known problem already with SPARQL, which causes
> grief when doing followup queries.  But if SPARQL servers were enhanced
> to (optionally) enable subsequent queries or update operations to refer
> directly to blank nodes by their *original* labels, then both PATCH and
> followup queries would work on SPARQL servers.  (In the case of implicit
> bnodes generated by Turtle/SPARQL [] or () notation the server would
> assign an original label.)  This seems like a good route to take, though
> it means adding that feature to SPARQL.  I'll send this to the SPARQL list
> https://lists.w3.org/Archives/Public/public-sparql-dev/
> to see what others there think.
>
> David Booth
>
>
>
>
Received on Thursday, 21 May 2015 08:39:09 UTC