Re: Turtle Patch simplification (N3 Patch?) from Reto Gmür on 2014-09-24 (semantic-web@w3.org from September 2014)

From: Reto Gmür <reto@wymiwyg.com>
Date: Wed, 24 Sep 2014 22:26:06 +0200
To: "henry.story@bblfish.net" <henry.story@bblfish.net>
Cc: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>, "public-ldp-comments@w3.org" <public-ldp-comments@w3.org>, public-ldp <public-ldp@w3.org>, Semantic Web <semantic-web@w3.org>
Message-ID: <CALvhUEUXWc1Q6E+RidLFPXD4VJhzgbr29kD4LWvKAj9vji0EPw@mail.gmail.com>
Hi,

I think the wish to "remove triples containing a specific blank node" is
misguided and is based on a wrong conception of the semantics of blank nodes.

Now I know there's the never-ending discussion about blank nodes being too
complicated. I'm an advocate of freedom: if you think blank nodes are too
complex, don't use them. Use UUIDs or whatever IRIs you want, and then you have
your simple version. You don't have patching difficulties, you have low
computational complexity, you can use Unix diff tools on sorted N-Triples
files, and you can stop reading here.

Otherwise...

_:1 a ex:Cat.

says that there is something that is a cat.

If you come to the conclusion that there is nothing that is a cat, fine:

DELETE
 { ?s a ex:Cat }
WHERE
 { ?s a ex:Cat }

Of course after this also http://fifi.me/ will no longer be of rdf:type
ex:Cat, but that's a consequence of our finding that nothing is a cat.

On the other hand, if we have


_:1 a ex:Cat.
_:1 foaf:name "Fifi".
_:2 a ex:Cat.
_:2 foaf:name "Tiger".

And we found out that Tiger is not a cat but a SpyRobot:

DELETE
 { ?s a ex:Cat }
INSERT
 { ?s a ex:SpyRobot }
WHERE
 { ?s a ex:Cat.
 ?s foaf:name "Tiger". }

After that we have an ex:Cat and an ex:SpyRobot (the ex cat).
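
A minimal sketch of why such a pattern-based update never needs to mention a
bnode label, written in plain Python rather than against a real SPARQL engine.
Triples are modelled as tuples of strings, the labels `_:b1` and `_:b2` are
arbitrary names local to this in-memory graph, and the helper `retype_tiger`
is hypothetical; it only mimics the intent of the update above.

```python
# Triples as (subject, predicate, object) tuples. Terms starting with "_:"
# stand for blank nodes; their labels are arbitrary and local to this set.
RDF_TYPE = "rdf:type"

graph = {
    ("_:b1", RDF_TYPE, "ex:Cat"),
    ("_:b1", "foaf:name", '"Fifi"'),
    ("_:b2", RDF_TYPE, "ex:Cat"),
    ("_:b2", "foaf:name", '"Tiger"'),
}

def retype_tiger(triples):
    """Mimic the update above: select by pattern, never by bnode label."""
    # WHERE part: every subject typed ex:Cat that is also named "Tiger".
    matches = {
        s for (s, p, o) in triples
        if p == RDF_TYPE and o == "ex:Cat"
        and (s, "foaf:name", '"Tiger"') in triples
    }
    # DELETE and INSERT templates instantiated for each match.
    deleted = {(s, RDF_TYPE, "ex:Cat") for s in matches}
    inserted = {(s, RDF_TYPE, "ex:SpyRobot") for s in matches}
    return (triples - deleted) | inserted

patched = retype_tiger(graph)
```

Renaming `_:b1` and `_:b2` to any other labels leaves the outcome unchanged up
to that renaming, which is the point: the patch is stated in terms of what the
graph says, not in terms of labels.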

For the graph:

_:1 a ex:Cat.
<http://fifi.me> a ex:Cat.
<http://fifi.me> foaf:name "Fifi".

One cannot send a delete/insert query to remove just the statement with the
bnode. But the thing is that the graph is redundant in the first place, so
removing this triple wouldn't change what the graph actually asserts.
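
To make "redundant" concrete, here is a small brute-force sketch in plain
Python. It assumes the usual notion of leanness (a graph is lean if no instance
of it is a proper subgraph of itself) and tries every mapping of blank nodes to
terms of the graph. The tuple representation and the names `is_bnode` and
`is_lean` are my own illustration, and the search is exponential, so it is only
meant for toy graphs like the ones in this mail.

```python
from itertools import product

def is_bnode(term):
    # Assumption of this sketch: blank nodes are strings starting with "_:".
    return term.startswith("_:")

def is_lean(triples):
    """Return False if some bnode mapping sends the graph into a proper subgraph."""
    triples = set(triples)
    terms = sorted({t for (s, _, o) in triples for t in (s, o)})
    bnodes = [t for t in terms if is_bnode(t)]
    for targets in product(terms, repeat=len(bnodes)):
        mapping = dict(zip(bnodes, targets))
        image = {(mapping.get(s, s), p, mapping.get(o, o))
                 for (s, p, o) in triples}
        if image < triples:  # proper subset: the graph said something twice
            return False
    return True

redundant = {
    ("_:1", "rdf:type", "ex:Cat"),
    ("<http://fifi.me>", "rdf:type", "ex:Cat"),
    ("<http://fifi.me>", "foaf:name", '"Fifi"'),
}

two_cats = {
    ("_:1", "rdf:type", "ex:Cat"),
    ("_:1", "foaf:name", '"Fifi"'),
    ("_:2", "rdf:type", "ex:Cat"),
    ("_:2", "foaf:name", '"Tiger"'),
}

redundant_is_lean = is_lean(redundant)
two_cats_is_lean = is_lean(two_cats)
```

In the first graph the bnode can be mapped onto <http://fifi.me>, so the bnode
triple adds nothing; in the second graph neither bnode can be folded into the
other, because their names differ.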

There could be more efficient ways to express patches that support blank
nodes. For example, they could list the RDF Molecules or Minimum Self-Contained
Graphs that are to be removed and the graphs that should be added. But using
bnode labels misrepresents the purpose of bnodes. Bnode labels are a
syntactic tool, or a tool used when storing triples. If we focus on the
meaning expressed by a graph and describe the intentions of a patch with
respect to that meaning, we see that we no longer need to remove
"specific blank nodes".
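
As a rough illustration of the decomposition idea, the sketch below splits a
graph into groups of triples that are connected through shared blank nodes;
ground triples end up alone. This is a simplified reading of "Minimum Self
Contained Graph", not a reference implementation, and the names `is_bnode` and
`decompose` as well as the tuple representation are assumptions of mine.

```python
def is_bnode(term):
    # Assumption of this sketch: blank nodes are strings starting with "_:".
    return term.startswith("_:")

def decompose(triples):
    """Split a graph into components of triples linked via shared blank nodes."""
    remaining = set(triples)
    components = []
    while remaining:
        seed = remaining.pop()
        component = {seed}
        frontier = {t for t in (seed[0], seed[2]) if is_bnode(t)}
        seen = set(frontier)
        while frontier:
            node = frontier.pop()
            # Pull in every remaining triple that mentions this blank node.
            linked = {t for t in remaining if node in (t[0], t[2])}
            remaining -= linked
            component |= linked
            for (s, _, o) in linked:
                for term in (s, o):
                    if is_bnode(term) and term not in seen:
                        seen.add(term)
                        frontier.add(term)
        components.append(frozenset(component))
    return components

graph = {
    ("_:1", "rdf:type", "ex:Cat"),
    ("_:1", "foaf:name", '"Fifi"'),
    ("_:2", "rdf:type", "ex:Cat"),
    ("_:2", "foaf:name", '"Tiger"'),
    ("<http://fifi.me>", "rdf:type", "ex:Cat"),
}

components = decompose(graph)
```

A patch format built on this would name whole components to remove and to add,
so the labels inside each component would stay a purely local matter.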

Cheers,
Reto


On Tue, Sep 23, 2014 at 10:01 AM, henry.story@bblfish.net <
henry.story@bblfish.net> wrote:

>
> On 23 Sep 2014, at 00:40, Pierre-Antoine Champin <
> pierre-antoine.champin@liris.cnrs.fr> wrote:
>
> Hi Henry,
>
> On Mon, Sep 22, 2014 at 4:06 PM, henry.story@bblfish.net <
> henry.story@bblfish.net> wrote:
>
>>
>>  On 22 Sep 2014, at 09:47, Pierre-Antoine Champin <
>> pierre-antoine.champin@liris.cnrs.fr> wrote:
>>
>>  Hi Henry,
>>
>> On Sat, Sep 20, 2014 at 12:57 PM, henry.story@bblfish.net <
>> henry.story@bblfish.net> wrote:
>>
>>> Turtle Patch [1] makes it obvious that being able to name the bnode
>>> URIs in a patch request makes patching exceedingly easy to
>>> implement as well as very easy to create patches. All that is needed for
>>> a client implementation is to know what triples it needs to remove:
>>> there is no need to find a pattern that would identify those triples
>>> among all others. The main problem with Turtle Patch is that it
>>> requires one to then make a new HTTP request with genid blank nodes,
>>> to get the patchable format of the resource, or to always request them
>>> and then sadly turn all bnodes de facto into URIs.
>>>
>>
>>  Agreed, although I'm not sure I'm coming to the same conclusions as you do
>> below...
>>
>>>
>>> Where there is usually a valid concern about naming bnodes - the
>>> point of bnodes is that the server publishing them should not have
>>> to
>>> maintain references to them ad eternam
>>
>>
>>  Who says I should maintain them *ad eternam*??
>>  In my understanding, skolemization does not require the genid URIs to
>> be the same across all successive states (versions) of a given resource.
>> They only have to persist as long as the resource does not change, so that
>> one can safely patch it?
>>
>>
>>  I find genids a pretty hackish part of the RDF 1.1 spec, frankly. Genids
>> are apparently recognised by analysing the schema
>> of the URI, which is pretty much against web architecture.
>> http://www.w3.org/TR/rdf11-concepts/#section-skolemization
>>
>>  So now every RDF linked data client would need to look at each URI to
>> see if it contains a ".well-known/genid" string to know if it should follow
>> it
>> or not. That's pretty un linked-data-ish. Frankly I am quite surprised it
>> made its way through to the spec. The people supporting it
>> must have made a lot of noise.
>>
>
> [snipped: moved response as an example of an interpretation that
> illustrates a bug in the RDF 1.1 spec:
>
> http://lists.w3.org/Archives/Public/public-rdf-comments/2014Sep/0002.html
> ]
>
>
>>
>>
>>
>>> - in the case of a PATCH
>>> the action is directly on the graph in question, and so there is in this
>>> case no problem of cross reference with other resources.
>>>
>>
>>> Given this I think one should be able to have a simpler version of Turtle
>>> patch without the need for genids, that keep the bnodes local to the
>>> resource and that also don't require the extra request to be made to the
>>> server ( the one that is required to GET the graphs with the genids using
>>> the "Prefer: return=representation blank-nodes=use-genid" header ).
>>>
>>> It could simply be decided that a resource that advertises the given
>>> Patch Format - let's call it N3-Patch - understands there to be an
>>> automatic
>>> mapping from the order of bnodes in their representation to a set
>>> of explicit bnodes such as _:bn1 to _:bnN . One could then have
>>> something like
>>> the following document at /asterix
>>>
>>
>>  Well, in my opinion, we are back to problem #1:
>>  you ask the server to maintain an *order*, while the underlying data
>> model (RDF abstract syntax) has no such notion.
>>  So you still put an extra burden on the server.
>>
>>
>>  No, the ordering can be agreed to as part of the protocol. The client or
>> the server would only need to
>> work on the ordering if the patch had blank nodes. If it did then the
>> nodes could be ordered. The server
>> could cache the order.
>>
>
> In my view, it *has* to cache the order (see argument above). I'm not
> saying this is impossible to achieve (I agree with your arguments below as
> to how to do it), I'm only saying that it is not a standard functionality
> of triple stores, so you have to change them (even if slightly). And I do
> not consider that as something that should become a standard feature, since
> this notion of order is not part of the RDF model.
>
>
> It would not need any change to triple stores. You just need a function
> to create an isomorphic graph
>
>   canonicaliseBnodes : Graph -> Graph
>
> which takes a graph and returns a graph with canonicalised bnodes that is
> isomorphic to the first graph.
>
> If you prefer to use skolemized URIs ( and in my view on the condition
> that you
> write up a bnode: URN standard which takes expiry time of bnodes into
> account )
> then you could have the equivalent
>
>   skolemize: Graph -> Graph
>
> though in this case the graphs are no longer isomorphic under the
> definition of
> http://www.w3.org/TR/rdf11-concepts/#graph-isomorphism
> You need a stronger semantic level notion of isomorphism.
> ( and you know how queasy web2.0 people get on hearing the word
> "semantic" )
>
>
>  best
>
>
>> The nice thing is that there is no need for an extra GET to skolemise the
>> nodes,
>> and on the internet it is http connections that are the slowest of all.
>>
>>
>>  Granted, it is not a huge deal, and is probably a good way to keep the
>> extra information that allows you to skolemize bnodes, but this is still
>> not something that you get out of the box for standard triple stores.
>>
>>
>>  I think you could calculate it from a graph without needing to change
>> your triple store.
>> One would just need to start from something like
>>   http://www.hpl.hp.com/techreports/2003/HPL-2003-142.pdf
>> where the algorithm could be followed by just using triples.
>>
>
>>  And in our case we are just interested in an ordering of blank nodes.
>>
>>  There are more recent works on this too.
>>
>>
>>    pa
>>
>>
>>> GET /asterix HTTP/1.1
>>>
>>>
>>> HTTP/1.1 200 OK
>>> ETag: "slab v2"
>>>
>>> [] foaf:name "Asterix";
>>>    foaf:knows [ foaf:name "Julius Caesar";
>>>                 foaf:homePage <http://palace.rome/> ].
>>>
>>> [] foaf:name "Obelix" .
>>>
>>> And patch it with the following PATCH Request
>>>
>>> PATCH /asterix HTTP/1.1
>>> Content-Type: text/n3
>>> If-Match: "slab v2"
>>>
>>> @prefix foaf: <http://xmlns.com/foaf/0.1/> .
>>>
>>> { _:bnx1 foaf:knows _:bnx2 } patch:replaceWith { _:bnx1 foaf:knows
>>> _:bnx3 } .
>>>
>>> Which would result in the following response to a subsequent request
>>>
>>> GET /asterix HTTP/1.1
>>> Accept: text/turtle
>>>
>>> HTTP/1.1 200 OK
>>> Content-Type: text/turtle
>>>
>>> [] foaf:name "Asterix";
>>>    foaf:knows [ foaf:name "Obelix" ].
>>>
>>> [] foaf:name "Julius Caesar";
>>>    foaf:homePage <http://palace.rome/> .
>>>
>>>
>>> Henry
>>>
>>>
>>> [1] http://www.w3.org/2001/sw/wiki/TurtlePatch
>>>
>>>
>>> Social Web Architect
>>> http://bblfish.net/
>>>
>>>
>>>
>>
>>        Social Web Architect
>> http://bblfish.net/
>>
>>
>
> Social Web Architect
> http://bblfish.net/
>
>
Received on Wednesday, 24 September 2014 20:26:33 UTC