Re: the state of ldp-patch, and a procedural proposal from Steve Speicher on 2013-10-02 (public-ldp-wg@w3.org from October 2013)

From: Steve Speicher <sspeiche@gmail.com>
Date: Wed, 2 Oct 2013 13:08:21 -0400
To: Sandro Hawke <sandro@w3.org>
Cc: public-ldp-patch@w3.org, "Eric Prud'hommeaux" <eric@w3.org>, Tim Berners-Lee <timbl@w3.org>, Linked Data Platform WG <public-ldp-wg@w3.org>
Message-ID: <CAOUJ7Jqum8wac7CFMPUok24sguQCKR1dQNXMUK5DkbaHc=yz6Q@mail.gmail.com>
On Wed, Oct 2, 2013 at 1:01 PM, Sandro Hawke <sandro@w3.org> wrote:

>  On 10/02/2013 10:54 AM, Steve Speicher wrote:
>
> Sandro,
>
>  My typical resource graphs and patch scenarios have led me to an
> approach [1] somewhat similar to option #1.
> My approach [1] is to follow a very simple model such as:
>   a) here are the triples to remove from the graph (exactly, no dependency
> on blank node labels)
>
>
> So your data has no blank nodes, right?
>

No, it has some blank nodes, but their usage is somewhat limited.  Dare I
mention that our resources have some reification statements, where we just
key on the reified statement to find the right triples to modify.


>
>
>    b) here are the triples to add to the graph
> This seems to hit nearly 100% of my cases.  To be clear, this has not been
> widely deployed, so the number of cases and types of resources is limited.
>
>  After polling another team that is using the LDP approach, they in fact
> don't support PUT for updating resources but PATCH only.  In their model,
> they reused an existing RDF format and defined some simple patterns (such
> as a triple in the patch document that matches subject and predicate with
> triples in the graph, remove those matched triples and replace with new
> triple).  This group doesn't use SPARQL but stores RDF data natively.  This
> team expressed some concern about library/tool-generated PATCH documents in
> a SPARQL-like format, mostly based on the complexity of the format and the
> overhead of client libraries, along with potential errors.
>
>
> Is their data also free from blank nodes?
>

It is not, but they feel their approach could easily support blank nodes.
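
For illustration only, the a)/b) model quoted above (triples to remove,
then triples to add) might be sketched as follows.  This is not the OSLC
partial-update format from [1]: the tuple representation of triples, the
"_:" test for blank nodes, the name apply_patch, and the choice to reject
a patch whose removals are missing from the graph are all assumptions
made for the sketch.

```python
from typing import FrozenSet, Iterable, Tuple

# Assumption: a triple is a plain (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]


def is_blank(term: str) -> bool:
    # Assumption: blank nodes are written with the usual "_:" label prefix.
    return term.startswith("_:")


def apply_patch(
    graph: Iterable[Triple],
    remove: Iterable[Triple],
    add: Iterable[Triple],
) -> FrozenSet[Triple]:
    """Return a new graph with `remove` taken out and `add` put in.

    Removals are matched exactly, so they must be ground triples:
    a blank node label in the patch would not be meaningful to the server.
    """
    current = set(graph)
    to_remove = set(remove)
    for triple in to_remove:
        if any(is_blank(term) for term in triple):
            raise ValueError(f"cannot match a blank node exactly: {triple}")
        if triple not in current:
            # Assumption: a stale patch is rejected rather than silently ignored.
            raise ValueError(f"triple to remove is not in the graph: {triple}")
    return frozenset((current - to_remove) | set(add))


graph = {
    ("ex:bug1", "dc:title", "Old title"),
    ("ex:bug1", "ex:status", "open"),
}
patched = apply_patch(
    graph,
    remove=[("ex:bug1", "dc:title", "Old title")],
    add=[("ex:bug1", "dc:title", "New title")],
)
```

Removing a triple that mentions a blank node is refused here on purpose;
that is the case the two options discussed in this thread are meant to
handle.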

- Steve Speicher



>
> Thanks.
>
>       -- Sandro
>
>
>  Just some feedback.
>
>  [1] - http://open-services.net/wiki/core/OSLC-Core-Partial-Update/
>
>  - Steve Speicher
>
>
> On Sat, Sep 14, 2013 at 9:40 PM, Sandro Hawke <sandro@w3.org> wrote:
>
>> There have been some good emails on public-ldp-patch, and there was some
>> good discussion at F2F4.   Here's where I think we are.   I don't know of
>> anything in this email that anyone would disagree with (that is, I'm trying
>> to summarize consensus), and I end with a suggested path forward.
>>
>> I think the biggest challenge we face -- and the challenge that divided
>> me and Eric at the meeting -- is how to patch triples that involve blank
>> nodes.   There seem to be two approaches:
>>
>> 1.  Require the client to create a graph pattern (a "where clause") which
>> unambiguously identifies the blank nodes involved in the triples to be
>> updated, and require the server to use that graph pattern to find those
>> blank nodes in the graph being patched.
>>
>> 2.  Require that during the conversation that ends up involving patching,
>> both parties use the same mapping from blank node labels to blank nodes.
>>
>> Option 1 is a good fit for SPARQL.   SPARQL servers naturally do that
>> graph matching.  In contrast, standard SPARQL servers don't have any way to
>> share blank node scope as required for option 2. That kind of exposure of
>> blank node labels has traditionally been avoided in the design of RDF
>> systems.
>>
>> However, the worst-case performance with option 1 is exponential. If a
>> triple to be updated is in the middle of a large cloud of blank nodes, then
>> matching the where-clause might not be possible before we all die of old
>> age.  (It's an extremely well studied problem in computer science; I'm not
>> an expert, but I think I'm reading the results correctly.)
>>
>> No one has offered data about how often this worst-case behavior might be
>> a problem in practice.  Arguably we're still in the early days, so it's too
>> soon to know how painful this restriction might turn out to be.
>>
>> Some people said that the server can just set a time limit and reject
>> patches that end up taking too long.   Other people (me) replied that makes
>> the overall system too unpredictable, that systems should be able to send
>> patches with confidence, especially one server to another.  As I said at
>> the meeting, I don't know if this worst-case performance will turn out to
>> be a problem, but I'm concerned enough about it that I can't +1 option 1,
>> and don't want my name on a spec based on it.  David reported at the
>> meeting that Google's internal culture generally forbids using exponential
>> algorithms, so we might expect if they were in the group they would
>> formally object to option 1 (or just decide to never use it, which amounts
>> to the same thing).  Our anecdotal reports that they don't use SPARQL
>> support this hearsay, but as long as it remains hearsay, we probably
>> shouldn't take it too seriously.
>>
>> Which brings me to the proposal.
>>
>> Let's move forward with both Option 1 *and* Option 2, marking them both
>> "at risk" in the spec.   That gives us the whole Last Call and Candidate
>> Recommendation periods to gather input on how bad the exponential
>> performance issue is for Option 1 and how bad the implementation challenge
>> is for Option 2 (how hard it is to get RDF systems to share scope in blank
>> node labels).
>>
>> Then at the end of CR, we can decide if either of them is good enough to
>> normatively reference as the basic LDP patch format.   If they both end up
>> implemented and with people liking them, then we just pick one, so the
>> folks don't have to implement both going forward.    If neither of them is
>> implemented and liked, then we're back to where we are today, with no
>> standard patch format for LDP, but some more data on why it's hard.
>>
>> How's that sound?
>>
>> I imagine Option 1 would end up as some subset of SPARQL Update, like
>> TurtlePatch  [1] plus variables or like Eric presented at the meeting.  I
>> imagine for Option 2 we'd have something like Andy and Rob's RDFPatch [2]
>> or my old GRUF [3] (which I'd forgotten about until reading RDFPatch).
>>
>>     -- Sandro
>>
>> [1]  http://www.w3.org/2001/sw/wiki/TurtlePatch
>> [2]  http://afs.github.io/rdf-patch
>> [3]  http://websub.org/wiki/GRUF (from Apr 2010)
>>
>>
>>
>>
>
>
Received on Wednesday, 2 October 2013 17:08:49 UTC