- From: Sandro Hawke <sandro@w3.org>
- Date: Wed, 02 Oct 2013 13:01:40 -0400
- To: Steve Speicher <sspeiche@gmail.com>
- CC: public-ldp-patch@w3.org, Eric Prud'hommeaux <eric@w3.org>, Tim Berners-Lee <timbl@w3.org>, Linked Data Platform WG <public-ldp-wg@w3.org>
- Message-ID: <524C5174.1060806@w3.org>
On 10/02/2013 10:54 AM, Steve Speicher wrote:
> Sandro,
>
> My typical resource graphs and patch scenarios have led me to an
> approach [1] somewhat similar to option #1. My approach [1] follows
> a very simple model:
> a) here are the triples to remove from the graph (matched exactly,
> with no dependency on blank node labels)

So your data has no blank nodes, right?

> b) here are the triples to add to the graph
>
> This seems to hit nearly 100% of my cases. To be clear, this has not
> been widely deployed, so the number of cases and types of resources
> is limited.
>
> After polling another team that is using the LDP approach: they in
> fact don't support PUT for updating resources, only PATCH. In their
> model, they reused an existing RDF format and defined some simple
> patterns (for example: where a triple in the patch document matches
> triples in the graph on subject and predicate, remove those matched
> triples and replace them with the new triple). This group doesn't
> use SPARQL but stores RDF data natively. The team expressed some
> concern about library/tool-generated PATCH documents in a
> SPARQL-like format, mostly founded on the complexity of the format
> and the overhead of client libraries, along with the potential for
> errors.

Is their data also free from blank nodes?

Thanks.

     -- Sandro

> Just some feedback.
>
> [1] - http://open-services.net/wiki/core/OSLC-Core-Partial-Update/
>
> - Steve Speicher
>
>
> On Sat, Sep 14, 2013 at 9:40 PM, Sandro Hawke <sandro@w3.org
> <mailto:sandro@w3.org>> wrote:
>
> There have been some good emails on public-ldp-patch, and there was
> some good discussion at F2F4. Here's where I think we are. I don't
> know of anything in this email that anyone would disagree with (that
> is, I'm trying to summarize consensus), and I end with a suggested
> path forward.
>
> I think the biggest challenge we face -- and the challenge that
> divided me and Eric at the meeting -- is how to patch triples that
> involve blank nodes. There seem to be two approaches:
>
> 1. Require the client to create a graph pattern (a "where clause")
> which unambiguously identifies the blank nodes involved in the
> triples to be updated, and require the server to use that graph
> pattern to find those blank nodes in the graph being patched.
>
> 2. Require that, during the conversation that ends up involving
> patching, both parties use the same mapping from blank node labels
> to blank nodes.
>
> Option 1 is a good fit for SPARQL: SPARQL servers naturally do that
> graph matching. In contrast, standard SPARQL servers don't have any
> way to share blank node scope as required for option 2; that kind of
> exposure of blank node labels has traditionally been avoided in the
> design of RDF systems.
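(To make option 1 concrete: such a patch might look roughly like the
following SPARQL Update, with made-up URIs. The WHERE clause is the
graph pattern that unambiguously pins down the blank node:

    PREFIX ex: <http://example.org/vocab#>
    DELETE { ?addr ex:zip "02139" }
    INSERT { ?addr ex:zip "02144" }
    WHERE {
      <http://example.org/alice> ex:home ?addr .
      ?addr ex:zip "02139" .
    }

Here ?addr binds to a blank node on the server; the client never
needs a stable label for it, only a pattern that identifies it.)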
> However, the worst-case performance with option 1 is exponential.
> If a triple to be updated is in the middle of a large cloud of blank
> nodes, then matching the where-clause might not be possible before
> we all die of old age. (Matching a pattern against mutually
> indistinguishable blank nodes is a form of subgraph matching, an
> extremely well-studied problem in computer science; I'm not an
> expert, but I think I'm reading the results correctly.)
>
> No one has offered data about how often this worst-case behavior
> might be a problem in practice. Arguably we're still in the early
> days, so it's too soon to know how painful this restriction might
> turn out to be.
>
> Some people said that the server can just set a time limit and
> reject patches that end up taking too long. Other people (me)
> replied that that makes the overall system too unpredictable:
> systems should be able to send patches with confidence, especially
> one server to another. As I said at the meeting, I don't know
> whether this worst-case performance will turn out to be a problem,
> but I'm concerned enough about it that I can't +1 option 1, and I
> don't want my name on a spec based on it. David reported at the
> meeting that Google's internal culture generally forbids using
> exponential algorithms, so we might expect that, if they were in the
> group, they would formally object to option 1 (or just decide never
> to use it, which amounts to the same thing). Anecdotal reports that
> they don't use SPARQL support this, but as long as it remains
> hearsay, we probably shouldn't take it too seriously.
>
> Which brings me to the proposal.
>
> Let's move forward with both Option 1 *and* Option 2, marking them
> both "at risk" in the spec. That gives us the whole Last Call and
> Candidate Recommendation periods to gather input on how bad the
> exponential performance issue is for Option 1, and on how bad the
> implementation challenge is for Option 2 (that is, how hard it is to
> get RDF systems to share the scope of blank node labels).
>
> Then at the end of CR, we can decide whether either of them is good
> enough to normatively reference as the basic LDP patch format. If
> both end up implemented, and people like them, then we just pick
> one, so that folks don't have to implement both going forward. If
> neither of them is implemented and liked, then we're back to where
> we are today, with no standard patch format for LDP, but with some
> more data on why it's hard.
>
> How's that sound?
>
> I imagine Option 1 would end up as some subset of SPARQL Update,
> like TurtlePatch [1] plus variables, or like what Eric presented at
> the meeting. For Option 2, I imagine something like Andy and Rob's
> RDFPatch [2] or my old GRUF [3] (which I'd forgotten about until
> reading RDFPatch).
>
>      -- Sandro
>
> [1] http://www.w3.org/2001/sw/wiki/TurtlePatch
> [2] http://afs.github.io/rdf-patch
> [3] http://websub.org/wiki/GRUF (from Apr 2010)
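(For contrast, an option 2 patch in the spirit of RDFPatch [2] might
look roughly like the following, assuming client and server share the
blank node label _:addr1 from an earlier GET response; "D" rows
delete triples and "A" rows add them:

    D _:addr1 <http://example.org/vocab#zip> "02139" .
    A _:addr1 <http://example.org/vocab#zip> "02144" .

Because _:addr1 names the same blank node on both sides, no pattern
matching is needed, and applying the patch is linear in its size.
Note that Steve's a)/b) model above is the blank-node-free case of
exactly this kind of format.)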
Received on Wednesday, 2 October 2013 17:01:48 UTC