- From: Sandro Hawke <sandro@w3.org>
- Date: Sat, 14 Sep 2013 21:40:23 -0400
- To: public-ldp-patch@w3.org
- CC: Eric Prud'hommeaux <eric@w3.org>, Tim Berners-Lee <timbl@w3.org>, Linked Data Platform WG <public-ldp-wg@w3.org>
There have been some good emails on public-ldp-patch, and there was some good discussion at F2F4. Here's where I think we are. I don't know of anything in this email that anyone would disagree with (that is, I'm trying to summarize consensus), and I end with a suggested path forward. I think the biggest challenge we face -- and the challenge that divided me and Eric at the meeting -- is how to patch triples that involve blank nodes. There seem to be two approaches: 1. Require the client to create a graph pattern (a "where clause") which unambiguously identifies the blank nodes involved in the triples to be updated, and require the server to use that graph pattern to find those blank nodes in the graph being patched. 2. Require that during the conversation that ends up involving patching, both parties use the same mapping from blank node labels to blank nodes. Option 1 is a good fit for SPARQL. SPARQL servers naturally do that graph matching. In contract, standard SPARQL servers don't have any way to share blank node scope as required for option 2. That kind of exposure of blank node labels has traditionally been avoided in the design of RDF systems. However, the worst-case performance with option 1 is exponential. If a triple to be updated is in the middle of a large cloud of blank nodes, then matching the where-clause might not be possible before we all die of old age. (It's an extremely well studied problem in computer science; I'm not an expert, but I think I'm reading the results correctly.) No one has offered data about how often this worst-case behavior might be a problem in practice. Arguably we're still in the early days, so it's too soon to know how painful this restriction might turn out to be. Some people said that the server can just set a time limit and reject patches that end up taking too long. Other people (me) replied that makes the overall system too unpredictable, that systems should be able to send patches with confidence, especially one server to another. As I said at the meeting, I don't know if this worst-case performance will turn out to be a problem, but I'm concerned enough about it that I can't +1 option 1, and don't want my name on a spec based on it. David reported at the meeting that Google's internal culture generally forbids using exponential algorithms, so we might expect if they were in the group they would formally object to option 1 (or just decide to never use it, which amounts to the same thing). Our anecdotal reports that they don't use SPARQL support this hearsay, but as long is it remains hearsay, we probably shouldn't take it too seriously. Which brings me to the proposal. Let's move forward with both Option 1 *and* Option 2, marking them both "at risk" in the spec. That gives us the whole Last Call and Candidate Recommendation periods to gather input on how bad the exponential performance issue is for Option 1 and how bad the implementation challenge is for Option 2 (how hard it is to get RDF systems to share scope in blank node labels). Then at the end of CR, we can decide if either of them is good enough to normatively reference as the basic LDP patch format. If they both end up implemented and with people liking them, then we just pick one, so the folks don't have to implement both going forward. If neither of them is implemented and liked, then we're back to where we are today, with no standard patch format for LDP, but some more data on why it's hard. How's that sound? I imagine Option 1 would end up as some subset of SPARQL Update, like TurtlePatch [1] plus variables or like Eric presented at the meeting. I imagine for Option 2 we'd have something like Andy and Rob's RDFPatch [2] or my old GRUF [3] (which I'd forgotten about until reading RDFPatch). -- Sandro [1] http://www.w3.org/2001/sw/wiki/TurtlePatch [2] http://afs.github.io/rdf-patch [3] http://websub.org/wiki/GRUF (from Apr 2010)
Received on Sunday, 15 September 2013 01:40:38 UTC