- From: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
- Date: Mon, 16 Sep 2013 22:35:24 +0200
- To: Sandro Hawke <sandro@w3.org>
- Cc: "public-ldp-patch@w3.org" <public-ldp-patch@w3.org>, "Eric Prud'hommeaux" <eric@w3.org>, Tim Berners-Lee <timbl@w3.org>
- Message-ID: <CA+OuRR_abTt_eK7dOGSpVBuWdHvP+kU24bh915=sHyfsT+N3fg@mail.gmail.com>
Hi Sandro, all, unfortunately, I was not with you at the F2F and could not participate in those discussions. I'm still uncomfortable with option 2 (as expressed in [1]), and I think there can be a middle ground between it and the exponential complexity of option 1. In fact I think that we can do with a PATCH language that is not able to express *any* transformation of any graph into any other (but for corner cases, we can still rely on PUT of full SPARQL update...). I implemented an extension of Andy and Rob's proposal [2], musing with my idea that the PATCH language could treat lists as first-class citizens to reduce the need to handle bnodes directly. This language uses a very simple form of WHERE clause, in the form of property paths: not very expressive, but hopefully expressive enough for most cases (bnodes in tree-like structures or lists, and/or with functional or inverse-functional properties). Have a look at the README, or the example in the test directory. What do you think? pa [1] http://lists.w3.org/Archives/Public/public-ldp-patch/2013Sep/0017.html [2] https://github.com/pchampin/rdfpatch On Sun, Sep 15, 2013 at 3:40 AM, Sandro Hawke <sandro@w3.org> wrote: > There have been some good emails on public-ldp-patch, and there was some > good discussion at F2F4. Here's where I think we are. I don't know > of anything in this email that anyone would disagree with (that is, I'm > trying to summarize consensus), and I end with a suggested path forward. > > I think the biggest challenge we face -- and the challenge that divided > me and Eric at the meeting -- is how to patch triples that involve blank > nodes. There seem to be two approaches: > > 1. Require the client to create a graph pattern (a "where clause") > which unambiguously identifies the blank nodes involved in the triples > to be updated, and require the server to use that graph pattern to find > those blank nodes in the graph being patched. > > 2. Require that during the conversation that ends up involving > patching, both parties use the same mapping from blank node labels to > blank nodes. > > Option 1 is a good fit for SPARQL. SPARQL servers naturally do that > graph matching. In contract, standard SPARQL servers don't have any way > to share blank node scope as required for option 2. That kind of > exposure of blank node labels has traditionally been avoided in the > design of RDF systems. > > However, the worst-case performance with option 1 is exponential. If a > triple to be updated is in the middle of a large cloud of blank nodes, > then matching the where-clause might not be possible before we all die > of old age. (It's an extremely well studied problem in computer > science; I'm not an expert, but I think I'm reading the results correctly.) > > No one has offered data about how often this worst-case behavior might > be a problem in practice. Arguably we're still in the early days, so > it's too soon to know how painful this restriction might turn out to be. > > Some people said that the server can just set a time limit and reject > patches that end up taking too long. Other people (me) replied that > makes the overall system too unpredictable, that systems should be able > to send patches with confidence, especially one server to another. As I > said at the meeting, I don't know if this worst-case performance will > turn out to be a problem, but I'm concerned enough about it that I can't > +1 option 1, and don't want my name on a spec based on it. David > reported at the meeting that Google's internal culture generally forbids > using exponential algorithms, so we might expect if they were in the > group they would formally object to option 1 (or just decide to never > use it, which amounts to the same thing). Our anecdotal reports that > they don't use SPARQL support this hearsay, but as long is it remains > hearsay, we probably shouldn't take it too seriously. > > Which brings me to the proposal. > > Let's move forward with both Option 1 *and* Option 2, marking them both > "at risk" in the spec. That gives us the whole Last Call and Candidate > Recommendation periods to gather input on how bad the exponential > performance issue is for Option 1 and how bad the implementation > challenge is for Option 2 (how hard it is to get RDF systems to share > scope in blank node labels). > > Then at the end of CR, we can decide if either of them is good enough to > normatively reference as the basic LDP patch format. If they both end > up implemented and with people liking them, then we just pick one, so > the folks don't have to implement both going forward. If neither of > them is implemented and liked, then we're back to where we are today, > with no standard patch format for LDP, but some more data on why it's hard. > > How's that sound? > > I imagine Option 1 would end up as some subset of SPARQL Update, like > TurtlePatch [1] plus variables or like Eric presented at the meeting. > I imagine for Option 2 we'd have something like Andy and Rob's RDFPatch > [2] or my old GRUF [3] (which I'd forgotten about until reading RDFPatch). > > -- Sandro > > [1] http://www.w3.org/2001/sw/wiki/TurtlePatch > [2] http://afs.github.io/rdf-patch > [3] http://websub.org/wiki/GRUF (from Apr 2010) > > > >
Received on Monday, 16 September 2013 20:35:52 UTC