- From: Alexandre Bertails <bertails@w3.org>
- Date: Fri, 18 Oct 2013 14:54:42 -0400
- To: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
- CC: "public-ldp-patch@w3.org" <public-ldp-patch@w3.org>, Linked Data Platform WG <public-ldp-wg@w3.org>
On 10/18/2013 10:05 AM, Alexandre Bertails wrote: > On 10/18/2013 08:13 AM, Pierre-Antoine Champin wrote: >> Alexandre, all, >> >> first thanks for this nice review. >> >> It seems that my own proposal [1] has gone completely unnoticed. >> I'm not taking this personnally, but I would like to hear the feedback >> of the group on this idea, especially the handling of RDF lists, which I >> think are not covered by any of the proposals mentionned by Alexandre. >> > > Effectively, nothing personal, but my apologies for missing yours. I've > created a wiki page with my summary and have already added yours [2]. Note that I've just added TimBL's definition for a nailed graph. It's at http://www.w3.org/2012/ldp/wiki/LDP_PATCH_Proposals#Definition_of_a_nailed_graph_.28TimBL.29 Alexandre. > > Generally, I like this approach. I do like the handling of rdf:list > (despite what the SPARQL crowd says, rdf:list are awesome!) but I > still need to think more about it. It is not a subset of SPARQL Update > but having property paths is already a good way to access some bnodes > without referring to them. I would prefer to avoid relying on > skolemization. The B (Bind) idea looks simple. I believe this can > easily be combined with the simple BGP + nailed nodes approach. > > I see emerging several trends: > > * 2 kind of syntaxes: SPARQL Update subset or RDF PATCH derivatives > > * skolemizing bnodes VS matching bnodes > > * expressive power, ie. how powerful and complex the BGP can be > > * support of specific features, eg. rdf:list > > Maybe we can have polls to make progress? I believe that the very > first question to answer is about skolemization. > > Alexandre. > > [2] > http://www.w3.org/2012/ldp/wiki/LDP_PATCH_Proposals#Pierre-Antoine_Champin.27s_RDF-PATCH > > > > >> pa >> >> [1] >> http://lists.w3.org/Archives/Public/public-ldp-patch/2013Sep/0022.html >> >> >> On Fri, Oct 18, 2013 at 4:57 AM, Alexandre Bertails <bertails@w3.org >> <mailto:bertails@w3.org>> wrote: >> >> Hi guys, >> >> I've been playing with different approaches around LDP PATCH. I >> wanted >> to provide some feedback, both from the user's perspective and from >> the implementer's perspective. >> >> As a user, I'm among the people interested in implementing a >> Decentralized Social Web using a vanilla/generic LDP server, with >> WebID and WebACL. I've mainly focused my own experiments with being >> able to patch WebID profiles. So far in the read-write-web server, >> we've been using full SPARQL but we are interested in a lighter PATCH >> format which would not rely on external libraries. >> >> On 09/14/2013 09:40 PM, Sandro Hawke wrote: >> > There have been some good emails on public-ldp-patch, and there >> was some >> > good discussion at F2F4. Here's where I think we are. I don't >> know >> > of anything in this email that anyone would disagree with (that >> is, I'm >> > trying to summarize consensus), and I end with a suggested path >> forward. >> > >> > I think the biggest challenge we face -- and the challenge that >> divided >> > me and Eric at the meeting -- is how to patch triples that >> involve blank >> > nodes. There seem to be two approaches: >> > >> > 1. Require the client to create a graph pattern (a "where >> clause") >> > which unambiguously identifies the blank nodes involved in the >> triples >> > to be updated, and require the server to use that graph pattern >> to find >> > those blank nodes in the graph being patched. >> > >> > 2. Require that during the conversation that ends up involving >> > patching, both parties use the same mapping from blank node >> labels to >> > blank nodes. >> > >> > Option 1 is a good fit for SPARQL. SPARQL servers naturally >> do that >> > graph matching. In contract, standard SPARQL servers don't have >> any way >> > to share blank node scope as required for option 2. That kind of >> > exposure of blank node labels has traditionally been avoided in >> the >> > design of RDF systems. >> >> General remark: Linked Data (in LDP) is different from general RDF: >> the data lives in "small" HTTP documents, not in "big" RDF store. I >> believe that the problem that SPARQL Update addresses is quite >> different from what we want to achieve with LDP PATCH. Because of >> that, I was against considering a subset of SPARQL Update at first, >> but Eric and my experiments made me change my mind. >> >> > >> > However, the worst-case performance with option 1 is exponential. >> If a >> > triple to be updated is in the middle of a large cloud of blank >> nodes, >> > then matching the where-clause might not be possible before we >> all die >> > of old age. (It's an extremely well studied problem in computer >> > science; I'm not an expert, but I think I'm reading the results >> correctly.) >> >> Following Eric's lead, I've actually started with the BGP approach: >> DELETE + INSERT + WHERE-with-simple-BGP. To address the complexity >> issue, I think we can always add some restriction on the BGP. In >> practice, it depends on the expressive power the people are expecting >> for a PATCH. >> >> > >> > No one has offered data about how often this worst-case behavior >> might >> > be a problem in practice. Arguably we're still in the early >> days, so >> > it's too soon to know how painful this restriction might turn out >> to be. >> > >> > Some people said that the server can just set a time limit and >> reject >> > patches that end up taking too long. Other people (me) >> replied that >> > makes the overall system too unpredictable, that systems should >> be able >> > to send patches with confidence, especially one server to >> another. As I >> > said at the meeting, I don't know if this worst-case >> performance will >> > turn out to be a problem, but I'm concerned enough about it that >> I can't >> > +1 option 1, and don't want my name on a spec based on it. David >> > reported at the meeting that Google's internal culture generally >> forbids >> > using exponential algorithms, so we might expect if they were >> in the >> > group they would formally object to option 1 (or just decide to >> never >> > use it, which amounts to the same thing). Our anecdotal >> reports that >> > they don't use SPARQL support this hearsay, but as long is it >> remains >> > hearsay, we probably shouldn't take it too seriously. >> > >> > Which brings me to the proposal. >> > >> > Let's move forward with both Option 1 *and* Option 2, marking >> them both >> > "at risk" in the spec. That gives us the whole Last Call and >> Candidate >> > Recommendation periods to gather input on how bad the exponential >> > performance issue is for Option 1 and how bad the implementation >> > challenge is for Option 2 (how hard it is to get RDF systems to >> share >> > scope in blank node labels). >> > >> > Then at the end of CR, we can decide if either of them is good >> enough to >> > normatively reference as the basic LDP patch format. If they >> both end >> > up implemented and with people liking them, then we just pick >> one, so >> > the folks don't have to implement both going forward. If >> neither of >> > them is implemented and liked, then we're back to where we are >> today, >> > with no standard patch format for LDP, but some more data on why >> it's hard. >> > >> > How's that sound? >> > >> > I imagine Option 1 would end up as some subset of SPARQL >> Update, like >> > TurtlePatch [1] plus variables or like Eric presented at the >> meeting. I >> > imagine for Option 2 we'd have something like Andy and Rob's >> RDFPatch >> > [2] or my old GRUF [3] (which I'd forgotten about until reading >> RDFPatch). >> >> Here is my reviews on [1] and [2] and some other proposals. Take into >> account that I didn't participate in the conversations on LDP PATCH >> and that there is to knowledge no page gathering all the proposals. >> >> TurtlePatch >> ----------- >> >> Champion: Sandro >> >> Summary: subset of SPARQL Update with INSERT and DELETE clauses. >> >> Example: >> >> [[ >> PREFIX foaf <http://xmlns.com/foaf/0.1/> >> PREFIX s <http://www.w3.org/2000/01/rdf-schema#> >> DELETE DATA { >> <http://www.w3.org/People/Berners-Lee/card#i> foaf:mbox >> <mailto:timbl@w3.org <mailto:timbl@w3.org>> >> } >> INSERT DATA { >> <http://www.w3.org/People/Berners-Lee/card#i> foaf:mbox >> <mailto:timbl@hushmail.com <mailto:timbl@hushmail.com>> >> <http://www.w3.org/People/Berners-Lee/card> s:comment "This >> is my >> general description of myself.\n\nI try to keep data here up to >> date and >> it should be considered authoritative." >> } >> ]] >> >> Pros: >> * can be implemented using full SPARQL implementation >> * easy to implement from scratch (parser + runtime) >> >> Cons: >> * no support for bnodes >> >> Status: >> * I implemented this approach in Banana-RDF >> >> Remark: Sandro talked about "TurtlePatch plus variables" but I'm not >> sure what that means exactly by reading his spec. Until I see a >> solution properly considering bnodes, it will be a -1 for me. >> >> RDF Patch >> --------- >> >> Champion: Andy Seaborne >> >> Summary: diffs for RDF dataset >> >> Example: (A is for Add and D for Delete) >> >> [[ >> A <http://example.org/alice> <http://xmlns.com/foaf/0.1/name> >> "Robert" . >> A <http://example.org/bob> <http://xmlns.com/foaf/0.1/knows> >> <http://example/alice> . >> A <http://example.org/alice> <http://xmlns.com/foaf/0.1/name> >> "Alice" . >> D <http://example.org/bob> <http://xmlns.com/foaf/0.1/name> >> "Robert" . >> A <http://example.org/bob> <http://xmlns.com/foaf/0.1/name> >> "Bob" . >> ]] >> >> Pros: >> * easy to implement from scratch (parser + runtime) >> >> Cons: >> * specified for an RDF dataset, not an LDPR >> * blank nodes are system dependant, so not well specified in the case >> of LDP >> >> Remark: actually pretty much the same than TurtlePatch, but it feels >> like it was written for a different use-case. I don't understand how >> LDP is supposed to communicate stable bnodes label for that solution >> to work. So -1 again for me. >> >> EricP's proposal (sorry, don't have a better name) >> ---------------- >> >> Champion: EricP >> >> Summary: SPARQL subset with DELETE, INSERT and WHERE clause. The >> WHERE >> clause is restricted to a simple BGP with no var-predicates. >> >> Example: >> [[ >> PREFIX foaf: <http://xmlns.com/foaf/0.1/> >> DELETE { ?s foaf:name "Alex" } >> INSERT { ?s foaf:name "Alexandre" } >> WHERE { ?s foaf:name "Alex" } >> ]] >> >> Pros: >> * can be implemented using full SPARQL implementation >> * in practice, still easy enough to implement from scratch (parser + >> runtime) >> * can be used to reach some bnodes >> >> Cons: >> * some queries can be NP-complete (would be good to document one >> example) >> >> Status: >> * I implemented this approach in Banana-RDF >> >> Remark: one could argue that in practice, LDPRs are not supposed >> to be >> crazy big, and that most queries won't end up with the worst-case >> complexity. But I do share the concerns. So it's a +1 for me if the >> group wants to go with it, with some reservations about the >> complexity. >> >> EricP's proposal + pinned nodes >> ------------------------------- >> >> Champion: TimBL >> >> *note: TimBL used another term that "pinned node" but I cannot >> remember it right now :-/ Please someone (Tim?) help me. >> >> Summary: same as previous one, but the BGP returns a single matching >> node. Some additional constraints are put on the query. TimBL wrote >> the algorithm on the whiteboard for me and it made sense. >> >> Pros: >> * can be implemented using full SPARQL implementation >> * in practice, still easy enough to implement from scratch (parser + >> runtime) >> * can be used to reach some bnodes >> * expected not to be NP-complete >> >> Cons: >> * we further restrain the number of bnodes that can be >> matched. Shouldn't be an issue in practice. >> >> Status: >> * I implemented a similar restriction in Banana-RDF when the BGP is a >> tree pattern. >> * TimBL told me that Tabulator already implements that approach. >> >> Remark: it's a refinement of EricP's proposal. The specifics still >> have to be worked on but I like the general idea about constraining >> the BGP. Another +1 for me, but I obviously prefer this one on >> EricP's. >> >> Joe Presbrey's PATCH >> -------------------- >> >> Champion: Joe Presbrey >> >> Summary: format is Turtle. For each triple { s p o }, { s p ANY } is >> deleted and { s p o } is added. His implementation forbids the use of >> blank nodes. >> >> Pros: >> * super easy >> >> Cons: >> * can't have { s p o1; o2 } anymore >> * no bnodes >> >> Status: >> * Joe implemented this approach in one of his projects (I guess >> either >> data.fm <http://data.fm> or ldpy) >> >> Remarks: it's too destructive and doesn't handle bnodes, so -1. >> >> >> >> In summary, it looks like people are ok to consider a subset of >> SPARQL. Also, I believe that we cannot ignore the bnodes out there >> and >> LDP PATCH must provide an acceptable solution for them. I personally >> found EricP's proposal easy to implement, so that's a clear >> candidate. Like others, I'm sensitive to the complexity issue and I >> believe that some additional constraint on the BGP should avoid the >> pitfall, so I'm interested in TimBL's idea from Tabulator. >> >> At that point, I believe that the most compelling proposal is EricP's >> with TimBL's constraint. If the group shows interest, I'd be >> interested in writing a first draft of the spec. >> >> What do you guys think? >> >> Alexandre. >> >> > >> > -- Sandro >> > >> > [1] http://www.w3.org/2001/sw/wiki/TurtlePatch >> > [2] http://afs.github.io/rdf-patch >> > [3] http://websub.org/wiki/GRUF (from Apr 2010) >> > >> > >> > >> > >> >> >> >
Received on Friday, 18 October 2013 18:54:45 UTC