- From: Andrei Sambra <andrei.sambra@gmail.com>
- Date: Sun, 27 Jul 2014 09:37:31 -0400
- To: Sandro Hawke <sandro@w3.org>
- Cc: Alexandre Bertails <alexandre@bertails.org>, ashok malhotra <ashok.malhotra@oracle.com>, "public-ldp-wg@w3.org" <public-ldp-wg@w3.org>
- Message-ID: <CAFG79eiwYV3S-mGjbHLA1Pd90i-Ea0K8LmqDiBcXuZ_kFV2DaA@mail.gmail.com>
On Sat, Jul 26, 2014 at 11:35 PM, Sandro Hawke <sandro@w3.org> wrote:
> On 07/26/2014 10:20 PM, Alexandre Bertails wrote:
>> On Sat, Jul 26, 2014 at 5:59 PM, Sandro Hawke <sandro@w3.org> wrote:
>>> On 07/26/2014 02:55 PM, Alexandre Bertails wrote:
>>>> On Sat, Jul 26, 2014 at 1:52 PM, Sandro Hawke <sandro@w3.org> wrote:
>>>>> On 07/26/2014 01:44 PM, Ashok Malhotra wrote:
>>>>>> Hi Sandro:
>>>>>> Thanks for the pointers. I read some of the mail and the conclusion
>>>>>> I came to seems a bit different from what you concluded. I did not
>>>>>> see a big push for SPARQL. Instead I found from
>>>>>> http://lists.w3.org/Archives/Public/public-rdf-shapes/2014Jul/0206.html:
>>>>>>
>>>>>> "The other possibilities, no matter what the outcome of the
>>>>>> workshop, *are* ready to be standardized and I rather suspect some
>>>>>> work on combining the best elements of each will get us much
>>>>>> further, must faster than trying to mature ShEx."
>>>>>>
>>>>>> So, this argues for leading with existing solutions, ICV and SPIN,
>>>>>> rather than with ShEx because the other solutions have some
>>>>>> implementation and experience behind them. Makes perfect sense.
>>>>>>
>>>>>> But the PATCH case seems to be different as AFAIK there are no
>>>>>> other existing solutions.
>>>>>>
>>>> We can always argue if they are suitable for the problem, but other
>>>> existing/potential solutions include: SPARQL Update in full, 2 subsets
>>>> of SPARQL Update, and RDF Patch + skolemization.
>>>>
>>>>> Isn't SPARQL UPDATE an existing solution for PATCH?
>>>>>
>>>>> It serves the basic purpose, although it has some drawbacks, like bad
>>>>> worst-case performance and being fairly hard to implement.
>>>>>
>>>>> Those same things, however, could quite reasonably be said about ICV
>>>>> and SPIN.
>>>>>
>>>> I don't know about ICV, SPIN or ShEx (ok, just a little bit, maybe).
>>>>
>>> To be clear, they are only relevant as another example of how inventing
>>> something which could be done by SPARQL (even if painfully) gets a lot
>>> of pushback.
>>>
>> Have you considered that the pushback _could_ be justified?
>>
>> For example, I really like SPARQL, for several reasons, but as I have
>> explained, I really think it is not appropriate as a PATCH format for
>> LDP.
>>
>>>> I just have two remarks:
>>>>
>>>> * SPARQL Update as a whole was developed for RDF databases, namely
>>>> quad stores, with expressive power from the rest of SPARQL. I don't
>>>> know if it was designed with use-cases as in RDF Validation, but I do
>>>> know it was not designed for the use-case of updating LDP-RS on the
>>>> LDP platform.
>>>> * building a technology on top of an existing one is something I tend
>>>> to prefer whenever it makes sense. But in our case, we are talking
>>>> about taking the subset of an existing language, while remaining
>>>> compatible with it. This is *not* as easy as it seems at first.
>>>>
>>>> I would prefer to hear about concrete proposals on how to do that. As
>>>> somebody who _cannot_ rely on an existing SPARQL implementation, and
>>>> who is not planning to implement one in full for that use-case, I
>>>> would like to see a concrete syntax written down, with a formal
>>>> semantics for it.
>>>>
>>> Okay, I'm going to make two concrete proposals.
>>>
>>> 1. Just use SPARQL 1.1 Update. The whole thing. I know it doesn't
>>> handle lists well. What else is wrong with it? Why can you not use it?
>>>
>> I became interested in LDP because it was the first time RDF was
>> becoming a first-class citizen of the Web, by that I mean applications
>> can interact (read/write) *directly* with RDF resources using HTTP,
>> without being behind an endpoint. That's what we meant by LDP being
>> the intersection of RDF and REST.
>>
>> The W3C has finally recognized a few years ago that native RDF was not
>> the only use-case for RDF applications. You can now have a relational
>> database (RDB2RDF), CSV files (RDF for Tabular Data), XML (GRDDL,
>> XSLT), etc. But not necessarily a triple/quad store. For example, at
>> the company I work for, we have several (ie. physically disconnected)
>> Datomic and Cassandra servers, and we are now exposing some of the
>> data behind LDP, with the objective of doing so for all of our data. In
>> all those cases, we want to expose and link our data on the Web, like
>> all those so-called RESTful APIs, but in a more consistent way, and
>> using RDF as the model and the exchange data format. Hence LDP, and
>> not yet-another-web-api.
>>
>> The reason I am telling you all that is that supporting SPARQL for
>> those virtual RDF datasets is not that easy (when possible) when you
>> don't have a quadstore as your backend. Reverse mapping for simple
>> SPARQL queries is hard. And SPARQL Update is even worse to support.
>> Basically, forcing SPARQL Update onto LDP facing applications for
>> simple resource updates on single LDP-RS (ie. PATCH) is like using a
>> hammer to kill a fly.
>>
>> So full SPARQL Update is simply a no-go for me. I just cannot support
>> it completely, as some features cannot correctly be mapped to Datomic
>> and Cassandra.
>>
> So this is the key. You want to be able to support PATCH on databases
> that are not materialized as either triples OR as SQL.
>
> If the database was SQL, then (as I understand it), SPARQL Update would
> be okay, because it can be mapped to SQL.
>
> But you don't know how to map SPARQL Update to NoSQL databases, or it's
> just too much work.
>
> I take it you do know how to map LD-Patch to Cassandra and Datomic?
>
> [ BTW, Datomic sounds awesome. Is it as fun to use as I'd imagine?
> ]
>
>> Also, if I was in a case where SPARQL Update was ok for me to use
>> (it's not), then I suspect that I wouldn't need LDP at all, and SPARQL
>> + Update + Graph Store protocol would just be enough. And there is
>> nothing preventing one from using SPARQL Update right now. Just don't
>> call it LD Patch.
>>
> It's not about what's called what, it's about what we promote as the
> PATCH format. If we had a simple enough PATCH format, then we could
> possibly make it a MUST to implement in the next version of LDP.

I think Alexandre makes a valid point. For a spec (LDP) that explicitly
tried to avoid SPARQL, using this format for PATCH makes absolutely no
sense to me.

> I don't think SPARQL Update is simple enough for that, but my prediction
> is that LD-Patch will turn out, sadly, to not be either.
>
>>> 2. Use SPARQL 1.1 Update with an extension to handle lists well.
>>> Specifically, it would be a slice function, usable in FILTER and
>>> especially in BIND. This seems like a no-brainer to include in SPARQL
>>> 1.2. I'd want to talk to a few of the SPARQL implementers and see if
>>> they're up for adding it. Maybe a full set of list functions like [1].
>>>
>> Sorry but I don't know RIF and your idea is still very vague for me. I
>> understand how you can provide new functions for matching nodes in an
>> rdf:list but I fail to see how this plays in a SPARQL Update query.
>>
>> Can you just provide some examples where you are doing the equivalent
>> of that python code (I know read python):
>>
> Probably not worthwhile to go into this now, given your veto on SPARQL.
>
>> [[
>> >>> l = [1,2,3,4,5,6,7,8,9,10]
>> >>> l[2:2] = [11,12]
>> >>> l[2:7] = [13,14]
>> >>> l[2:] = [15,16]
>> >>> l.append(17)
>> ]]
>>
>>> If we want a subset, we could define it purely by restricting the
>>> grammar -- eg leaving out the stuff that does federation, negation,
>>> aggregation, -- with no need to say anything about the semantics
>>> except they are the same as SPARQL. Until I hear what the problem is
>>> with SPARQL, though, I don't want to start excluding stuff.
>>>
>> Am I the only one thinking that "no need to say anything about the
>> semantics except they are the same as SPARQL" is just plain wrong?
>>
>> I mean, would we really tell implementers and users of the technology
>> that they have to go learn SPARQL before they can start understanding
>> what subset correctly applies to LD Patch? And how? And would they
>> still need to carry this ResultSet semantics over while a lot of us
>> would explicitly prefer avoiding it?
>>
> I think the users who are writing PATCHes by hand will be familiar with
> SPARQL. And if they are not, there are lots of other reasons to learn it.

Except that LDP explicitly made a point of avoiding SPARQL. Since the LDP
model is all about interacting with resources by using their individual
URIs, PATCH-ing resources through a SPARQL endpoint goes against the
core LDP beliefs.

-- Andrei

> Contrast that with LD-Patch, for which there is no other reason to
> learn it.
>
> You seem to think LD-Patch's syntax and semantics are easy. I don't
> think they are. Maybe if you expanded the path syntax onto many rows it
> would be more clear what it means.
>
> I can't help but regret again that we didn't choose to use TurtlePatch
> (which I first wrote on your wall, the week after the workshop - even if
> I didn't figure out how to handle bnodes until this year).
> https://www.w3.org/2001/sw/wiki/TurtlePatch
>
> -- Sandro
>
>> Alexandre
>>
>>> -- Sandro
>>>
>>> [1] http://www.w3.org/TR/rif-dtb/#Functions_and_Predicates_on_RIF_Lists
>>>
>>>> Alexandre
>>>>
>>>>> -- Sandro
>>>>>
>>>>>> All the best, Ashok
>>>>>>
>>>>>> On 7/26/2014 6:10 AM, Sandro Hawke wrote:
>>>>>>> On July 25, 2014 2:48:28 PM EDT, Alexandre Bertails
>>>>>>> <alexandre@bertails.org> wrote:
>>>>>>>> On Fri, Jul 25, 2014 at 11:51 AM, Ashok Malhotra
>>>>>>>> <ashok.malhotra@oracle.com> wrote:
>>>>>>>>> Alexandre:
>>>>>>>>> The W3C held an RDF Validation Workshop last year.
>>>>>>>>> One of the questions that immediately came up was
>>>>>>>>> "We can use SPARQL to validate RDF". The answer was
>>>>>>>>> that SPARQL was too complex and too hard to learn.
>>>>>>>>> So, we compromised and the workshop recommended
>>>>>>>>> that a new RDF validation language should be developed
>>>>>>>>> to cover the simple cases and SPARQL could be used when
>>>>>>>>> things got complex.
>>>>>>>>>
>>>>>>>>> It seems to me that you can make a similar argument
>>>>>>>>> for RDF Patch.
>>>>>>>>>
>>>>>>>> I totally agree with that.
>>>>>>>>
>>>>>>> Thanks for bringing this up, Ashok. I'm going to use the same
>>>>>>> situation to argue the opposite.
>>>>>>>
>>>>>>> It's relatively easy for a group of people, especially at a face to
>>>>>>> face meeting, to come to the conclusion SPARQL is too hard to learn
>>>>>>> and we should invent something else. But when we took it to the
>>>>>>> wider world, we got a reaction that's so strong it's hard not to
>>>>>>> characterize as violent.
>>>>>>>
>>>>>>> You might want to read:
>>>>>>>
>>>>>>> http://lists.w3.org/Archives/Public/public-rdf-shapes/2014Jul/thread.html
>>>>>>>
>>>>>>> Probably the most recent ones right now give a decent summary and
>>>>>>> you don't have to read them all.
>>>>>>>
>>>>>>> I have lots of theories to explain the disparity.
>>>>>>> Like: people who have freely chosen to join an expedition are
>>>>>>> naturally more inclined to go somewhere interesting.
>>>>>>>
>>>>>>> I'm not saying we can't invent something new, but be sure to
>>>>>>> understand the battle to get it standardized may be harder than
>>>>>>> just implementing SPARQL everywhere.
>>>>>>>
>>>>>>> - Sandro
>>>>>>>
>>>>>>>> Alexandre
>>>>>>>>
>>>>>>>>> All the best, Ashok
>>>>>>>>>
>>>>>>>>> On 7/25/2014 9:34 AM, Alexandre Bertails wrote:
>>>>>>>>>> On Fri, Jul 25, 2014 at 8:04 AM, John Arwe <johnarwe@us.ibm.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>> Another problem is the support for rdf:list. I have just
>>>>>>>>>>>> finished writing down the semantics for UpdateList and based
>>>>>>>>>>>> on that experience, I know this is something I want to rely
>>>>>>>>>>>> on as a user, because it is so easy to get it wrong, so I
>>>>>>>>>>>> want native support for it. And I don't think it is possible
>>>>>>>>>>>> to do something equivalent in SPARQL Update. That is a huge
>>>>>>>>>>>> drawback as list manipulation (eg. in JSON-LD, or Turtle) is
>>>>>>>>>>>> an everyday task.
>>>>>>>>>>>>
>>>>>>>>>>> Is semantics for UpdateList (that you wrote down) somewhere
>>>>>>>>>>> that WG members can look at it, and satisfy themselves that
>>>>>>>>>>> they agree with your conclusion?
>>>>>>>>>>>
>>>>>>>>>> You can find the semantics at [1]. Even if still written in
>>>>>>>>>> Scala for now, this is written in a (purely functional) style,
>>>>>>>>>> which is very close to the formalism that will be used for the
>>>>>>>>>> operational semantics in the spec. Also, note that this is the
>>>>>>>>>> most complex part of the entire semantics, all the rest being
>>>>>>>>>> pretty simple, even Paths.
>>>>>>>>>> And I spent a lot of time finding the general solution while
>>>>>>>>>> breaking it into simpler sub-parts.
>>>>>>>>>>
>>>>>>>>>> In a nutshell, you have 3 steps: first you move to the left
>>>>>>>>>> bound, then you gather triples to delete until the right bound,
>>>>>>>>>> and you finally insert the new triples in the middle. It's
>>>>>>>>>> really tricky because
>>>>>>>>>> 1. you want to minimize the number of operations, even if this
>>>>>>>>>> is only a spec
>>>>>>>>>> 2. unlike usual linked lists with pointers, you manipulate
>>>>>>>>>> triples, so the pointer in question is only the node in the
>>>>>>>>>> object position in the triple, and you need to remember and
>>>>>>>>>> carry the corresponding subject-predicate
>>>>>>>>>> 3. interesting (ie. weird) things can happen at the limits of
>>>>>>>>>> the list if you don't pay attention.
>>>>>>>>>>
>>>>>>>>>> [1] https://github.com/betehess/banana-rdf/blob/ldpatch/patch/src/main/scala/Semantics.scala#L62
>>>>>>>>>>
>>>>>>>>>>> I'm not steeped enough in the intricacies of SPARQL Update to
>>>>>>>>>>> have a horse in this race, but if this issue is the big-animal
>>>>>>>>>>> difference then people with the necessary understanding are
>>>>>>>>>>> going to want to see the details. The IBM products I'm aware
>>>>>>>>>>> of eschew rdf:List (and blank nodes generally, to first
>>>>>>>>>>> order), so I don't know how much this one alone would sway me.
>>>>>>>>>>>
>>>>>>>>>> You _could_ generate a SPARQL Update query that would do
>>>>>>>>>> something equivalent. But you'd have to match and remember the
>>>>>>>>>> intermediate nodes/triples.
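
The three steps described above (move to the left bound, gather the triples to delete up to the right bound, insert the new triples in the middle) can be sketched in Python over a plain set of triples. This is only an illustration under stated assumptions, not the Scala semantics linked at [1]: the names `FIRST`, `REST`, `NIL`, `new_bnode`, `build_cells`, `obj`, `read_list` and `update_list` are hypothetical, blank nodes are modelled as fresh strings, the list is assumed well-formed, and negative indices are not handled. The demo at the end replays the Python slice operations quoted earlier in the thread.

```python
import itertools

# Hypothetical stand-ins for rdf:first, rdf:rest and rdf:nil.
FIRST, REST, NIL = "rdf:first", "rdf:rest", "rdf:nil"
_fresh = itertools.count()


def new_bnode():
    """Return a fresh blank-node label (modelled as a plain string)."""
    return f"_:b{next(_fresh)}"


def build_cells(items, tail=NIL):
    """Return (head, triples) for a chain of list cells that ends at `tail`."""
    triples, head = set(), tail
    for item in reversed(items):
        cell = new_bnode()
        triples.add((cell, FIRST, item))
        triples.add((cell, REST, head))
        head = cell
    return head, triples


def obj(graph, s, p):
    """Return the single object of (s, p, ?); raises if there is not exactly one."""
    (o,) = [o for (s2, p2, o) in graph if s2 == s and p2 == p]
    return o


def read_list(graph, s, p):
    """Read the list hanging off (s, p, ?) back into a Python list."""
    items, node = [], obj(graph, s, p)
    while node != NIL:
        items.append(obj(graph, node, FIRST))
        node = obj(graph, node, REST)
    return items


def update_list(graph, s, p, start, stop, new_items):
    """Replace elements [start, stop) of the list at (s, p, ?).

    stop=None means "to the end of the list". Returns a new set of triples.
    """
    # Step 1: move to the left bound, carrying the subject-predicate of the
    # triple whose object is the current cell (the "pointer" to it).
    ps, pp = s, p
    node = obj(graph, ps, pp)
    for _ in range(start):
        if node == NIL:
            break
        ps, pp = node, REST
        node = obj(graph, ps, pp)
    # Step 2: gather the triples to delete until the right bound.
    to_delete = {(ps, pp, node)}
    i = start
    while node != NIL and (stop is None or i < stop):
        nxt = obj(graph, node, REST)
        to_delete |= {(node, FIRST, obj(graph, node, FIRST)), (node, REST, nxt)}
        node, i = nxt, i + 1
    # Step 3: insert the new cells between the left bound and what remains.
    head, to_insert = build_cells(new_items, tail=node)
    to_insert.add((ps, pp, head))
    return (graph - to_delete) | to_insert


# Demo: the same edits as the Python list example quoted in this thread.
head, g = build_cells([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
g = g | {("ex:s", "ex:p", head)}
g = update_list(g, "ex:s", "ex:p", 2, 2, [11, 12])     # l[2:2] = [11,12]
g = update_list(g, "ex:s", "ex:p", 2, 7, [13, 14])     # l[2:7] = [13,14]
g = update_list(g, "ex:s", "ex:p", 2, None, [15, 16])  # l[2:] = [15,16]
end = len(read_list(g, "ex:s", "ex:p"))
g = update_list(g, "ex:s", "ex:p", end, None, [17])    # l.append(17)
result = read_list(g, "ex:s", "ex:p")
```

The sketch makes point 2 concrete: the "pointer" to a cell is the `(ps, pp)` pair of the triple that has the cell as its object, and that triple has to be deleted and re-inserted whenever the cell it points at changes.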
>>>>>>>>>>
>>>>>>>>>> JSON-LD users manipulate lists on a day-to-day basis. Without
>>>>>>>>>> native support for rdf:list in LD Patch, I would turn to JSON
>>>>>>>>>> PATCH to manipulate those lists.
>>>>>>>>>>
>>>>>>>>>>> It sounds like the other big-animal difference in your email is
>>>>>>>>>>>
>>>>>>>>>>>> we would have to refine the SPARQL semantics so that the
>>>>>>>>>>>> order of the clauses matters (ie. no need to depend on a
>>>>>>>>>>>> query optimiser). And we
>>>>>>>>>>>
>>>>>>>>>>> That sounds like a more general problem. It might mean, in
>>>>>>>>>>> effect, that no one would be able to use existing
>>>>>>>>>>> off-the-shelf componentry (specs & code ... is that the
>>>>>>>>>>> implication, Those Who Know S-U?) and that might well be a
>>>>>>>>>>> solid answer to "why not [use S-U]?"
>>>>>>>>>>>
>>>>>>>>>> The fact that reordering the clauses doesn't change the
>>>>>>>>>> semantics is a feature of SPARQL. It means that queries can be
>>>>>>>>>> rearranged for optimisation purposes. But you never know if the
>>>>>>>>>> execution plan will be the best one, and you can end up with
>>>>>>>>>> huge intermediate result sets.
>>>>>>>>>>
>>>>>>>>>> In any case, if we ever go down the SPARQL Update way, I will
>>>>>>>>>> ask that we specify that clauses are executed in order, or
>>>>>>>>>> something like that. And I will ask for a semantics that
>>>>>>>>>> doesn't rely on result sets if possible.
>>>>>>>>>>
>>>>>>>>>>> Were there any other big-animal issues you found, those two
>>>>>>>>>>> aside?
>>>>>>>>>>>
>>>>>>>>>> A big issue for me will be to correctly explain the subset of
>>>>>>>>>> SPARQL we would be considering, and its limitations compared to
>>>>>>>>>> its big brother.
>>>>>>>>>>
>>>>>>>>>> Also, if you don't implement it from scratch and want to rely
>>>>>>>>>> on an existing implementation, you would still have to reject
>>>>>>>>>> all the correct SPARQL queries, and that can be tricky too,
>>>>>>>>>> because you have to inspect the query after it is parsed. Oh,
>>>>>>>>>> and I will make sure there are tests rejecting such queries :-)
>>>>>>>>>>
>>>>>>>>>> Alexandre
>>>>>>>>>>
>>>>>>>>>>> Best Regards, John
>>>>>>>>>>>
>>>>>>>>>>> Voice US 845-435-9470 BluePages
>>>>>>>>>>> Cloud and Smarter Infrastructure OSLC Lead
Received on Sunday, 27 July 2014 13:38:21 UTC