Re: SPARQL subset as a PATCH format for LDP from Sandro Hawke on 2014-07-27 (public-ldp-wg@w3.org from July 2014)

From: Sandro Hawke <sandro@w3.org>
Date: Sun, 27 Jul 2014 10:58:21 -0400
To: Andrei Sambra <andrei.sambra@gmail.com>
CC: Alexandre Bertails <alexandre@bertails.org>, ashok malhotra <ashok.malhotra@oracle.com>, "public-ldp-wg@w3.org" <public-ldp-wg@w3.org>
Message-ID: <53D5138D.5020004@w3.org>
On 07/27/2014 09:37 AM, Andrei Sambra wrote:
>
>
>
> On Sat, Jul 26, 2014 at 11:35 PM, Sandro Hawke <sandro@w3.org> wrote:
>
>     On 07/26/2014 10:20 PM, Alexandre Bertails wrote:
>
>         On Sat, Jul 26, 2014 at 5:59 PM, Sandro Hawke <sandro@w3.org>
>         wrote:
>
>             On 07/26/2014 02:55 PM, Alexandre Bertails wrote:
>
>                 On Sat, Jul 26, 2014 at 1:52 PM, Sandro Hawke
>                 <sandro@w3.org> wrote:
>
>                     On 07/26/2014 01:44 PM, Ashok Malhotra wrote:
>
>                         Hi Sandro:
>                         Thanks for the pointers.  I read some of the
>                         mail and the conclusion I
>                         came
>                         to seems a bit different from what you
>                         concluded.  I did not see a big
>                         push for
>                         SPARQL.  Instead I found from
>                         http://lists.w3.org/Archives/Public/public-rdf-shapes/2014Jul/0206.html:
>
>                         "The other possibilities, no matter what the
>                         outcome of the workshop,
>                         *are*
>                         ready to be standardized and I rather suspect
>                         some work on combining the
>                         best elements of each will get us much
>                         further, much faster than trying
>                         to
>                         mature ShEx."
>
>                         So, this argues for leading with existing
>                         solutions, ICV and SPIN,
>                         rather
>                         than
>                         with ShEx because the other solutions have some
>                         implementation and
>                         experience
>                         behind them.  Makes perfect sense.
>
>                         But the PATCH case seems to be different as
>                         AFAIK there are no other
>                         existing
>                         solutions.
>
>                 We can always argue if they are suitable for the
>                 problem, but other
>                 existing/potential solutions include: SPARQL Update in
>                 full, 2 subsets
>                 of SPARQL Update, and RDF Patch + skolemization.
>
>                     Isn't SPARQL UPDATE an existing solution for PATCH?
>
>                     It serves the basic purpose, although it has some
>                     drawbacks, like bad
>                     worst-case performance and being fairly hard to
>                     implement.
>
>                     Those same things, however, could quite reasonably
>                     be said about ICV and
>                     SPIN.
>
>                 I don't know about ICV, SPIN or ShEx (ok, just a
>                 little bit, maybe).
>
>
>             To be clear, they are only relevant as another example of
>             how inventing
>             something which could be done by SPARQL (even if
>             painfully) gets a lot of
>             pushback.
>
>         Have you considered that the pushback _could_ be justified?
>
>         For example, I really like SPARQL, for several reasons, but as
>         I have
>         explained, I really think it is not appropriate as a PATCH
>         format for
>         LDP.
>
>
>                  I just have two remarks:
>
>                 * SPARQL Update as a whole was developed for RDF
>                 databases, namely
>                 quad stores, with expressive power from the rest of
>                 SPARQL. I don't
>                 know if it was designed with use-cases as in RDF
>                 Validation, but I do
>                 know it was not designed for the use-case of updating
>                 LDP-RS on the
>                 LDP platform.
>                 * building a technology on top of an existing one is
>                 something I tend
>                 to prefer whenever it makes sense. But in our case, we
>                 are talking
>                 about taking the subset of an existing language, while
>                 remaining
>                 compatible with it. This is *not* as easy as it seems
>                 at first.
>
>                 I would prefer to hear about concrete proposals on how
>                 to do that. As
>                 somebody who _cannot_ rely on an existing SPARQL
>                 implementation, and
>                 who is not planning to implement one in full for that
>                 use-case, I
>                 would like to see a concrete syntax written down, with
>                 a formal
>                 semantics for it.
>
>
>             Okay, I'm going to make two concrete proposals.
>
>             1.  Just use SPARQL 1.1 Update.   The whole thing.   I
>             know it doesn't
>             handle lists well.  What else is wrong with it?  Why can
>             you not use it?
>
>         I became interested in LDP because it was the first time RDF was
>         becoming a first-class citizen of the Web, by that I mean
>         applications
>         can interact (read/write) *directly* with RDF resources using
>         HTTP,
>         without being behind an endpoint. That's what we meant by LDP
>         being
>         the intersection of RDF and REST.
>
>         The W3C has finally recognized a few years ago that native RDF
>         was not
>         the only use-case for RDF applications. You can now have a
>         relational
>         database (RDB2RDF), CSV files (RDF for Tabular Data), XML (GRDDL,
>         XSLT), etc. But not necessarily a triple/quad store. For
>         example, at
>         the company I work for, we have several (ie. physically
>         disconnected)
>         Datomic and Cassandra servers, and we are now exposing some of the
>         data behind LDP, with the objective of doing so for all of our
>         data. In
>         all those cases, we want to expose and link our data on the
>         Web, like
>         all those so-called RESTful APIs, but in a more consistent
>         way, and
>         using RDF as the model and the exchange data format. Hence
>         LDP, and
>         not yet-another-web-api.
>
>         The reason I am telling you all that is that supporting SPARQL for
>         those virtual RDF datasets is not that easy (when possible)
>         when you
>         don't have a quadstore as your backend. Reverse mapping for simple
>         SPARQL queries is hard. And SPARQL Update is even worse to
>         support.
>         Basically, forcing SPARQL Update onto LDP facing applications for
>         simple resource updates on single LDP-RS (ie. PATCH) is like
>         using a
>         hammer to kill a fly.
>
>         So full SPARQL Update is simply a no-go for me. I just cannot
>         support
>         it completely, as some features cannot correctly be mapped to
>         Datomic
>         and Cassandra.
>
>
>     So this is the key.   You want to be able to support PATCH on
>     databases that are not materialized as either triples OR as SQL.
>
>     If the database was SQL, then (as I understand it), SPARQL Update
>     would be okay, because it can be mapped to SQL.
>
>     But you don't know how to map SPARQL Update to NoSQL databases, or
>     it's just too much work.
>
>     I take it you do know how to map LD-Patch to Cassandra and Datomic?
>
>     [ BTW, Datomic sounds awesome.  Is it as fun to use as I'd imagine? ]
>
>
>
>
>         Also, if I was in a case where SPARQL Update was ok for me to use
>         (it's not), then I suspect that I wouldn't need LDP at all,
>         and SPARQL
>         + Update + Graph Store protocol would just be enough. And there is
>         nothing preventing one from using SPARQL Update right now.
>         Just don't
>         call it LD Patch.
>
>
>     It's not about what's called what, it's about what we promote as
>     the PATCH format.   If we had a simple enough PATCH format,
>     then we could possibly make it a MUST to implement in the next
>     version of LDP.
>
>
> I think Alexandre makes a valid point. For a spec (LDP) that 
> explicitly tried to avoid SPARQL, using this format for PATCH makes 
> absolutely no sense to me.
>
>
>     I don't think SPARQL Update is simple enough for that, but my
>     prediction is that LD-Patch will turn out, sadly, to not be either.
>
>
>
>             2.  Use SPARQL 1.1 Update with an extension to handle
>             lists well.
>             Specifically, it would be a slice function, usable in
>             FILTER and especially
>             in BIND.   This seems like a no-brainer to include in
>             SPARQL 1.2.  I'd want
>             to talk to a few of the SPARQL implementers and see if
>             they're up for adding
>             it.    Maybe a full set of list functions like [1].
>
>         Sorry but I don't know RIF and your idea is still very vague
>         for me. I
>         understand how you can provide new functions for matching
>         nodes in an
>         rdf:list but I fail to see how this plays in a SPARQL Update
>         query.
>
>         Can you just provide some examples where you are doing the
>         equivalent
>         of that Python code (I know you read Python):
>
>
>     Probably not worthwhile to go into this now, given your veto on
>     SPARQL.
>
>
>
>         [[
>
>                     l = [1,2,3,4,5,6,7,8,9,10]
>                     l[2:2] = [11,12]
>                     l[2:7] = [13,14]
>                     l[2:] = [15,16]
>                     l.append(17)
>
>         ]]
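For reference, a minimal sketch of what the Python snippet above evaluates to, step by step. The function name `slice_states` is only illustrative and does not come from the thread; the list operations themselves are exactly the ones quoted.

```python
def slice_states():
    """Run the list operations from the snippet and record each intermediate list."""
    l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    states = []

    l[2:2] = [11, 12]   # empty slice: pure insertion before index 2
    states.append(list(l))

    l[2:7] = [13, 14]   # replace the five elements at indices 2..6 with two
    states.append(list(l))

    l[2:] = [15, 16]    # replace everything from index 2 to the end
    states.append(list(l))

    l.append(17)        # add one element at the end
    states.append(list(l))

    return states
```

The four recorded states are [1, 2, 11, 12, 3, 4, 5, 6, 7, 8, 9, 10], then [1, 2, 13, 14, 6, 7, 8, 9, 10], then [1, 2, 15, 16], and finally [1, 2, 15, 16, 17]. Any list-update mechanism for rdf:list would need to reproduce these four cases: insertion, replacement, truncating replacement, and append.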
>
>             If we want a subset, we could define it purely by
>             restricting the grammar --
>             eg leaving out the stuff that does federation, negation,
>             aggregation, --
>             with no need to say anything about the semantics except
>             they are the same as
>             SPARQL.   Until I hear what the problem is with SPARQL,
>             though, I don't want
>             to start excluding stuff.
>
>         Am I the only one thinking that "no need to say anything about the
>         semantics except they are the same as SPARQL" is just plain wrong?
>
>         I mean, would we really tell implementers and users of the
>         technology
>         that they have to go learn SPARQL before they can start
>         understanding
>         what subset correctly applies to LD Patch? And how? And would
>         they still
>         need to carry this ResultSet semantics over while a lot of us
>         would
>         explicitly prefer avoiding it?
>
>
>     I think the users who are writing PATCHes by hand will be familiar
>     with SPARQL.  And if they are not, there are lots of other reasons
>     to learn it.
>
>
> Except that LDP explicitly made a point to avoid SPARQL. Since the LDP 
> model is all about interacting with resources by using their 
> individual URIs, PATCH-ing resources through a SPARQL endpoint goes 
> against the core LDP beliefs.
>

Note that no one is proposing using a SPARQL endpoint -- just that 
SPARQL Update be used as an HTTP PATCH format.  (This is an idea that
the SPARQL WG also suggested for GSP.)

       -- Sandro

> -- Andrei
>
>
>     Contrast that with LD-Patch, for which there is no other reason to
>     learn it.
>
>     You seem to think LD-Patch's syntax and semantics are easy.   I
>     don't think they are.   Maybe if you expanded the path syntax into
>     many rows it would be more clear what it means.
>
>     I can't help but regret again that we didn't choose to use
>     TurtlePatch (which I first wrote on your wall, the week after the
>     workshop - even if I didn't figure out how to handle bnodes until
>     this year). https://www.w3.org/2001/sw/wiki/TurtlePatch
>
>            -- Sandro
>
>
>
>         Alexandre
>
>                  -- Sandro
>
>
>             [1]
>             http://www.w3.org/TR/rif-dtb/#Functions_and_Predicates_on_RIF_Lists
>
>
>
>                 Alexandre
>
>                           -- Sandro
>
>
>                         All the best, Ashok
>                         On 7/26/2014 6:10 AM, Sandro Hawke wrote:
>
>                             On July 25, 2014 2:48:28 PM EDT, Alexandre
>                             Bertails
>                             <alexandre@bertails.org> wrote:
>
>                                 On Fri, Jul 25, 2014 at 11:51 AM,
>                                 Ashok Malhotra
>                                 <ashok.malhotra@oracle.com> wrote:
>
>                                     Alexandre:
>                                     The W3C held a RDF Validation
>                                     Workshop last year.
>                                     One of the questions that
>                                     immediately came up was
>                                     "We can use SPARQL to validate
>                                     RDF".  The answer was
>                                     that SPARQL was too complex and too
>                                     hard to learn.
>                                     So, we compromised and the
>                                     workshop recommended
>                                     that a new RDF validation language
>                                     should be developed
>                                     to cover the simple cases and
>                                     SPARQL could be used when
>                                     things got complex.
>
>                                     It seems to me that you can make a
>                                     similar argument
>                                     for RDF Patch.
>
>                                 I totally agree with that.
>
>                             Thanks for bringing this up, Ashok.    I'm
>                             going to use the same
>                             situation to argue the opposite.
>
>                             It's relatively easy for a group of
>                             people, especially at a face to
>                             face
>                             meeting, to come to the conclusion SPARQL
>                             is too hard to learn and we
>                             should invent something else.    But when
>                             we took it to the wider
>                             world, we
>                             got a reaction that's so strong it's hard
>                             not to characterize as
>                             violent.
>
>                             You might want to read:
>
>
>                             http://lists.w3.org/Archives/Public/public-rdf-shapes/2014Jul/thread.html
>
>                             Probably the most recent ones right now
>                             give a decent summary and you
>                             don't have to read them all.
>
>                             I have lots of theories to explain the
>                             disparity.   Like: people who
>                             have
>                             freely chosen to join an expedition are
>                             naturally more inclined to go
>                             somewhere interesting.
>
>                             I'm not saying we can't invent something
>                             new, but be sure to understand
>                             the battle to get it standardized may be
>                             harder than just implementing
>                             SPARQL everywhere.
>
>                                     - Sandro
>
>                                 Alexandre
>
>                                     All the best, Ashok
>
>
>                                     On 7/25/2014 9:34 AM, Alexandre
>                                     Bertails wrote:
>
>                                         On Fri, Jul 25, 2014 at 8:04
>                                         AM, John Arwe
>                                         <johnarwe@us.ibm.com> wrote:
>
>                                                 Another problem is the
>                                                 support for rdf:list.
>                                                 I have just finished
>                                                 writing down the
>                                                 semantics for
>                                                 UpdateList and based
>                                                 on that
>                                                 experience, I know
>                                                 this is something I
>                                                 want to rely on as a user,
>                                                 because it is so easy
>                                                 to get it wrong, so I
>                                                 want native support for
>                                                 it. And I don't think
>                                                 it is possible to do
>                                                 something equivalent in
>                                                 SPARQL Update. That is
>                                                 a huge drawback as
>                                                 list manipulation (eg. in
>                                                 JSON-LD, or Turtle) is
>                                                 an everyday task.
>
>                                             Is semantics for
>                                             UpdateList  (that you
>                                             wrote down) somewhere that WG
>                                             members
>                                             can look at it, and
>                                             satisfy themselves that
>                                             they agree with your
>                                             conclusion?
>
>                                         You can find the semantics at
>                                         [1]. Even if still written in
>                                         Scala for
>                                         now, this is written in a
>                                         (purely functional) style,
>                                         which is very
>                                         close to the formalism that
>                                         will be used for the operational
>                                         semantics
>                                         in the spec. Also, note that
>                                         this is the most complex part
>                                         of the
>                                         entire semantics, all the rest
>                                         being pretty simple, even
>                                         Paths. And I
>                                         spent a lot of time finding
>                                         the general solution while
>                                         breaking it in
>                                         simpler sub-parts.
>
>                                         In a nutshell, you have 3
>                                         steps: first you move to the
>                                         left bound,
>                                         then you gather triples to
>                                         delete until the right bound,
>                                         and you
>                                         finally insert the new triples
>                                         in the middle. It's really tricky
>                                         because 1. you want to
>                                         minimize the number of
>                                         operations, even if this
>                                         is only a spec 2. unlike usual
>                                         linked lists with pointers, you
>                                         manipulate triples, so the
>                                         pointer in question is only
>                                         the node in the
>                                         object position in the triple,
>                                         and you need to remember and
>                                         carry the corresponding
>                                         subject-predicate 3.
>                                         interesting (ie. weird) things
>                                         can happen at the limits of the
>                                         list if you don't pay attention.
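The three steps described above (move to the left bound, gather the triples to delete up to the right bound, insert the new triples in the middle) can be sketched roughly as follows. This is an illustration under stated assumptions, not the Scala semantics referenced in [1]: the graph is modelled as a plain Python set of (subject, predicate, object) tuples, and the helper names `obj`, `build_list`, `update_list` and `to_list` are all hypothetical. Negative indices are not handled.

```python
import itertools

RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"
_fresh = itertools.count()  # source of fresh blank node labels (illustrative)


def obj(graph, s, p):
    """Return the single object o such that (s, p, o) is in the graph."""
    (o,) = [o for (s2, p2, o) in graph if s2 == s and p2 == p]
    return o


def build_list(graph, items, tail=RDF_NIL):
    """Add list cells for `items` in front of `tail`; return the new head node."""
    head = tail
    for item in reversed(items):
        cell = f"_:b{next(_fresh)}"
        graph.add((cell, RDF_FIRST, item))
        graph.add((cell, RDF_REST, head))
        head = cell
    return head


def update_list(graph, s, p, start, end, new_items):
    """Replace slice [start:end] of the rdf:list at (s, p, ?head); end=None means 'to the end'."""
    # Step 1: move to the left bound, carrying the subject and predicate
    # of the triple whose object is the current node.
    ps, pp = s, p
    node = obj(graph, ps, pp)
    for _ in range(start):
        if node == RDF_NIL:  # start is past the end: behave like an append
            break
        ps, pp = node, RDF_REST
        node = obj(graph, ps, pp)
    left = node

    # Step 2: gather the triples to delete until the right bound.
    to_delete = set()
    i = start
    while node != RDF_NIL and (end is None or i < end):
        nxt = obj(graph, node, RDF_REST)
        to_delete.add((node, RDF_FIRST, obj(graph, node, RDF_FIRST)))
        to_delete.add((node, RDF_REST, nxt))
        node = nxt
        i += 1

    # Step 3: drop the old cells and splice the new ones in the middle.
    graph.difference_update(to_delete)
    graph.discard((ps, pp, left))
    graph.add((ps, pp, build_list(graph, new_items, tail=node)))


def to_list(graph, s, p):
    """Read the rdf:list at (s, p, ?head) back into a Python list."""
    out, node = [], obj(graph, s, p)
    while node != RDF_NIL:
        out.append(obj(graph, node, RDF_FIRST))
        node = obj(graph, node, RDF_REST)
    return out
```

For example, with a ten-element list attached at ("ex:s", "ex:p"), calling `update_list(g, "ex:s", "ex:p", 2, 7, [13, 14])` would correspond to `l[2:7] = [13, 14]`. Note how the sketch has to carry `(ps, pp)` along the walk: that is the "remember the corresponding subject-predicate" point, since the only "pointer" to a cell is the object position of some other triple.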
>
>                                         [1]
>                                         https://github.com/betehess/banana-rdf/blob/ldpatch/patch/src/main/scala/Semantics.scala#L62
>
>                                             I'm not steeped enough in
>                                             the intricacies of SPARQL
>                                             Update to have a
>                                             horse
>                                             in this race, but if this
>                                             issue is the big-animal
>                                             difference then people
>                                             with the necessary
>                                             understanding are going to
>                                             want to see the details.
>                                             The
>                                             IBM products I'm aware of
>                                             eschew rdf:List (and blank
>                                             nodes generally, to
>                                             first order), so I don't
>                                             know how much this one
>                                             alone would sway me.
>
>                                         You _could_ generate a SPARQL
>                                         Update query that would do
>                                         something
>                                         equivalent. But you'd have to
>                                         match and remember the
>                                         intermediate
>                                         nodes/triples.
>
>                                         JSON-LD users manipulate lists
>                                         on a day-to-day basis. Without
>                                         native
>                                         support for rdf:list in LD
>                                         Patch, I would turn to JSON
>                                         PATCH to
>                                         manipulate those lists.
>
>                                             It sounds like the other
>                                             big-animal difference in
>                                             your email is
>
>                                                 we would have to
>                                                 refine the SPARQL
>                                                 semantics so that the
>                                                 order of the
>                                                 clauses matters (ie.
>                                                 no need to depend on a
>                                                 query optimiser). And we
>
>                                             That sounds like a more
>                                             general problem.  It might
>                                             mean, in effect, that no
>                                             one would be able to use
>                                             existing off-the-shelf
>                                             componentry (specs & code
>                                             ... is that the
>                                             implication, Those Who
>                                             Know S-U?) and that might
>                                             well be a
>                                             solid answer to "why not
>                                             [use S-U]?"
>
>                                         The fact that reordering the
>                                         clauses doesn't change the
>                                         semantics is a
>                                         feature of SPARQL. It means
>                                         that queries can be rearranged for
>                                         optimisation purposes. But you
>                                         never know if the execution
>                                         plan will
>                                         be the best one, and you can
>                                         end up with huge intermediate
>                                         result
>                                         sets.
>
>                                         In any case, if we ever go
>                                         down the SPARQL Update way, I
>                                         will ask that
>                                         we specify that clauses are
>                                         executed in order, or
>                                         something like that.
>
>                                         And I will ask for a semantics
>                                         that doesn't rely on result
>                                         sets if
>                                         possible.
>
>                                             Were there any other
>                                             big-animal issues you
>                                             found, those two aside?
>
>                                         A big issue for me will be to
>                                         correctly explain the subset
>                                         of SPARQL
>                                         we would be considering, and
>                                         its limitations compared to
>                                         its big
>                                         brother.
>
>                                         Also, if you don't implement
>                                         it from scratch and want to
>                                         rely on an
>                                         existing implementation, you
>                                         would still have to reject all the
>                                         correct SPARQL queries that fall
>                                         outside the subset, and
>                                         that can be tricky too,
>                                         because you have
>                                         to inspect the query after it
>                                         is parsed. Oh, and I will make
>                                         sure
>                                         there are tests rejecting such
>                                         queries :-)
>
>                                         Alexandre
>
>                                             Best Regards, John
>
>                                             Voice US 845-435-9470  BluePages
>                                             Cloud and Smarter
>                                             Infrastructure OSLC Lead
>
>
>
>
>
Received on Sunday, 27 July 2014 14:58:32 UTC