Re: SPARQL subset as a PATCH format for LDP

On Sat, Jul 26, 2014 at 1:52 PM, Sandro Hawke <sandro@w3.org> wrote:
> On 07/26/2014 01:44 PM, Ashok Malhotra wrote:
>>
>> Hi Sandro:
>> Thanks for the pointers.  I read some of the mail and the conclusion I
>> came
>> to seems a bit different from what you concluded.  I did not see a big
>> push for
>> SPARQL.  Instead I found from
>> http://lists.w3.org/Archives/Public/public-rdf-shapes/2014Jul/0206.html:
>>
>> "The other possibilities, no matter what the outcome of the workshop,
>> *are*
>> ready to be standardized and I rather suspect some work on combining the
>> best elements of each will get us much further, much faster than trying to
>> mature ShEx."
>>
>> So, this argues for leading with existing solutions, ICV and SPIN,
>> rather than with ShEx, because the other solutions have some
>> implementation and experience behind them.  Makes perfect sense.
>>
>> But the PATCH case seems to be different as AFAIK there are no other
>> existing
>> solutions.

We can always argue about whether they are suitable for the problem, but
other existing or potential solutions include: SPARQL Update in full, two
subsets of SPARQL Update, and RDF Patch + skolemization.

>>
>
> Isn't SPARQL UPDATE an existing solution for PATCH?
>
> It serves the basic purpose, although it has some drawbacks, like bad
> worst-case performance and being fairly hard to implement.
>
> Those same things, however, could quite reasonably be said about ICV and
> SPIN.

I don't know about ICV, SPIN or ShEx (ok, just a little bit, maybe). I
just have two remarks:

* SPARQL Update as a whole was developed for RDF databases, namely
quad stores, with the expressive power of the rest of SPARQL. I don't
know if it was designed with use cases like RDF Validation in mind, but
I do know it was not designed for the use case of updating an LDP-RS on
the LDP platform.
* Building a technology on top of an existing one is something I tend
to prefer whenever it makes sense. But in our case, we are talking
about taking a subset of an existing language while remaining
compatible with it. This is *not* as easy as it seems at first.

I would prefer to hear concrete proposals on how to do that. As
somebody who _cannot_ rely on an existing SPARQL implementation, and
who is not planning to implement one in full for that use case, I
would like to see a concrete syntax written down, with a formal
semantics for it.

Alexandre

>
>       -- Sandro
>
>
>>
>> All the best, Ashok
>> On 7/26/2014 6:10 AM, Sandro Hawke wrote:
>>>
>>> On July 25, 2014 2:48:28 PM EDT, Alexandre Bertails
>>> <alexandre@bertails.org> wrote:
>>>>
>>>> On Fri, Jul 25, 2014 at 11:51 AM, Ashok Malhotra
>>>> <ashok.malhotra@oracle.com> wrote:
>>>>>
>>>>> Alexandre:
>>>>> The W3C held an RDF Validation Workshop last year.
>>>>> One of the questions that immediately came up was
>>>>> "We can use SPARQL to validate RDF".  The answer was
>>>>> that SPARQL was too complex and too hard to learn.
>>>>> So, we compromised and the workshop recommended
>>>>> that a new RDF validation language should be developed
>>>>> to cover the simple cases and SPARQL could be used when
>>>>> things got complex.
>>>>>
>>>>> It seems to me that you can make a similar argument
>>>>> for RDF Patch.
>>>>
>>>> I totally agree with that.
>>>>
>>> Thanks for bringing this up, Ashok.    I'm going to use the same
>>> situation to argue the opposite.
>>>
>>> It's relatively easy for a group of people, especially at a face-to-face
>>> meeting, to come to the conclusion that SPARQL is too hard to learn and
>>> we should invent something else.    But when we took it to the wider
>>> world, we got a reaction so strong it's hard not to characterize it as
>>> violent.
>>>
>>> You might want to read:
>>>
>>> http://lists.w3.org/Archives/Public/public-rdf-shapes/2014Jul/thread.html
>>>
>>> Probably the most recent ones right now give a decent summary and you
>>> don't have to read them all.
>>>
>>> I have lots of theories to explain the disparity.   Like: people who have
>>> freely chosen to join an expedition are naturally more inclined to go
>>> somewhere interesting.
>>>
>>> I'm not saying we can't invent something new, but be sure to understand
>>> that the battle to get it standardized may be harder than just
>>> implementing SPARQL everywhere.
>>>
>>>       - Sandro
>>>
>>>> Alexandre
>>>>
>>>>> All the best, Ashok
>>>>>
>>>>>
>>>>> On 7/25/2014 9:34 AM, Alexandre Bertails wrote:
>>>>>>
>>>>>> On Fri, Jul 25, 2014 at 8:04 AM, John Arwe <johnarwe@us.ibm.com>
>>>>
>>>> wrote:
>>>>>>>>
>>>>>>>> Another problem is the support for rdf:list. I have just finished
>>>>>>>> writing down the semantics for UpdateList and based on that
>>>>>>>> experience, I know this is something I want to rely on as a user,
>>>>>>>> because it is so easy to get it wrong, so I want native support for
>>>>>>>> it. And I don't think it is possible to do something equivalent in
>>>>>>>> SPARQL Update. That is a huge drawback as list manipulation (eg. in
>>>>>>>> JSON-LD, or Turtle) is an everyday task.
>>>>>>>
>>>>>>> Is the semantics for UpdateList (that you wrote down) somewhere
>>>>>>> that WG members can look at it, and satisfy themselves that they
>>>>>>> agree with your conclusion?
>>>>>>
>>>>>> You can find the semantics at [1]. Even if it is still written in
>>>>>> Scala for now, it is written in a (purely functional) style which is
>>>>>> very close to the formalism that will be used for the operational
>>>>>> semantics in the spec. Also, note that this is the most complex part
>>>>>> of the entire semantics, all the rest being pretty simple, even
>>>>>> Paths. And I spent a lot of time finding the general solution while
>>>>>> breaking it into simpler sub-parts.
>>>>>>
>>>>>> In a nutshell, you have 3 steps: first you move to the left bound,
>>>>>> then you gather the triples to delete until the right bound, and you
>>>>>> finally insert the new triples in the middle. It's really tricky
>>>>>> because 1. you want to minimize the number of operations, even if
>>>>>> this is only a spec, 2. unlike usual linked lists with pointers, you
>>>>>> manipulate triples, so the pointer in question is only the node in
>>>>>> the object position of the triple, and you need to remember and
>>>>>> carry the corresponding subject-predicate, and 3. interesting (ie.
>>>>>> weird) things can happen at the limits of the list if you don't pay
>>>>>> attention.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/betehess/banana-rdf/blob/ldpatch/patch/src/main/scala/Semantics.scala#L62
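To make the three steps concrete, here is a toy Python sketch (a set of triples stands in for the graph; all names are illustrative and this is not the Scala code at [1]):

```python
# Toy model of the three UpdateList steps (illustrative only): an
# rdf:List is a chain of cons cells, each cell carrying one
# (cell, rdf:first, value) and one (cell, rdf:rest, next) triple.
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

def obj(triples, s, p):
    """Return the single object o of the triple (s, p, o)."""
    return next(o for (s2, p2, o) in triples if s2 == s and p2 == p)

def update_list(triples, subj, pred, start, end, values):
    """Replace the slice [start, end) of the list reached via (subj, pred)."""
    # Step 1: move to the left bound. We manipulate triples, not pointers,
    # so we must carry the subject/predicate (ps, pp) pointing at the cell.
    ps, pp = subj, pred
    for _ in range(start):
        ps, pp = obj(triples, ps, pp), RDF_REST
    node = obj(triples, ps, pp)          # first cell of the slice
    triples.discard((ps, pp, node))      # unhook it from its predecessor
    # Step 2: gather and delete the triples until the right bound.
    for _ in range(end - start):
        nxt = obj(triples, node, RDF_REST)
        triples.discard((node, RDF_FIRST, obj(triples, node, RDF_FIRST)))
        triples.discard((node, RDF_REST, nxt))
        node = nxt
    # Step 3: insert the new triples in the middle, then reconnect.
    for i, v in enumerate(values):
        cell = f"_:new{start}_{i}"       # fresh blank node (illustrative)
        triples.add((ps, pp, cell))
        triples.add((cell, RDF_FIRST, v))
        ps, pp = cell, RDF_REST
    triples.add((ps, pp, node))

def to_list(triples, subj, pred):
    """Read the rdf:List at (subj, pred) back as a Python list."""
    out, node = [], obj(triples, subj, pred)
    while node != RDF_NIL:
        out.append(obj(triples, node, RDF_FIRST))
        node = obj(triples, node, RDF_REST)
    return out

# The list (1 2 3) at ex:s ex:p, then replace the slice [1, 2) with 9, 10:
g = {("ex:s", "ex:p", "_:l0"),
     ("_:l0", RDF_FIRST, 1), ("_:l0", RDF_REST, "_:l1"),
     ("_:l1", RDF_FIRST, 2), ("_:l1", RDF_REST, "_:l2"),
     ("_:l2", RDF_FIRST, 3), ("_:l2", RDF_REST, RDF_NIL)}
update_list(g, "ex:s", "ex:p", 1, 2, [9, 10])
print(to_list(g, "ex:s", "ex:p"))        # → [1, 9, 10, 3]
```

Note how step 1 carries (ps, pp) rather than a pointer, which is exactly the subject-predicate bookkeeping described above.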
>>>>>>>
>>>>>>> I'm not steeped enough in the intracacies of SPARQL Update to have
>>>>
>>>> a
>>>>>>>
>>>>>>> horse
>>>>>>> in this race, but if this issue is the big-animal difference then
>>>>
>>>> people
>>>>>>>
>>>>>>> with the necessary understanding are going to want to see the
>>>>
>>>> details.
>>>>>>>
>>>>>>> The
>>>>>>> IBM products I'm aware of eschew rdf:List (and blank nodes
>>>>
>>>> generally, to
>>>>>>>
>>>>>>> first order), so I don't know how much this one alone would sway
>>>>
>>>> me.
>>>>>>
>>>>>> You _could_ generate a SPARQL Update query that would do something
>>>>>> equivalent. But you'd have to match and remember the intermediate
>>>>>> nodes/triples.
>>>>>>
>>>>>> JSON-LD users manipulate lists on a day-to-day basis. Without native
>>>>>> support for rdf:list in LD Patch, I would turn to JSON PATCH to
>>>>>> manipulate those lists.
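For the simplest case, replacing a single element at a known position, such a generated SPARQL Update could look like this (a hypothetical Python helper; the URIs and variable naming are made up, and every intermediate cons cell must be matched explicitly):

```python
# Hypothetical sketch: generating the SPARQL Update that replaces the
# element at a fixed position in an rdf:List. Each rdf:rest hop to the
# target cell has to be matched and named in the WHERE clause.
def replace_at(subj, pred, index, new_value):
    """Build a DELETE/INSERT replacing the element at `index` of the list."""
    # one matched triple per intermediate cons cell
    hops = "".join(f"  ?c{i} rdf:rest ?c{i + 1} .\n" for i in range(index))
    return (
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
        f"DELETE {{ ?c{index} rdf:first ?old }}\n"
        f"INSERT {{ ?c{index} rdf:first {new_value} }}\n"
        "WHERE {\n"
        f"  <{subj}> <{pred}> ?c0 .\n"
        + hops +
        f"  ?c{index} rdf:first ?old .\n"
        "}\n")

print(replace_at("http://ex.org/s", "http://ex.org/p", 2, '"nine"'))
```

Inserting or deleting elements (rather than replacing one in place) is where it gets much worse, since the rdf:rest triples themselves must be rewired.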
>>>>>>
>>>>>>> It sounds like the other big-animal difference in your email is
>>>>>>>
>>>>>>>> we would have to refine the SPARQL semantics so that the order of
>>>>>>>> the clauses matters (ie. no need to depend on a query optimiser).
>>>>>>>> And we
>>>>>>>
>>>>>>> That sounds like a more general problem.  It might mean, in effect,
>>>>>>> that no one would be able to use existing off-the-shelf componentry
>>>>>>> (specs & code ... is that the implication, Those Who Know S-U?) and
>>>>>>> that might well be a solid answer to "why not [use S-U]?"
>>>>>>
>>>>>> The fact that reordering the clauses doesn't change the semantics
>>>>>> is a feature of SPARQL. It means that queries can be rearranged for
>>>>>> optimisation purposes. But you never know if the execution plan will
>>>>>> be the best one, and you can end up with huge intermediate result
>>>>>> sets.
>>>>>>
>>>>>> In any case, if we ever go down the SPARQL Update way, I will ask
>>>>>> that we specify that clauses are executed in order, or something
>>>>>> like that.
>>>>>>
>>>>>> And I will ask for a semantics that doesn't rely on result sets if
>>>>>> possible.
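In-order clause execution can be sketched with a toy evaluator (plain Python over a set of triples, nothing SPARQL-specific, every name made up): each clause consumes the bindings produced by the previous one, exactly in the order written, with no optimiser reordering anything.

```python
# Toy illustration of in-order clause evaluation: triple patterns are
# evaluated strictly in the order written, streaming bindings from one
# clause to the next.
def match(triples, pattern, binding):
    """Yield bindings extending `binding` for one pattern; '?x' is a variable."""
    for triple in triples:
        b = dict(binding)
        for term, pat in zip(triple, pattern):
            if pat.startswith("?"):
                if b.setdefault(pat, term) != term:
                    break              # variable already bound to another term
            elif pat != term:
                break                  # constant mismatch
        else:
            yield b

def evaluate(triples, patterns):
    """Evaluate a basic graph pattern clause by clause, in order."""
    bindings = [{}]
    for pattern in patterns:           # order matters here, by design
        bindings = [b2 for b in bindings
                    for b2 in match(triples, pattern, b)]
    return bindings

g = {("alice", "knows", "bob"),
     ("bob", "knows", "carol"),
     ("alice", "age", "42")}
# one solution: ?x=alice, ?y=bob, ?z=carol
print(evaluate(g, [("?x", "knows", "?y"), ("?y", "knows", "?z")]))
```

A selective first clause keeps the intermediate binding list small; a full SPARQL engine is free to reorder these clauses, which is precisely what an in-order semantics would rule out.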
>>>>>>
>>>>>>> Were there any other big-animal issues you found, those two aside?
>>>>>>
>>>>>> A big issue for me will be to correctly explain the subset of SPARQL
>>>>>> we would be considering, and its limitations compared to its big
>>>>>> brother.
>>>>>>
>>>>>> Also, if you don't implement it from scratch and want to rely on an
>>>>>> existing implementation, you would still have to reject all the
>>>>>> correct SPARQL queries that fall outside the subset, and that can be
>>>>>> tricky too, because you have to inspect the query after it is
>>>>>> parsed. Oh, and I will make sure there are tests rejecting such
>>>>>> queries :-)
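Such a post-parse check could be sketched like this (toy tree shapes and node names, purely hypothetical; a real implementation would walk its own parser's output):

```python
# Toy sketch of post-parse subset enforcement: walk the parsed query
# tree and reject every feature outside the subset, even though the
# query itself is correct SPARQL. Node names are made up.
ALLOWED = {"Prologue", "InsertData", "DeleteData", "Bgp", "TriplePattern"}

def check_subset(node):
    """node = (kind, *children); raise if anything falls outside the subset."""
    kind, *children = node
    if kind not in ALLOWED:
        raise ValueError(f"correct SPARQL, but outside the subset: {kind}")
    for child in children:
        if isinstance(child, tuple):
            check_subset(child)

check_subset(("InsertData", ("Bgp", ("TriplePattern",))))   # accepted
try:
    # DELETE WHERE parses fine as SPARQL Update, but the subset rejects it:
    check_subset(("DeleteWhere", ("Bgp", ("TriplePattern",))))
except ValueError as e:
    print(e)   # → correct SPARQL, but outside the subset: DeleteWhere
```

The awkward part is exactly what is described above: the parser happily accepts the query, so the rejection has to happen as a separate pass over the parse tree.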
>>>>>>
>>>>>> Alexandre
>>>>>>
>>>>>>> Best Regards, John
>>>>>>>
>>>>>>> Voice US 845-435-9470  BluePages
>>>>>>> Cloud and Smarter Infrastructure OSLC Lead
>>>>>>>
>>>>>
>>>
>>
>>
>>
>
>

Received on Saturday, 26 July 2014 18:56:07 UTC