Re: a few patch challenges from Alexandre Bertails on 2014-08-12 (public-ldp-wg@w3.org from August 2014)

From: Alexandre Bertails <alexandre@bertails.org>
Date: Tue, 12 Aug 2014 15:15:44 -0400
To: Sandro Hawke <sandro@w3.org>
Cc: Linked Data Platform WG <public-ldp-wg@w3.org>
Message-ID: <CANvn8kzkQ9UaT9g1GpD3LJQ9smj_VtNweOik4GE9vS_5gs7hag@mail.gmail.com>
On Tue, Aug 12, 2014 at 2:01 PM, Sandro Hawke <sandro@w3.org> wrote:
> On 08/12/2014 07:38 AM, Alexandre Bertails wrote:
>>
>> On Mon, Aug 11, 2014 at 9:18 PM, Sandro Hawke <sandro@w3.org> wrote:
>>>
>>> On 08/11/2014 04:12 PM, Alexandre Bertails wrote:
>>>
>>> On Mon, Aug 11, 2014 at 11:39 AM, Sandro Hawke <sandro@w3.org> wrote:
>>>
>>> Arguably we should have made a patch test suite years ago.
>>>
>>> Off the top of my head, here are a few patch challenges.    It's not
>>> necessarily a requirement that they all be done, but I think it would
>>> help
>>> show the differences to see how they are done in ldpatch, etc.
>>>
>>> You can find the result at [1].
>>>
>>>
>>> Impressive, thanks.
>>>
>>> Do you have code to generate patches?    I gather you made these ones by
>>> hand.
>>
>> I did them by hand. TimBL said he had something doing that, and I
>> believe one can do a decent job at generating an LD Patch document
>> from a diff.
>>
>> That being said, I am not planning to take on this approach as my
>> applications don't manipulate RDF directly. Instead of manipulating
>> triples directly, you usually manipulate datastructures/objects in
>> your favourite language. There are many ways to track effects
>> happening to them, either a priori or a posteriori. Generating the
>> patch is much easier in that case. But I digress.
>
>
> I'm not sure it is easier, in the general case, since the patch in some
> cases needs to work backwards through the graph from multiple literals.   I
> don't see how non-graph-aware tools will manage that.
>
> This sounds like a good programming contest / hackathon / hiring puzzle
> problem.  :-)
>
>
>>> Some syntactic concerns which I've mentioned before.  My apologies if
>>> there's been a response I missed.   Also, the attendeeOf bug is still
>>> there
>>> in the spec.
>>>
>>> 1.  Why these seemingly gratuitous differences from SPARQL path
>>> expression:
>>>
>>> - instead of ^ for inverse path operator
>>
>> As we discussed once, nobody is against that change. I proposed that
>> we make that kind of change by group resolution, after we get the
>> FPWD out.
>
>
> Why would you make a change like this after FPWD instead of before? A WD
> should contain the best we've got, and what we want public feedback on.

Here are the various enhancements I have heard so far:

* inverse path syntax
  * - (status quo)
  * ^
  * \ (as a contraction for / and - )

* constraint path syntax
  * [ ] (status quo)
  * ( )

* slice operator syntax
  * > (status quo)
  * :
  * ..

If we happen to decide to move forward next Monday with the FPWD, we
can decide at the same time re: syntax enhancements. A deadlock
because of -1s would make us fall back to the status quo, so that we
don't lose time.

For me, those are mainly aesthetic questions, modulo the misleading
expectations I discuss below.

>
>
>>
>> As we want to save some bits, I would also propose \ (backslash) which
>> could combine "/-".
>
>
> Uh, no.    -1.   Backslash has its meanings across languages and systems,
> and this is a very poor match for them.

Fine.

>
>>> paths start with / instead of having it be an infix operator
>>
>> Actually, it is meant to be infix, where the left operand is a
>> concrete node (or a set of nodes). I agree it is a bit awkward in the
>> path constraints, as the current node(s) is(are) implicit.
>>
>> Maybe the intent is better understood with a few more spaces: `<node>
>> /<foo> /-<bar>`.
>>
>> That, plus the path constraints, gives it different semantics from
>> SPARQL paths.
>
>
> This needs to be justified.   I don't see why it can't be SPARQL Paths that
> only use "/" and "^" and which also include filtering.

Well, when I see something that "looks" like something existing, I
expect it to have similar semantics. But this has nothing to do with
SPARQL. For example, [ ] are not used for grouping disjunctions but
for predicates, with a totally different meaning. Also, the general
semantics is not defined as "a possible route through a graph between
two graph nodes" but as a sequence of path steps and constraints,
evaluated from left to right, applied to a node set (initially with
one element), and producing intermediate node sets. Very much like in
xpath actually...

So making LD Paths look like SPARQL Property Paths is misleading and
only introduces confusion.
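
To make the left-to-right reading concrete, here is a minimal Python
sketch of that evaluation model, run against the Test 9 graph. It is
only an illustration: the tuple encoding of path elements and the names
step, evaluate and holds are hypothetical, this is neither LD Patch
syntax nor the banana-rdf code, and the slice operator is left out.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
# Hypothetical encoding of a path element:
#   ("fwd", predicate)                     follow an arc forwards
#   ("inv", predicate)                     follow an arc backwards
#   ("filter", sub_path, value_or_None)    keep nodes satisfying a constraint
PathElement = tuple


def step(graph: Set[Triple], nodes: Set[str], direction: str, predicate: str) -> Set[str]:
    """Follow one arc from every node in the set, forwards or backwards."""
    if direction == "fwd":
        return {o for (s, p, o) in graph if p == predicate and s in nodes}
    return {s for (s, p, o) in graph if p == predicate and o in nodes}


def holds(graph: Set[Triple], node: str, sub_path: List[PathElement],
          expected: Optional[str]) -> bool:
    """A constraint holds if its sub-path, started at node, reaches the
    expected value (or reaches anything at all when no value is given)."""
    reached = evaluate(graph, node, sub_path)
    return bool(reached) if expected is None else expected in reached


def evaluate(graph: Set[Triple], start: str, path: List[PathElement]) -> Set[str]:
    """Apply path elements left to right, starting from a one-element node set.
    Each element turns the current node set into the next intermediate one."""
    nodes = {start}
    for element in path:
        kind = element[0]
        if kind in ("fwd", "inv"):
            nodes = step(graph, nodes, kind, element[1])
        elif kind == "filter":
            _, sub_path, expected = element
            nodes = {n for n in nodes if holds(graph, n, sub_path, expected)}
        else:
            raise ValueError(f"unknown path element: {element!r}")
    return nodes


def person(node: str, first: str, last: str) -> Set[Triple]:
    """Triples for one blank node known by <alice> (the shape used in Test 9)."""
    return {("alice", "knows", node), (node, "first", first), (node, "last", last)}


GRAPH = (person("_:b1", '"Bob"', '"Smith"') | person("_:b2", '"Bob"', '"Jones"')
         | person("_:b3", '"Charlie"', '"Smith"') | person("_:b4", '"Charlie"', '"Jones"'))

# <alice> /<knows> [/<first> = "Charlie"] [/<last> = "Jones"]
FORWARD = [("fwd", "knows"),
           ("filter", [("fwd", "first")], '"Charlie"'),
           ("filter", [("fwd", "last")], '"Jones"')]
# "Charlie" /-<first> [/<last> = "Jones"]
BACKWARD = [("inv", "first"), ("filter", [("fwd", "last")], '"Jones"')]

forward_result = evaluate(GRAPH, "alice", FORWARD)
backward_result = evaluate(GRAPH, '"Charlie"', BACKWARD)
```

Both paths narrow the intermediate node set down to the same single
blank node, whether they start from <alice> or from the literal.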

>
>
>>> ?
>>>
>>> 2.  Why > as the slice operator?   It looks like the vast majority [1] of
>>> languages use colon, and the rest use "..".   I think either of those
>>> would be familiar to experienced programmers, unlike >.    I suppose I'd
>>> pick .. because colon seems too much like a pname, but I'm fine with
>>> colon,
>>> too.
>>>
>>> [1] http://en.wikipedia.org/wiki/Array_slicing
>>
>> Actually, Pierre-Antoine wanted to change that but like other things,
>> we ultimately decided to stick again to his own original proposal
>> (that was the plan), so that we can decide as a group which syntax
>> would be better.
>
>
> How do you propose we decide?

Well, the original plan we agreed on as a group was for me to specify
Pierre-Antoine's solution as presented during the last F2F. I didn't
want to introduce too much on my own. We (the 3 editors) have settled
on making as few changes as possible until we have discussed them with
the group.

>
> I see two main options: (1) the editors make the change requested by the
> commenter if they like it and don't think anyone else will object to the
> change, or (2) someone raises it as an Issue.

I personally like the .. proposal, but I haven't had the opportunity
to discuss it with Pierre-Antoine and Andrei.

>
> If the editors are agreed, it seems like much less work to just change it.

If it was only me, without consulting the group, I would 1. use \  2.
keep [ ]  3. use .. .

>
>
>> At the end of the day, it doesn't really matter :-)
>
>
> Certainly not as much as ... many other things in life.   :-)
>
>
>>> 3.  Why square brackets for filter?   I'd think parens for grouping, then
>>> =
>>> for filter would do it.     Doing that and making / infix (as in sparql)
>>> instead of prefix, we'd get:
>>
>> That part just felt quite similar to xpath. Actually the whole path
>> thing is much closer to xpath than to SPARQL.
>
>
> Hm.    I've spent about a week writing xpath, then did my best to forget it
> again, so I've no idea about that.

You should ask yourself how many people say the same thing when it
comes to SPARQL :-)

And xpath is mainly difficult to grasp because of its humongous
library of functions. But the main semantics is pretty simple.

>
>
>>> Bind ?event <#> schema:attendeeOf/(schema:url =
>>> <http://conferences.ted.com/TED2009/>)
>>>
>>> Alternatively, if you keep square brackets, there's no need for the equal
>>> sign, either for parsing, or for looking familiar.
>>>
>>> --
>>>
>>> I think my biggest advice, and this is what keeps making me raise my
>>> voice
>>> during meetings -- for which I'm sorry -- is to understand that you've
>>> created a new language here, and you have to teach it to your audience.
>>> You
>>> can't just say "it's just patch".  Your audience is going to take a while
>>> to
>>> learn it, as they do any new language.
>>
>> It's not just Patch. It is "just" RDF diff + node bindings + support
>> for rdf:list  :-)
>
>
> But the path expressions used in node bindings are a new language, with a new
> way of thinking, and a new syntax that people need to learn to understand
> ld-patch.   It's a relatively simple new language, but to act like it's
> trivial is sweeping stuff under a rug.

It is very much like xpath, for good reasons. And yes, it is a
different approach than SPARQL, because the problem to solve is
different.

>
>
>> By the way, I am surprised that you didn't come up with more examples
>> involving rdf:list, because this is a very important part of LD Patch.
>
>
> The list stuff happens to feel obvious and trivial to me.

I now realize that the semantics for updating lists is a bit crazy, but
the user doesn't need to think about it in order to use it. It just
works. And that's a good thing :-)

>
> There are some test cases to be had there, too, in terms of odd
> constructions using rdf:first and rdf:rest, and showing that rdf:Seq, etc.,
> aren't touched.  (or are they...?    maybe they should be? As long as you're
> always touching an EXISTING structure, that wouldn't require the patch to be
> any different.)

I am not sure I understand.

>
>
>>> I'd suggest giving them a series of increasing examples, such as the test
>>> cases I posed to you, so they can see that blank nodes get selected from
>>> the
>>> nearest node via forward or backward arcs, then filtered if necessary.
>>
>> I completely agree. If we decide to move forward with this approach,
>> it is my very intention that the specification will double as
>> documentation.
>>
>>> Which makes me think....
>>
>> You can find the new tests at [2].
>>
>> [2]
>> https://github.com/betehess/banana-rdf/blob/efe9ba1a522768ec81649c281f5433096fb90a1d/ldpatch/src/test/scala/SandrosChallenge.scala#L248
>>
>
> Awesome.
>
> I see an error in how I posed Test11.  It should be changing Bob A Smith not
> Bob A Jones, btw, otherwise the firstname isn't needed to locate the blank
> node.
>
>
>>> == TEST 9
>>> == FROM
>>> <alice> <knows>
>>>       [ <first> "Bob", <last> "Smith" ],
>>>       [ <first> "Bob", <last> "Jones" ],
>>>       [ <first> "Charlie", <last> "Smith" ],
>>>       [ <first> "Charlie", <last> "Jones" ].
>>> == TO
>>> <alice> <knows>
>>>       [ <first> "Bob", <last> "Smith" ],
>>>       [ <first> "Bob", <last> "Jones" ],
>>>       [ <first> "Charlie", <last> "Smith" ],
>>>       [ <first> "Chuck", <last> "Jones" ].
>>> == END
>>>
>>> Every language feature needs a test in the test suite, and really should
>>> be
>>> motivated by a use case.   If you allow multiple filters in one path,
>>> provide an example where that's necessary.  (I think that's what Test 9
>>> is.)
>>
>> Yes.
>>
>>> And just to push a tiny bit more, to make sure we can step after the
>>> filter:
>>>
>>> == TEST 10
>>> Somewhat artificial use of <text> but surely there are situations like
>>> this
>>> == FROM
>>> <alice> <knows>
>>>       [ <first> [ <text> "Bob"], <last> [ <text> "Smith"] ],
>>>       [ <first> [ <text> "Bob"], <last> [ <text> "Jones"] ],
>>>       [ <first> [ <text> "Charlie"], <last> [ <text> "Smith"] ],
>>>       [ <first> [ <text> "Charlie"], <last> [ <text> "Jones"] ].
>>> == TO
>>> <alice> <knows>
>>>       [ <first> [ <text> "Bob"], <last> [ <text> "Smith"] ],
>>>       [ <first> [ <text> "Bob"], <last> [ <text> "Jones"] ],
>>>       [ <first> [ <text> "Charlie"], <last> [ <text> "Smith"] ],
>>>       [ <first> [ <text> "Chuck"], <last> [ <text> "Jones"] ].
>>> == END
>>>
>>> Is the value of Bind unique, or can it be a set?
>>
>> We (the editors) talked a lot about that. The very first draft was
>> actually binding variables to sets of nodes, and so did my initial
>> implementation.
>>
>> In the end, we went back to something much closer to the RDF Patch
>> solution. That is: Add and Delete only touch one triple at a time.
>
>
> Do you remember what tipped the balance in this direction?

I do. It is still easy to define Add and Delete with node sets, but it
doesn't work well in the list stuff. So there would still be
restrictions there.

Also, we couldn't find any use for node sets with Add (as you're only
matching existing nodes in the graph) and only a very limited one with
Delete, as there is no variable for the predicate.
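
For illustration, the single-node behaviour can be pictured as a Bind
that fails unless its path selects exactly one node, followed by a
Delete and an Add that each touch one triple. The Python sketch below
applies that to Test 9. All names in it (BindError, bind, having,
replace_object) are hypothetical, and the lookup by two properties is
a stand-in for a real path evaluation, not how the implementation
works.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


class BindError(Exception):
    """Raised when a path does not select exactly one node."""


def bind(candidates: Set[str]) -> str:
    """Return the unique selected node, or fail if there are zero or several."""
    if len(candidates) != 1:
        raise BindError(f"expected exactly one node, got {len(candidates)}")
    return next(iter(candidates))


def having(graph: Set[Triple], predicate: str, value: str) -> Set[str]:
    """Subjects that have the given predicate/value pair (stand-in for a path)."""
    return {s for (s, p, o) in graph if p == predicate and o == value}


def replace_object(graph: Set[Triple], subject: str, predicate: str,
                   old: str, new: str) -> Set[Triple]:
    """Delete one triple and add one triple, leaving the input graph untouched."""
    return (graph - {(subject, predicate, old)}) | {(subject, predicate, new)}


# The FROM graph of Test 9, with explicit labels for the blank nodes.
GRAPH: Set[Triple] = {
    ("alice", "knows", "_:b1"), ("_:b1", "first", "Bob"), ("_:b1", "last", "Smith"),
    ("alice", "knows", "_:b2"), ("_:b2", "first", "Bob"), ("_:b2", "last", "Jones"),
    ("alice", "knows", "_:b3"), ("_:b3", "first", "Charlie"), ("_:b3", "last", "Smith"),
    ("alice", "knows", "_:b4"), ("_:b4", "first", "Charlie"), ("_:b4", "last", "Jones"),
}

# Both constraints are needed: either one alone still matches two nodes.
charlie_jones = bind(having(GRAPH, "first", "Charlie") & having(GRAPH, "last", "Jones"))
patched = replace_object(GRAPH, charlie_jones, "first", "Charlie", "Chuck")
```

Calling bind(having(GRAPH, "first", "Charlie")) raises BindError here,
because two blank nodes match.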

Alexandre

>
>
>>>    That is, is ! implicit in bind?
>>
>> Completely spot on. It is part of the semantics. (still not fleshed out
>> but I hope you're now convinced that it is not difficult)
>>
>>>   If not, can every path expression be re-written as a sequence of
>>> single-step Bind operations?
>>
>> Well, you could, but you don't want to do that: if you could actually
>> split your path in two, it means that you had a shorter path in the
>> first place, and that you are better off using that one.
>>
>>> To try my understanding, how's this:
>>>
>>> Bind ?CharlieJones "Charlie" /-<text>/-<first>/[/<last>/<text> =
>>> "Jones"]!
>>
>> As you guessed, the trailing ! is not needed.
>>
>>> How about three properties being needed:
>>>
>>> == TEST 11
>>> == FROM
>>> <alice> <knows>
>>>       [ <first> "Bob", <middle> "A", <last> "Smith" ],
>>>       [ <first> "Bob", <middle> "B", <last> "Smith" ],
>>>       [ <first> "Bob", <middle> "A", <last> "Jones" ],
>>>       [ <first> "Bob", <middle> "B", <last> "Jones" ],
>>>       [ <first> "Charlie", <middle> "A", <last> "Smith" ],
>>>       [ <first> "Charlie", <middle> "B", <last> "Smith" ],
>>> == TO
>>> == END
>>>
>>> Then I guess we'd use something like:
>>>
>>> Bind ?BobASmith <alice> /<knows>[/<first> = "Bob"][/<middle> =
>>> "A"][/<last>
>>> = "Jones"]!
>>>
>>> Right?
>>
>> I used your bind in test11 and it worked right away :-)
>
>
> :-)
>
>         - s
>
>
>>
>> Alexandre
>>
>>>         - s
>>>
>>>
>>> All tests that are not about pathological graphs pass with the
>>> implementation matching the current spec.
>>> It works both with Jena, and Sesame, and the pure Scala implementation.
>>>
>>> Please read the comments inside the code for more information.
>>>
>>> Alexandre
>>>
>>> [1]
>>>
>>> https://github.com/betehess/banana-rdf/blob/d49e417c6c5560a6a852e587318da4493d25f5b1/ldpatch/src/test/scala/SandrosChallenge.scala
>>>
>>>
>>>        -- Sandro
>>>
>>>
>>> ==TEST 1
>>> ==FROM
>>> <alice> <knows> <bob>, <charlie>.
>>> ==TO
>>> <alice> <knows> <bob>, <dave>.
>>> ==END
>>>
>>>
>>> ==TEST 2
>>> ==FROM
>>> <alice> <knows> ( <bob> <charlie> )
>>> ==TO
>>> <alice> <knows> ( <bob> <dave> )
>>> ==END
>>>
>>>
>>>
>>> ==TEST 3
>>> ==FROM
>>> <alice> <knows> [ <knows> <bob> ], [<knows> <charlie>].
>>> ==TO
>>> <alice> <knows> [ <knows> <bob> ], [<knows> <dave>].
>>> ==END
>>>
>>> ==TEST 4
>>> ==FROM
>>> <alice> <knows>
>>>     [ <name> "Bob" ],
>>>     [ <name> "Charlie"].
>>> ==TO
>>> <alice> <knows>
>>>     [ <name> "Bob" ],
>>>     [ <name> "Dave"].
>>> ==END
>>>
>>> ==TEST 5 (two changes: the second count, and the second street addr)
>>> ==FROM
>>> [ a <Order>;
>>>    <items> (
>>>       [ <code> "4343"; <count> 1 ]
>>>       [ <code> "4344"; <count> 3 ]
>>>       [ <code> "4347"; <count> 3 ]
>>>    );
>>>    <shipTo> [
>>>       a <Address>;
>>>       <street> [ <num> 32; <name> "Vassar St" ];
>>>       <city> "Cambridge";
>>>       <state> "MA";
>>>       <zip> 02139
>>>    ];
>>>    <billTo> [
>>>       a <Address>;
>>>       <street> [ <num> 32; <name> "Vassar St" ];
>>>       <city> "Cambridge";
>>>       <state> "MA";
>>>       <zip> 02139
>>>    ]
>>> ].
>>> ==TO
>>> [ a <Order>;
>>>    <items> (
>>>       [ <code> "4343"; <count> 1 ]
>>>       [ <code> "4344"; <count> 2 ]
>>>       [ <code> "4347"; <count> 3 ]
>>>    );
>>>    <shipTo> [
>>>       a <Address>;
>>>       <street> [ <num> 32; <name> "Vassar St" ];
>>>       <city> "Cambridge";
>>>       <state> "MA";
>>>       <zip> 02139
>>>    ];
>>>    <billTo> [
>>>       a <Address>;
>>>       <street> [ <num> 36; <name> "Vassar St" ];
>>>       <city> "Cambridge";
>>>       <state> "MA";
>>>       <zip> 02139
>>>    ]
>>> ].
>>> ==END
>>>
>>>
>>>
>>> ==TEST 6
>>> ==FROM
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 1
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 2
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 3
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 4
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 5
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 6
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 7
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 8
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 9
>>> ]]]]]]]]].
>>> ==TO
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 1
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 2
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 3
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 4
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 5
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 6
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 7
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 8
>>> ]]]]]]]]].
>>> <node> <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> [ <p> 0
>>> ]]]]]]]]].
>>> ==END
>>>
>>> ==TEST 7
>>> ==FROM
>>> _:x <a> _:y.
>>> _:y <a> _:z.
>>> _:z <a> _:x.
>>> ==TO
>>> _:x <a> _:y.
>>> _:y <a> _:z.
>>> _:x <a> _:z.
>>> ==END
>>>
>>> ==TEST 8
>>> ==FROM
>>> <node> <p> [ <p> [ <p> [ <p> "1" ],
>>>                         [ <p> "1" ]] ,
>>>                   [ <p> [ <p> "1" ]]],
>>>             [ <p> [ <p> [ <p> "1" ]  ,
>>>                         [ <p> "1" ]],
>>>             [ <p> [ <p> [ <p> "1" ]]].
>>> ==TO
>>> <node> <p> [ <p> [ <p> [ <p> "1" ],
>>>                         [ <p> "1" ]] ,
>>>                   [ <p> [ <p> "1" ]]],
>>>             [ <p> [ <p> [ <p> "1" ]  ,
>>>                         [ <p> "1" ], [ <p> "1" ],
>>>             [ <p> [ <p> [ <p> "1" ]]].
>>> ==END
>>>
>>>
>>>
>
Received on Tuesday, 12 August 2014 19:16:12 UTC