Re: Example updates from Paul Gearon on 2009-12-03 (public-rdf-dawg@w3.org from October to December 2009)

From: Paul Gearon <gearon@ieee.org>
Date: Thu, 3 Dec 2009 12:12:37 -0500
To: Andy Seaborne <andy.seaborne@talis.com>
Cc: Steve Harris <steve.harris@garlik.com>, SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <a25ac1f0912030912l34ce1f65r7ad76651d016e6e8@mail.gmail.com>

I usually try to cut out whatever isn't in the context of what I'm
replying to, but I'm not sure that will make this email any clearer. So
apologies if you have trouble finding my responses....

On Thu, Dec 3, 2009 at 6:18 AM, Andy Seaborne <andy.seaborne@talis.com> wrote:
>
>
> On 02/12/2009 11:32, Steve Harris wrote:
>>
>> On 1 Dec 2009, at 21:06, Paul Gearon wrote:
>
> Paul - these examples are very useful for me to understand the design space.
>
>>
>>> The main question on updates seems to be around referring to multiple
>>> graphs, so these operations are based on that idea. I'll be using some
>>> common prefixes (foaf:, contact:), so I hope no one minds if I don't
>>> define them.
>>>
>>> I can't recall if I'm supposed to be writing according to a specific
>>> syntax, so I'll offer a couple of variations.
>>>
>>> The first case has one source for pattern matching, and one
>>> destination for insertion.
>>> This example copies the contents of one graph into another.
>>>
>>> (1) The original form would be:
>>>
>>> INSERT INTO <destination> { ?s ?p ?o }
>>> WHERE { GRAPH <source> { ?s ?p ?o } }
>>>
>>> (2) With the recent changes:
>>>
>>> WITH <source>
>>> INSERT INTO <destination> { ?s ?p ?o }
>>> WHERE { ?s ?p ?o }
>>>
>>>
>>> (3) Or using GRAPH instead of INTO:
>>>
>>> WITH <source>
>>> INSERT { GRAPH <destination> { ?s ?p ?o } }
>>> WHERE { ?s ?p ?o }
>>
>> So, in this case wouldn't INSERT ... FROM be a perfectly reasonable
>> thing to write? It would have the same meaning as in SELECT, wouldn't it?
>>
>>> (4) Then dropping WITH:
>>>
>>> INSERT { GRAPH <destination> { ?s ?p ?o } }
>>> WHERE { GRAPH <source> { ?s ?p ?o } }
>>>
>>> (The last would be valid even if WITH is adopted)
>
> So the original form is also legal and the same?

I believe so, yes.


>>> ===
>>>
>>> This example has 2 sources for pattern matching and 1 destination for
>>> insertion.
>>>
>>> To create entries for people in a new graph if they live in London and
>>> have an email address stored in a separate graph:
>>>
>>> (1)
>>> INSERT INTO <people_graph> { ?person a foaf:Person }
>>> WHERE {
>>> GRAPH <address_graph> {
>>> ?person contact:home [ contact:city "London" ]
>>> } . GRAPH <mail_graph> {
>>> ?person foaf:mbox ?mail
>>> }
>>> }
>>>
>>> (2)
>>> WITH <address_graph>
>>> INSERT INTO <people_graph> { ?person a foaf:Person }
>>> WHERE {
>>> ?person contact:home [ contact:city "London" ] .
>>> GRAPH <mail_graph> { ?person foaf:mbox ?mail }
>>> }
>>>
>>> (3)
>>> WITH <address_graph>
>>> INSERT { GRAPH <people_graph> { ?person a foaf:Person } }
>>> WHERE {
>>> ?person contact:home [ contact:city "London" ] .
>>> GRAPH <mail_graph> { ?person foaf:mbox ?mail }
>>> }
>>>
>>> (4)
>>> INSERT { GRAPH <people_graph> { ?person a foaf:Person } }
>>> WHERE {
>>> GRAPH <address_graph> {
>>> ?person contact:home [ contact:city "London" ]
>>> } . GRAPH <mail_graph> {
>>> ?person foaf:mbox ?mail
>>> }
>>> }
>
> If WITH is required, the parsing problems go away.  It's not exactly the
> same as MODIFY - it's better to have the URI introduced to span the WHERE as
> well.  But isn't this just modified MODIFY now?

I don't like the idea of requiring WITH. Mulgara's TQL language has an
imposition like this, and it's annoying. You end up having to
arbitrarily choose the graph that you're working with, despite
everything following referring to graphs explicitly. Not that it's a
huge deal, but it does bug me.

My understanding is that MODIFY refers to the graph to be updated. I
know that the original document says:
  MODIFY [ <uri> ]*
allowing for multiple URIs, but that means that INSERT/DELETE
operations are duplicated for each of those URIs, right? The aim of
this is to allow for selection from multiple graphs, and modification
of multiple graphs, while MODIFY seems oriented around just one graph
(though GRAPH can of course be used in the WHERE clause).

>>> ===
>>>
>>> This example has multiple sources for pattern matching, multiple
>>> sources for deletion, and a single destination for insertion.
>>>
>>> To delete email addresses from graphs a, b, and c, and insert them
>>> into a graph called email_graph:
>
> What is the correct terminology for one of these WITH..(INSERT|DELETE)*..WHERE
> things?
>
> Suggestion: an "operation" is the unit (includes all the other units in the
> update language) and "request" is zero or more "operations" sent by the
> client.

I've been thinking "operation". I also made up the term "directive" to
refer to a single INSERT or DELETE for those grammars that allow
multiple instances of these terms in a single operation. As I said in
a previous email, I had not considered SPARQL capable of multiple
operations in a single send/receive by a client (even though Mulgara's
TQL does permit this). I'm happy to call this a "request" as it's the
term I tend to use anyway.
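
To pin the terms down, here is a rough sketch with each one labelled in
comments. It assumes the GRAPH-inside-template style and no ";" between
directives, neither of which is settled, and the BASE line is only there
so that the relative graph names resolve:

```sparql
BASE <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# A "request" is everything the client sends in one go:
# zero or more operations. This request holds a single operation.

# The "operation" is the whole (INSERT|DELETE)*..WHERE unit below.
DELETE { GRAPH <a> { ?person foaf:mbox ?email } }            # a "directive"
INSERT { GRAPH <email_graph> { ?person foaf:mbox ?email } }  # a second "directive"
WHERE  { GRAPH <a> { ?person foaf:mbox ?email } }
```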

>>> (1) can't be done in one step
>>>
>>> (2)
>>> # WITH is being used on the destination, but could be applied elsewhere
>>> WITH <email_graph>
>>> DELETE FROM <a> { ?person foaf:mbox ?email };
>>> DELETE FROM <b> { ?person foaf:mbox ?email };
>>> DELETE FROM <c> { ?person foaf:mbox ?email };
>>> INSERT { ?person foaf:mbox ?email }
>>> WHERE {
>>> GRAPH <a> {?person foaf:mbox ?email}
>>> UNION GRAPH <b> {?person foaf:mbox ?email}
>>> UNION GRAPH <c> {?person foaf:mbox ?email}
>>> }
>>>
>>> (looks like a good case for FROM NAMED)
>>>
>>> (3)
>>> WITH <email_graph>
>>> DELETE { GRAPH <a> { ?person foaf:mbox ?email } };
>>> DELETE { GRAPH <b> { ?person foaf:mbox ?email } };
>>> DELETE { GRAPH <c> { ?person foaf:mbox ?email } };
>>> INSERT { ?person foaf:mbox ?email }
>>> WHERE {
>>> GRAPH <a> {?person foaf:mbox ?email}
>>> UNION GRAPH <b> {?person foaf:mbox ?email}
>>> UNION GRAPH <c> {?person foaf:mbox ?email}
>>> }
>>
>> Wouldn't
>>
>> DELETE {
>> GRAPH <a> { ?person foaf:mbox ?email }
>> GRAPH <b> { ?person foaf:mbox ?email }
>> GRAPH <c> { ?person foaf:mbox ?email }
>> }
>> INSERT { ?person foaf:mbox ?email }
>> WHERE {
>> GRAPH <a> {?person foaf:mbox ?email}
>> UNION GRAPH <b> {?person foaf:mbox ?email}
>> UNION GRAPH <c> {?person foaf:mbox ?email}
>> }
>>
>> do the job?
>>
>>> (4)
>>> DELETE { GRAPH <a> { ?person foaf:mbox ?email } };
>>> DELETE { GRAPH <b> { ?person foaf:mbox ?email } };
>>> DELETE { GRAPH <c> { ?person foaf:mbox ?email } };
>>> INSERT { GRAPH <email_graph> {?person foaf:mbox ?email } }
>>> WHERE {
>>> GRAPH <a> {?person foaf:mbox ?email}
>>> UNION GRAPH <b> {?person foaf:mbox ?email}
>>> UNION GRAPH <c> {?person foaf:mbox ?email}
>>> }
>>>
>>>
>>> Alternatively, instead of the union it could be something like:
>>> DELETE { GRAPH <?g> { ?person foaf:mbox ?email } };
>>> INSERT { GRAPH <email_graph> {?person foaf:mbox ?email } }
>>> WHERE {
>>> GRAPH <?g> {?person foaf:mbox ?email} .
>>> FILTER (?g = <a> || ?g = <b> || ?g = <c>)
>>> }
>>
>> s/<?g>/?g/
>
>
> Overall, it may be better as several operations: do the inserts then do the
> deletes (assuming a/b/c != email_graph :-)
>
> An example of this use case:
>
> WITH <a> <b> <c>
> INSERT INTO <email_graph> { ?person foaf:mbox ?email }
> WHERE { ?person foaf:mbox ?email}
>
> DELETE FROM <a> { ?person foaf:mbox ?email }
> DELETE FROM <b> { ?person foaf:mbox ?email };
> DELETE FROM <c> { ?person foaf:mbox ?email };
>
> What's the limit of what can be done in a single operation?

My understanding was that there should not be a limit.

> If we want to be able to do the full range of changes in a single request,
> let alone operation, we may need to have temporary graphs or temporary
> places to put the result of WHERE (c.f. Eric's FeDeRate) so they can be
> reused
>
> Another route is to not assume every possible change can be done in one go
> and place the responsibility on the client with multiple changes.
>
> While that may seem at first sight a bad move, (atomicity etc) I think that
> in practice, very complex operations will not be formulated into some
> amazing request but done step-by-step with inspection of the effects so far.

This is simpler, and to start with, I was expecting that this is what
we'd be doing. I then expected individual implementations to have
extensions for transactions. However, with the feedback at ISWC, I was
left with the impression that we either needed transactions, or the
ability to allow for multiple updates in a single step. The latter
doesn't solve the problems of transactionality, but it gets you part
of the way.

Anyway, it was with these needs in mind that I believe the multiple
update proposal was first mooted. While it wasn't my own suggestion, I
do think that we need something.

>>> This example has a single source for pattern matching, and multiple
>>> destinations for insertion:
>>>
>>> (1) not possible
>>>
>>> (2)
>>> WITH <people_graph>
>>> INSERT INTO <email_graph> {?person foaf:mbox ?email };
>>> INSERT INTO <name_graph> {?person foaf:name ?name }
>>> WHERE {
>>> ?person a foaf:Person
>>> OPTIONAL { ?person foaf:mbox ?email }
>>> OPTIONAL { ?person foaf:name ?name }
>>> }
>>
>> Surely variable bindings don't span the ; ?
>
> That was my understanding as well.

As I already mentioned, the idea was to allow for multiple updates in one
operation. There *is* a restriction that only one WHERE clause can be
used (a real pain if you want to update unrelated data, as you need to
create a cartesian product with the WHERE clause), so the variables
defined there would have to be available to everything in the
operation.
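
As a rough illustration of the cartesian product problem, here is a
sketch that assumes the GRAPH-in-template form (4). The graphs
<catalogue_graph> and <stock_graph> and the ex: vocabulary are made up
for the purpose, and have nothing to do with the people data:

```sparql
BASE <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex:   <http://example.org/shop#>

# Two unrelated updates forced to share one WHERE clause.
INSERT {
  GRAPH <email_graph> { ?person foaf:mbox ?email }
  GRAPH <stock_graph> { ?item ex:inStock true }
}
WHERE {
  # The two patterns share no variable, so every (?person, ?email)
  # row is paired with every ?item row: N x M solutions in total.
  GRAPH <people_graph>    { ?person foaf:mbox ?email }
  GRAPH <catalogue_graph> { ?item a ex:Product }
}
```

The inserted triples come out the same because duplicates collapse, but
the engine may have to walk N x M rows to get there. Worse, if either
pattern matches nothing then the join is empty and neither insert
happens.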


>>> (Have we decided what happens with an unbound value? That entry gets
>>> skipped? I'm guessing so, but I don't recall it being addressed)
>>
>> That's what happens in CONSTRUCT, so it seems logical.
>>
>>> (3)
>>> WITH <people_graph>
>>> INSERT { GRAPH <email_graph> {?person foaf:mbox ?email } };
>>> INSERT { GRAPH <name_graph> {?person foaf:name ?name } }
>>> WHERE {
>>> ?person a foaf:Person
>>> OPTIONAL { ?person foaf:mbox ?email }
>>> OPTIONAL { ?person foaf:name ?name }
>>> }
>>>
>>> (4)
>>> INSERT { GRAPH <email_graph> {?person foaf:mbox ?email } };
>>> INSERT { GRAPH <name_graph> {?person foaf:name ?name } }
>>> WHERE {
>>> GRAPH <people_graph> {
>>> ?person a foaf:Person
>>> OPTIONAL { ?person foaf:mbox ?email }
>>> OPTIONAL { ?person foaf:name ?name }
>>> }
>>> }
>>
>> Again the ; here is confusing. It seems like you only need one INSERT
>> block if you have the GRAPH syntax.
>>
>> - Steve
>
> None of these examples motivate the need for multiple INSERT/DELETE per
> operation and in particular in mixed order.  I can't find any discussion at
> the F2F - it just appears at one point with no rationale noted.
>
> Could someone say why they are needed?  Example?

I see no reason for mixed order myself (though that may just be
because I can't think of a counter example).

The idea of multiple INSERT/DELETE was for consistency of related data
structures that may span more than one graph. FOAF data probably
wasn't appropriate for examples in this case. I believe that something
exposed through D2R would demonstrate this better.
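
To give a feel for what I mean, here is a sketch using a made-up
order/inventory vocabulary of the kind a relational mapping might
expose. None of the graph names or ex: terms are real D2R output, and
the DELETE-then-INSERT layout is just one of the syntaxes on the table:

```sparql
BASE <http://example.org/>
PREFIX ex: <http://example.org/shop#>

# One operation keeps two graphs consistent: the item stops being
# available in the inventory at the same time as the order records it.
DELETE {
  GRAPH <inventory_graph> { ?item ex:status ex:Available }
}
INSERT {
  GRAPH <inventory_graph> { ?item ex:status ex:Sold }
  GRAPH <orders_graph>    { ?order ex:includes ?item }
}
WHERE {
  GRAPH <pending_graph>   { ?order ex:requests ?item }
  GRAPH <inventory_graph> { ?item ex:status ex:Available }
}
```

Doing this as separate operations would leave a window where the order
includes an item that the inventory still lists as available.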

> An alternative is to put all the DELETEs together and then all the INSERTS.
>  Why would one want to have a later DELETE affect an earlier INSERT in the
> same operation?
>
> DELETE { all the deletes - multiple BGPs/GRAPH }
> INSERT { all the inserts - multiple BGPs/GRAPH }
> WHERE { }
>
> Whether it's GRAPH or INTO/FROM to direct to a specific graph is another
> issue. (GRAPH is the only proposal I've seen that uses the same word for
> INSERT and DELETE).

I'm OK with what you're saying here. (I didn't suggest it in the first
place, did I? I thought I was just tagging along for the ride on that
one).

> In the log of F2F2:
>
> PROPOSED: use GRAPH inside insert/delete templates instead of FROM/INTO
> (subject to approval from Update Editors.)
>
> (the "subject ..." were added to proposals because I asked that the update
> editors be in the loop).
>
> (these were not RESOLVED -- due to numbers present?)
>
> E.g. Steve's example. my indentation, above:

The more I look at it, the more that:
  INSERT { GRAPH <uri> {....}}
makes sense to me, even though I didn't like it to start with. My
initial objection was that it was diverging further from the familiar
ground of SQL, and also because it made it look very "quad" centric.
That bothered me because while SPARQL has always acknowledged quads,
they seem to be the syntactic exception, rather than the norm.

But now that GRAPH is proving to be more versatile, and avoids some of
these multiple INSERT/DELETE issues, I'm starting to lean that
way instead. I'm also satisfied that a WITH or USING at the top allows
the rest of the operation to look triple-centric.
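
A small sketch of what I have in mind, assuming WITH scopes both the
templates and the WHERE clause (as Andy suggested) and that GRAPH is
only needed for the one graph that differs:

```sparql
BASE <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# WITH names the working graph once, so the DELETE template and the
# WHERE pattern read as plain triples against <people_graph>.
WITH <people_graph>
DELETE { ?person foaf:mbox ?email }
# GRAPH appears only where the target is a different graph.
INSERT { GRAPH <email_graph> { ?person foaf:mbox ?email } }
WHERE  { ?person foaf:mbox ?email }
```

This moves every mbox triple out of <people_graph> and into
<email_graph>, with only one quad-looking block in the whole operation.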

>> Wouldn't
>>
>> DELETE {
>>   GRAPH <a> { ?person foaf:mbox ?email }
>>   GRAPH <b> { ?person foaf:mbox ?email }
>>   GRAPH <c> { ?person foaf:mbox ?email }
>> }
>> INSERT { ?person foaf:mbox ?email }
>> WHERE {
> ...
>> do the job?

As I said, I had my head in the wrong place when first looking at
these, and viewed GRAPH as a simple syntactic variant of INTO or FROM.
So yes, you're right, and I need to re-write the examples.
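
As a first stab at that, the single-source, multiple-destination case
might collapse to one INSERT block. This is only a sketch, assuming
GRAPH is allowed inside the template and that a template triple with an
unbound variable is skipped as in CONSTRUCT:

```sparql
BASE <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# One INSERT, two destination graphs, no ";" between directives.
INSERT {
  GRAPH <email_graph> { ?person foaf:mbox ?email }
  GRAPH <name_graph>  { ?person foaf:name ?name }
}
WHERE {
  GRAPH <people_graph> {
    ?person a foaf:Person
    OPTIONAL { ?person foaf:mbox ?email }
    OPTIONAL { ?person foaf:name ?name }
  }
}
```

Under that assumption a person with a name but no mbox still gets the
<name_graph> triple, and only the <email_graph> triple is dropped.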

Paul
Received on Thursday, 3 December 2009 17:13:20 UTC