Re: DELETE and blank nodes

I agree with Lee; the last example with those two queries is just too
strange.

This way we could leave it to a future (braver, or more widely  
chartered) WG to sort out the bnode tangle.

- Steve

Sent on the move.

On 3 Mar 2010, at 06:16, Lee Feigenbaum <lee@thefigtrees.net> wrote:

> This email is in reference to
> <http://www.w3.org/2009/sparql/track/actions/201>. It summarizes the
> discussion of blank nodes in DELETE and concludes with what I see as
> two reasonable proposals. Skip to the end if all you are interested in
> are the proposals.
>
> == The Problem ==
>
> The question under consideration is: What are the semantics of blank  
> nodes that appear within the template part of a DELETE statement?
>
> == Background ==
>
> === Blank nodes in query patterns ===
>
> In SPARQL 1.0 query patterns, a blank node acts as a
> non-distinguished variable. It binds to graph terms just like variables,
> but it can't be projected, filtered, sorted, etc. Blank nodes with a
> given label (e.g. _:b123) can only be used within a single basic
> graph pattern (BGP) (to do otherwise results in an invalid query
> string).
>
> === Blank nodes in CONSTRUCT templates ===
>
> Blank nodes in the template of CONSTRUCT queries emit blank nodes  
> into the result RDF. The CONSTRUCT template preserves blank node  
> label coreferences within the RDF generated from a single solution,  
> but emits new blank nodes for each solution to which the template is  
> applied.
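
A minimal sketch of the CONSTRUCT behavior described above, assuming a toy model in which triples are plain tuples of strings, variables start with "?" and blank node labels start with "_:". The function name and the generated "_:gen" labels are made up for illustration, and every variable in the template is assumed to be bound.

```python
import itertools

_counter = itertools.count()  # source of fresh blank node labels (illustrative)

def construct(template, solutions):
    """Instantiate the template once per solution, minting new blank nodes."""
    out = []
    for solution in solutions:
        fresh = {}  # per-solution map: template label -> newly minted label
        for triple in template:
            terms = []
            for t in triple:
                if t.startswith("_:"):
                    # Same label within one solution -> same new blank node.
                    if t not in fresh:
                        fresh[t] = f"_:gen{next(_counter)}"
                    t = fresh[t]
                elif t.startswith("?"):
                    t = solution[t]  # assumes the variable is bound
                terms.append(t)
            out.append(tuple(terms))
    return out

template = [("_:x", ":name", "?n"), ("_:x", ":type", ":Person")]
triples = construct(template, [{"?n": '"Ann"'}, {"?n": '"Bob"'}])
```

The two triples produced from the first solution share one blank node, and the two produced from the second solution share a different one.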
>
> === Blank nodes in INSERT templates ===
>
> Blank nodes in INSERT statement templates are as in CONSTRUCT  
> templates. The SPARQL 1.1 Update draft currently says "The template
> and pattern forms are as defined in SPARQL for construct templates  
> and graph patterns."
>
> === Behavior of DELETE ===
>
> The full form of DELETE is
>
>  DELETE { template } WHERE { pattern }
>
> The semantics are that pattern is matched against the graph store,  
> yielding a solution set (a set of solutions; each solution is a set  
> of variable bindings; each variable binding is a pairing of a  
> variable plus an RDF term). Each solution in the solution set is  
> then inserted into the template in turn, resulting in ground  
> triples. Each of these ground triples is removed from the graph  
> store (either from the graph store's single "unnamed graph", or from
> the graph specified in the WITH clause, or from the graph specified
> inline with a GRAPH clause).
>
> In particular, for our purposes, note that:
>
>  DELETE { :a :ground :triple } WHERE { }
>
> ...deletes the given triple. The solution set for the empty WHERE  
> clause is one solution with no bindings - that solution gets applied  
> to the template which yields the ground triple which is removed from  
> the unnamed graph.
>
> And note that:
>
>  DELETE { ?unbound :p :o } WHERE { }
>
> ...doesn't remove anything at all. ?unbound is (as its name implies)  
> not bound in the single solution; when this solution is applied to  
> the template, we get an invalid triple (because of the unbound  
> variable), and nothing is removed. (The current editor's draft does  
> not spell this out, but this is the analogous behavior to CONSTRUCT  
> templates, and I assume we have consensus around this.)
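
The two cases above can be sketched roughly as follows. This is a toy model, not anything from the draft: triples are tuples of strings, variables start with "?", pattern matching is not modeled at all (the solution set is passed in directly), and the function names are made up.

```python
def instantiate(template, solution):
    """Apply one solution to a template; drop triples with unbound variables."""
    ground = []
    for triple in template:
        terms = [solution.get(t) if t.startswith("?") else t for t in triple]
        if None not in terms:  # an unbound variable makes the triple invalid
            ground.append(tuple(terms))
    return ground

def delete(store, template, solutions):
    """Remove every ground triple obtained from every solution."""
    result = set(store)
    for solution in solutions:
        result -= set(instantiate(template, solution))
    return result

store = {(":a", ":ground", ":triple"), (":s", ":p", ":o")}
empty_where = [{}]  # the empty pattern: one solution with no bindings

after_ground = delete(store, [(":a", ":ground", ":triple")], empty_where)
after_unbound = delete(store, [("?unbound", ":p", ":o")], empty_where)
```

Here after_ground no longer contains the ground triple, while after_unbound is the same as the original store.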
>
> We also have the shortcut form:
>
>  DELETE WHERE { limited-pattern }
>
> limited-pattern can have GRAPH clauses and triple patterns. The  
> draft doesn't yet spell this out, but I believe the current  
> understanding is that this is purely syntactic sugar, as in:
>
>  DELETE WHERE { X } === DELETE { X } WHERE { X }
>
> So, DELETE WHERE { ?s :p :o } is equivalent to
>
>  DELETE { ?s :p :o } WHERE { ?s :p :o }
>
> == The Options for Blank Nodes in Delete Templates ==
>
> All of which brings us to the topic at hand.
>
> What does a blank node mean in a delete template? At the  
> teleconference, we discussed three options. Here they are, with  
> analysis gleaned from the discussion:
>
> 1/ Blank nodes are not allowed in DELETE templates. The syntax for  
> DELETE would prohibit blank nodes from appearing in DELETE  
> templates. Most queries involving blank nodes can be written with  
> regular variables instead, and this option avoids the potential  
> confusion of the other options.
>
> 2/ Blank nodes are treated as in CONSTRUCT (and INSERT) templates.  
> In this case, a blank node in a DELETE template becomes (in the  
> ground triples) a newly minted blank node for each solution that is  
> applied to the template. Because the blank node is newly minted, it  
> does not occur in the graph store at all. The practical effect of  
> this is that a triple in the DELETE template that contains a blank  
> node would _never_ lead to _anything_ being deleted. It also means  
> that the shortcut form:
>
>  DELETE WHERE { _:b1 :p :o }
>
> ...if treated as pure syntactic sugar for:
>
>  DELETE { _:b1 :p :o } WHERE { _:b1 :p :o }
>
> ...would stand for 2 very different meanings of _:b1, which at best  
> is really confusing. At the teleconference, there did not seem to be  
> any support for this option, and I can't see any useful benefits  
> that it has other than formal consistency with CONSTRUCT and INSERT.
>
> 3/ Blank nodes in delete templates are treated similarly to blank
> nodes in query patterns--i.e. as (non-distinguished) variables. The  
> *intent* of this option is that:
>
>  DELETE { _:b1 :p :o } WHERE { }
>
> should delete _all_ triples with predicate :p and object :o. Sandro  
> gave a motivating use case for this interpretation. It would provide  
> the only reasonable way to delete an RDF list:
>
>  DELETE { ?x :hasList (1 2 3) } WHERE { ... ?x ... }
>
> (1 2 3) is syntactic sugar for an expansion involving blank nodes.  
> If those blank nodes are treated as variables, then this would  
> delete all the triples that make up the list.
>
> We had trouble writing down the precise meaning of a blank node  
> here. It's *not* just that blank nodes are the same as variables,  
> because:
>
>  DELETE { _:b1 :p :o } WHERE { }
>
> would delete all the <something> :p :o triples whereas
>
>  DELETE { ?b1 :p :o } WHERE { }
>
> would delete nothing (because ?b1 is unbound). Effectively, blank
> nodes in the template act as a way to do pattern matching, variable
> binding, and triple deletion all in one operation, instead of the
> normal multi-phase approach.
>
> We also immediately noted confusion if the same blank node label was  
> used in the query pattern and the template:
>
>  DELETE { _:b1 :p :o } WHERE { :foo :bar _:b1 }
>
> ...the _:b1 in the WHERE clause acts as a non-distinguished variable  
> whose bindings don't contribute to the solution set, so the _:b1 in  
> the template has to be something different altogether. This is  
> rather bizarre, so we discussed 2 remedies:
>
>  A) Prohibit the same blank node label from being used in 2  
> different BGPs, or in 1 BGP and in the template. This is basically  
> the same restriction that SPARQL Query puts on blank nodes.
>
>  B) Prohibit named blank nodes completely. In the end I don't think  
> we saw much reason to prefer this to approach A.
>
> Note that the effect of A (or B) is that the shortcut:
>
>  DELETE WHERE { _:b1 :p :o }
>
> is an illegal query.
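
To make the difference between options 2 and 3 (and the contrast with ?b1) concrete, here is a rough sketch. It assumes a toy model: triples are tuples of strings, the "fresh" and "wildcard" mode names are invented for this example, and the wildcard reading ignores coreference of a label across several template triples, which a real definition would have to address.

```python
WILDCARD = object()  # option 3: stands for "matches any RDF term"

def ground(triple, solution, mode, minted):
    """Instantiate one template triple; None if a variable is unbound."""
    terms = []
    for t in triple:
        if t.startswith("?"):
            t = solution.get(t)
            if t is None:
                return None
        elif t.startswith("_:"):
            if mode == "wildcard":
                t = WILDCARD
            else:
                # Option 2: a new node per label per solution. A bare
                # object() can never equal a term already in the store.
                if t not in minted:
                    minted[t] = object()
                t = minted[t]
        terms.append(t)
    return terms

def delete(store, template, solutions, mode):
    result = set(store)
    for solution in solutions:
        minted = {}
        for triple in template:
            terms = ground(triple, solution, mode, minted)
            if terms is None:
                continue  # invalid triple, nothing to remove
            result = {s for s in result
                      if not all(t is WILDCARD or t == term
                                 for t, term in zip(terms, s))}
    return result

store = {(":x", ":p", ":o"), ("_:stored", ":p", ":o"), (":x", ":q", ":o")}
option2 = delete(store, [("_:b1", ":p", ":o")], [{}], "fresh")
option3 = delete(store, [("_:b1", ":p", ":o")], [{}], "wildcard")
variable = delete(store, [("?b1", ":p", ":o")], [{}], "wildcard")
```

Under the "fresh" reading nothing is removed, under the "wildcard" reading both :p :o triples go, and the ?b1 version removes nothing in either mode.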
>
> == The Proposals ==
>
> I see only two realistic proposals emerging from this.
>
> 1/ We prohibit blank nodes in the DELETE template completely.
>
> 2/ Blank nodes in DELETE templates act as "wild cards"--effectively  
> variables pre-bound to all RDF terms--to let us write some shortcuts  
> and handle Sandro's case of deleting RDF lists. We prohibit the same  
> blank node label from being used in multiple scopes.
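
The scope restriction in the second proposal could be checked along these lines. This is only a sketch under assumptions: the pattern is modeled as a single BGP (a list of triple tuples), so the "two different BGPs" half of the restriction is not covered, and all function names are hypothetical.

```python
def bnode_labels(triples):
    """Collect the blank node labels used in a list of triples."""
    return {t for triple in triples for t in triple if t.startswith("_:")}

def desugar_delete_where(pattern):
    """Read DELETE WHERE { X } as DELETE { X } WHERE { X }."""
    return list(pattern), list(pattern)

def shares_label(template, pattern):
    """True if some blank node label appears in both template and pattern."""
    return bool(bnode_labels(template) & bnode_labels(pattern))

# DELETE WHERE { _:b1 :p :o } expands to a template and a pattern that
# share _:b1, so the restriction rejects it.
template, pattern = desugar_delete_where([("_:b1", ":p", ":o")])
shortcut_rejected = shares_label(template, pattern)

# A blank node used only in the WHERE pattern passes the check.
where_only_rejected = shares_label([("?s", ":p", ":o")],
                                   [("?s", ":p", "_:b1")])
```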
>
>
>
> == My Opinion ==
>
> While I'm sympathetic to Sandro's use case, I'm frightened of the  
> fact that:
>
>  DELETE { _:b1 :p :o } WHERE { }
> and
>  DELETE { ?b1 :p :o } WHERE { }
>
> do dramatically different things. Because of this, I'd rather we go  
> with the first proposal and prohibit blank nodes in the DELETE  
> template entirely.
>
> hope this is helpful,
> Lee

Received on Wednesday, 3 March 2010 08:16:38 UTC