Re: Bnodes in DELETE templates (was: SPARQL Update 1.1 review part1)

On 23 February 2011 17:19, Axel Polleres <axel.polleres@deri.org> wrote:
> Hi Birte,
>
> thanks for your thoughts, some questions inline!
>
> On 23 Feb 2011, at 16:45, Birte Glimm wrote:
>
>> Hm, so far I had imagined a relatively simple definition, which is, we
>> evaluate the WHERE clause to get a solution sequence. We apply each
>> solution to the template, throwing away any instantiations that are
>> not RDF (e.g., literal in subj position or unbound var). Now let G be
>> the graph to which the delete applies. Each subgraph G' of G such that
>> an instantiated template is *an instance* of G' is then removed from
>> G.
>
> This sounds to me though that then bnodes from the data would
> then also be treated as wildcards, wouldn't they?

I don't think so, you map the template onto the graph with bnodes in
the templates being wildcards.

>> That's it. That's how I understood bnodes act as wild-cards. This
>> might be a bit more complicated with named graphs and quads, but still
>> the idea seems quite straightforward.
>>
>> If you want to imlement it via rewriting that's fine. Instead of
>> finding instances (which requires computing mappings for the bnodes in
>> the template), you can then just delete syntactically equal triples,
>> but I don't see why this has to be defined in the algebra or as some
>> for of rewriting algorithm.
>
> the question is how to define it then? (the rewriting is not optimal, I agree)

I haven't read the update spec, so maybe I cannot judge properly what
is really needed there. I would imagine something along the lies of
normal BGP evaluation, but only requiring an RDF instance mapping
instead of a solution mapping and an RDF instance mapping. But that
obviously implies (as Andy also mentioned) that you do some kind of
BGP evaluation to find the triples that are to be deleted and would
have to think in more detail how that could properly be defined (but I
don't have time to do that).

>> I think that is even dangerous since it
>> assumes that the bnodes that I return as bindings are really those
>> from the graph, but I am in no other place of the spec oblidged to do
>> that. If you don't want to implement via rewriting, you can (have to)
>> find mappings for the bnodes in the instantiated template to find the
>> triples that are to be deleted.
>
> if we assume the scoping graph to be identical to the graph here, then bnodes
> from the data are actually to be treated as constants.

Yes, if we make that assumption.

>> The only problem I can see is that we have to decide whether to delete
>> "iteratively" (solution by solution), which seems not to be good:
>> E.g.,
>> data:
>> :a :b :c1 .
>> :a :b :c2 .
>> a :d :e .
>> :a :f :c1 .
>> :a :f :c2 .
>> :a :g :e .
>> query
>> DELETE { :a :f ?c . :a :g ?e } WHERE { :a :b ?c . :a :d ?e }
>> evaluating the query pattern gives two solutions
>> ?c/:c1, ?e/:e and ?c/:c2, ?e/:e
>> If now instantiate the template with the first solution and
>> immediately remove that graph equivalent triples, we are left with
>> data':
>> :a :b :c1 .
>> :a :b :c2 .
>> a :d :e .
>> :a :f :c2 .
>> If we now instantiate the templete with the second solution to
>> :a :f :c2 . :a :g :e
>> There is no longer a matching subgraph left. Depending on the order, I
>> could then end up with different graphs, which is not good at all. One
>> would have to first find all matching triples and then remove them,
>> but this is not really related to the bnode issue.
>
> my understanding is that you don't remove per solution and then match again, but per the set of all pattern solutions.
> (I understand we agree here)
> that's what the function Dataset(modify_template,P) wanted to say, which needs to be defined better.

Ok.

>
>>
>> I can also live with omitting triples with unbound variables, as Steve
>> suggested, but I am not fond of prescribing this quite ugly rewriting.
>> If I have to choose a rewriting, then it would be option 1, which to
>> me seems just a way of implementing instance finding.
>>
>> I also comment inline on your answer below...
>>
>> On 22 February 2011 22:26, Axel Polleres <axel.polleres@deri.org> wrote:
>> > Hi Birte,
>> > On 22 Feb 2011, at 18:12, Birte Glimm wrote:
>> >
>> >> On 22 February 2011 03:53, Paul Gearon <gearon@ieee.org> wrote:
>> >> [snip]
>> >>
>> >> > DELETE
>> >> >  { _:a :p 12 .
>> >> >   _:a :q ?o .
>> >> >  }
>> >> > WHERE {?s :r ?q OPTIONAL { ?q :s ?o } }
>> >> >
>> >> > So when that is applied to the following data:
>> >> >
>> >> > :x :p 12 .
>> >> > :x :r :z .
>> >> > :x :q 10 .
>> >> >
>> >> > The model you've proposed will certainly match the first triple in the
>> >> > template for removal, but the ?o will be unbound, so what happens to
>> >> > that part of the template? Does it get ignored, or does it prevent the
>> >> > entire template from matching? (The latter is more in line with
>> >> > CONSTRUCT and INSERT, but doesn't really work well with the proposed
>> >> > formal model).
>> >>
>> >> In Lee's original discussion of the bnode issue
>> >> (http://lists.w3.org/Archives/Public/public-rdf-dawg/2010JanMar/0428.html),
>> >> he writes:
>> >> "And note that:
>> >>    DELETE { ?unbound :p :o } WHERE { }
>> >> ...doesn't remove anything at all. ?unbound is (as its name implies) not
>> >> bound in the single solution; when this solution is applied to the
>> >> template, we get an invalid triple (because of the unbound variable),
>> >> and nothing is removed. (The current editor's draft does not spell this
>> >> out, but this is the analogous behavior to CONSTRUCT templates, and I
>> >> assume we have consensus around this.)"
>> >>
>> >> Unless this has changed, also the above example shouldn't remove
>> >> anything because ?o is unbound.
>> >
>> > For the records,
>> >
>> > version 1 (called option1 in today's TC) of the semantics wouldn't delete anything, on the triple
>> >  _:a :q ?o.
>> > in the query above, but it would delete something on the triple
>> >  _:a :p 12 .
>> >
>> > version 2 (called option1 in today's TC, the one where the DELETE template is copied in the body) though, would match something
>> >  on both triples in the template, which sounds like something you'd object against, so?
>>
>> I guess here you mean "called option****2**** in today's TC". And,
>> yes, I don't like that.
>> >
>> > version 3 (called option 3 in today's TC, i.e. treating bnodes the same in DELETE as in CONSTRUCT, which would obviously not DELETE anything,
>> > since to-be-deleted triples with new bnodes don't appear in GS)
>>
>> This seems a bit pointless. You could then also just disallow bnodes
>> in the template.
>
> That would change the grammar though, and is restricting something which we don't have to restrict.

Yes, but otherwise you might wonder why nothing is being deleted, but
I guess you just have to learn that bnodes in the template won't get
you anywhere.

>>
>> > BTW, are there other alternatives which we didn't mention so far?
>>
>> I am ok with the effect of option 1, but ouldn't prescribe this
>> particular implementation technique as argued above.
>
> ok, I'd be more than happy with an alternative definition,
> proposals welcome (probably won't have bandwidth to work it out myself in the next two weeks, unfortunatelty).

Unfortunately I won't have the time either, but at least now I
understand better what issues arise.

Birte

>
> Thanks for the input!!!!
>
> Axel
>
>>
>> Birte
>>
>> >>
>> >> However, Lee also wrote
>> >> "=== Behavior of DELETE ===
>> >> The full form of DELETE is
>> >>    DELETE { template } WHERE { pattern }
>> >> The semantics are that pattern is matched against the graph store,
>> >> yielding a solution set (a set of solutions; each solution is a set of
>> >> variable bindings; each variable binding is a pairing of a variable plus
>> >> an RDF term). Each solution in the solution set is then inserted into
>> >> the template in turn, resulting in ground triples. Each of these ground
>> >> triples is removed from the graph store (either from the graph store's
>> >> single "unnamed graph", or from the graph specified in the WITH clause,
>> >> or from the graph specified inline with a GRAPH clause."
>> >>
>> >> In particular "resulting in ground triples" is just not true if the
>> >> template has bnodes because the bnodes are not changed by applying the
>> >> solution mapping and any triple with bnodes in it is not ground, see
>> >> RDF Semantics http://www.w3.org/TR/rdf-mt/#graphdefs Section 0.3: "A
>> >> ground RDF graph is one with no blank nodes."
>> >>
>> >> If we really would only delete ground triples, then we won't have
>> >> problems and I have to say that I am very inclined to suggest sticking
>> >> to that, i.e., if applying the solution mapping does not yield ground
>> >> triples, nothing is being deleted.
>> >>
>> >
>> > Lee to answer, but I assume that he didn't consider bnodes coming from the bindings in "ground" here.
>> > I assume that, because I think we want to be able to delete triples with bnodes (otherwise, any triple with a bnode can only be removed by
>> > dropping/clearing the graph. Anyways, this seems indeed an issue also with regards to Andy's previous mail... i.e. the issue around
>> > what is the scoping graph for matching the WHERE part in update requests... It seems we want to be able to
>> > DELETE triples with bnodes, with the current definition of scoping graph from Query, see
>> >  http://lists.w3.org/Archives/Public/public-rdf-dawg/2011JanMar/0328.html
>> > this is probably not possible, so we'd need to define the scoping graph differently (preserving bnode labels?) for Update.
>> >
>> > Axel
>> >
>> >> This has the disadvantage that deleting lists is not that easy, but
>> >> apart from that no complicated rewritings and query pattern extensions
>> >> are required.
>> >
>> >> Birte
>> >>
>> >> > Is each part of the template treated as if it is a separate thing to
>> >> > be deleted, or does the re-use of the same blank node label (_:a in
>> >> > this case) have some kind of effect?
>> >> >
>> >> > Regards,
>> >> > Paul
>> >> >
>> >> > On Mon, Feb 21, 2011 at 3:10 AM, Axel Polleres <axel.polleres@deri.org> wrote:
>> >> >> Thanks Greg for spotting this! Actually, there were two mistakes in the P_B
>> >> >> pattern below...  see inline.
>> >> >>
>> >> >>
>> >> >> On 20 Feb 2011, at 23:40, Gregory Williams wrote:
>> >> >>
>> >> >> On Feb 18, 2011, at 1:31 PM, Axel Polleres wrote:
>> >> >>
>> >> >>> For any unnamed bnode _:B in a modify_template_DEL
>> >> >>> (i) new variables ?Var_B ?Var_B1 ?Var_B2 ?Var_Bg are introduced,
>> >> >>> (ii) _:B is replaced by ?Var_B in a modify_template_DEL,
>> >> >>> (iii) Pattern P is replaced by P_B such that
>> >> >>>
>> >> >>>  P_B =  { P } UNION {
>> >> >>>          SELECT DISTINCT ?Var_B
>> >> >>>            {  { ?Var_B ?Var_B1 ?Var_B2 } UNION
>> >> >>>               { ?Var_B1 ?Var_B ?Var_B2 } UNION
>> >> >>>               { ?Var_B1 ?Var_B2 ?Var_B } UNION
>> >> >>>               { GRAPH ?Var_Bg {?Var_B1 ?Var_B2 ?Var_B } } UNION
>> >> >>>               { GRAPH ?Var_Bg {?D1 ?Var_B2 ?Var_B } } UNION
>> >> >>>               { GRAPH ?Var_Bg {?Var_B1 ?Var_B2 ?Var_B } } }
>> >> >>>
>> >> >>> That is, ?Var_B binds to all possible values in the vocabulary of GS.
>> >> >>
>> >> >> I'm not totally swapped in on this issue, but I don't understand how this
>> >> >> pattern aligns with the description. In all three named graph union
>> >> >> branches, ?Var_B appears in the object position. Since ?Var_B is the only
>> >> >> projected variable, don't these three branches return the same results?
>> >> >>
>> >> >> yes, that was a copy-paste error from an earlier version I had. The correct
>> >> >> one should be
>> >> >>     P_B =  { { P } { Q } }
>> >> >> i.e. join, not UNION (thanks Andy, for pointing me to that offlist), where Q
>> >> >> is defined as below...
>> >> >>
>> >> >>> I know that the definition of P_B doesn't look very nice, but this
>> >> >>> definition should cover the intended semantics of [2].
>> >> >>> In principle, the idea is that bnodes, that should behave as wildcard
>> >> >>> should bind to *any* element in the signature of GS,
>> >> >>> which is what is returned by the subquery
>> >> >>>        Q = SELECT DISTINCT ?D
>> >> >>>            {  { ?D ?D1 ?D2 } UNION
>> >> >>>               { ?D1 ?D ?D2 } UNION
>> >> >>>               { ?D1 ?D2 ?D } UNION
>> >> >>>               { GRAPH ?Dg {?D1 ?D2 ?D } } UNION
>> >> >>>               { GRAPH ?Dg {?D1 ?D2 ?D } } UNION
>> >> >>>               { GRAPH ?Dg {?D1 ?D2 ?D } } }
>> >> >>
>> >> >> ...corrected version of Q:
>> >> >>
>> >> >>         Q = SELECT DISTINCT ?Var_B
>> >> >>
>> >> >>             {  { ?Var_B ?Var_B1 ?Var_B2 } UNION
>> >> >>
>> >> >>                { ?Var_B1 ?Var_B ?Var_B2 } UNION
>> >> >>
>> >> >>                { ?Var_B1 ?Var_B2 ?Var_B } UNION
>> >> >>
>> >> >>                { GRAPH ?Var_Bg {?Var_B ?Var_B1 ?Var_B2 } } UNION
>> >> >>
>> >> >>                { GRAPH ?Var_Bg {?Var_B1 ?Var_B ?Var_B2 } } UNION
>> >> >>                { GRAPH ?Var_Bg {?Var_B1 ?Var_B2 ?Var_B } } }
>> >> >>
>> >> >> Even more obvious in this one. Am I missing something?
>> >> >>
>> >> >>
>> >> >> Hope I got it right now ;-)
>> >> >>
>> >> >> Thanks,
>> >> >> Axel
>> >> >>
>> >> >> thanks,
>> >> >> .greg
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Dr. Birte Glimm, Room 309
>> >> Computing Laboratory
>> >> Parks Road
>> >> Oxford
>> >> OX1 3QD
>> >> United Kingdom
>> >> +44 (0)1865 283520
>> >>
>> >
>> >
>> >
>>
>>
>>
>> --
>> Dr. Birte Glimm, Room 309
>> Computing Laboratory
>> Parks Road
>> Oxford
>> OX1 3QD
>> United Kingdom
>> +44 (0)1865 283520
>>
>
>



-- 
Dr. Birte Glimm, Room 309
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283520

Received on Thursday, 24 February 2011 21:12:20 UTC