
Re: Negation decision : unexpected effects

From: Andy Seaborne <andy.seaborne@talis.com>
Date: Mon, 12 Apr 2010 13:31:35 +0100
Message-ID: <4BC312A7.1000406@talis.com>
To: Lee Feigenbaum <lee@thefigtrees.net>
CC: SPARQL Working Group <public-rdf-dawg@w3.org>


On 06/04/2010 9:15 PM, Lee Feigenbaum wrote:
> On 4/5/2010 4:42 PM, Andy Seaborne wrote:
>>
>>
>> On 05/04/2010 3:53 AM, Lee Feigenbaum wrote:
>>> On 4/2/2010 5:05 PM, Andy Seaborne wrote:
>>>> The decision at F2F3 to have just the form of NOT EXISTs in an explicit
>>>> FILTER has a limitation. I should have realised at the time but it
>>>> didn't occur to me until after the meeting.
>>>>
>>>> FILTERs get moved to the end of the BGP during translation from syntax
>>>> to algebra. The form without the word "FILTER" does not move e.g.
>>>>
>>>> { ?s rdf:type :T
>>>> NOT EXISTS { ?s :p ?v . }
>>>> ?s :q ?v
>>>> }
>>>>
>>>> then NOT EXISTS is not moved about by the FILTER placement rules.
>>>
>>> Yes, this is a difference. In most cases it doesn't matter, though,
>>> right? I'd like to understand better the cases in which moving/not
>>> moving
>>> the NOT EXISTS changes the answers.
>>
>> I am more concerned that it can change the answers in strange ways.
>> Because it's negation, I think the effects will be particularly strange.
>>
>> I did find an old email:
>>
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0468.html
>>
>> :-)
>
> Wow, the more things change, the more they stay the same, eh? :)
>
> So based on that mail, here is an example of the scoping issue here:
>
> Data:
>
> :Lee a foaf:Person ; :hairColor "brown" .
> :OtherLee a foaf:Person ; :hairColor "blond" . # when I was much younger
>
> Query 1A:
>
> SELECT * {
> ?s a foaf:Person .
> FILTER (NOT EXISTS { ?s :hairColor "brown" })
> }
>
> Query 1B:
>
> SELECT * {
> ?s a foaf:Person .
> NOT EXISTS { ?s :hairColor "brown" }
> }
>
> This has the same answer in both cases:
>
> { { (?s, :OtherLee) } }

Agree:

-------------
| s         |
=============
| :OtherLee |
-------------

>
> Query 2A:
>
> SELECT * {
> FILTER (NOT EXISTS { ?s :hairColor "brown" })
> ?s a foaf:Person .
> }
>
> Query 2B:
>
> SELECT * {
> NOT EXISTS { ?s :hairColor "brown" }
> ?s a foaf:Person .
> }
>
> 2A is the same as 1A because FILTERs execute at the end of the group.

Agreed.

>
> 2B, however, is algebraically something like:
>
> Join(NotExists(BGP(), BGP(?s :hairColor "brown")), BGP(?s a foaf:Person))
>
> BGP() - evaluates to the identity solution set - one row with no bindings:
> { { } }
>
> BGP(?s :hairColor "brown") evaluates to a non-empty solution set. Since
> the solutions in that solution set are compatible with the empty
> solution (the one solution in the identity solution set), this evaluates
> to no answers

There's nothing about compatibility in the definition of NOT EXISTS
(although a semi-join can be used to evaluate it).

> so you have Join({}, BGP(?s a foaf:Person)) which is empty (no solutions).


(join
   (filter (! (exists (bgp (triple ?s :hairColor "brown"))))
       (table unit))
   (bgp (triple ?s rdf:type foaf:Person)))

(filter ... ) evaluates NOT EXISTS { ?s :hairColor "brown" } over the unit
table, where ?s is not yet bound. The pattern does match the data, so
NOT EXISTS is false.

The filter therefore yields no rows, so the join yields no rows.
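
As an illustration only, here is a minimal Python sketch of that evaluation. The data is the two-person example from Lee's message written as plain strings; the helper names (match, not_exists, join_bgp, query_1b, query_2b) are hypothetical and not part of any SPARQL engine. It assumes a single-triple BGP, so joining with the BGP is done by extending each incoming row.

```python
# Toy data from the example; terms are plain strings, not real IRIs.
DATA = [
    (":Lee", "rdf:type", "foaf:Person"),
    (":Lee", ":hairColor", "brown"),
    (":OtherLee", "rdf:type", "foaf:Person"),
    (":OtherLee", ":hairColor", "blond"),
]

def match(pattern, binding):
    """Return every extension of `binding` under which `pattern` matches DATA."""
    rows = []
    for triple in DATA:
        row = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if row.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            rows.append(row)
    return rows

def not_exists(pattern, binding):
    """True when the pattern has no match given the current bindings."""
    return not match(pattern, binding)

def join_bgp(left, pattern):
    """Join a solution sequence with a single-triple BGP (assumption: one triple)."""
    return [row for b in left for row in match(pattern, b)]

BROWN = ("?s", ":hairColor", "brown")
PERSON = ("?s", "rdf:type", "foaf:Person")
UNIT = [{}]  # (table unit): one row with no bindings

def query_1b():
    # NOT EXISTS placed after the person pattern: ?s is bound when it is tested.
    people = join_bgp(UNIT, PERSON)
    return [b for b in people if not_exists(BROWN, b)]

def query_2b():
    # NOT EXISTS placed first: it is tested against the unit row, ?s unbound.
    kept = [b for b in UNIT if not_exists(BROWN, b)]
    return join_bgp(kept, PERSON)

print(query_1b())
print(query_2b())
```

In query_2b the filter sees only the empty row, so the test is whether anyone at all has brown hair, and the single unit row is dropped before the join ever runs.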

> I believe this difference only occurs in the case of what the Chileans
> call not-well-formed queries - a variable occurs inside the NOT EXISTS,
> but not on the left-hand side of the NOT EXISTS, and also occurs
> elsewhere in the query. (Same as with procedural vs. compositional
> OPTIONAL.)

Confused about the reference to not-well-formed queries: they are to do
with doubly nested optionals where a variable appears in the LHS of the
outer OPTIONAL and the RHS of the inner OPTIONAL but not in between.

Here the pattern is a variable that appears in the NOT EXISTS and is
used later.

> But what Steve was saying on today's TC (and I think I agree with) is
> ... so what? Are there any useful cases where we need the behavior of
> Query 2B? I don't know of any.

1: the syntax is shorter and more convenient to use.
2: OPTIONAL/!BOUND is already out there, so making the transition as
easy as possible seems sensible to me.

>
> ...
>
> What would this look like with OPTIONAL/!BOUND?
>
> Query 3A:
>
> SELECT * {
> ?s a foaf:Person .
> OPTIONAL { ?s :hairColor ?color . FILTER(?color = "brown") }
> FILTER(!bound(?color))
> }
>
> or something like that. This gives the one result from 1A, 1B, and 2A.

---------------------
| s         | color |
=====================
| :OtherLee |       |
---------------------

>
> Query 3B:
>
> SELECT * {
> OPTIONAL { ?s :hairColor ?color . FILTER(?color = "brown") }
> ?s a foaf:Person .
> FILTER(!bound(?color))
> }
>
> ...this gives no results, a la Query 2B - but, again, is this useful to
> anyone? I've never seen anyone use OPTIONAL/!bound in this way.

-------------
| s | color |
=============
-------------
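
A rough Python sketch of why 3A and 3B differ, assuming the usual translation in which the FILTER inside OPTIONAL becomes the left-join condition. The helpers (bgp, compatible, join, left_join) are hypothetical names for illustration, and the data is the same toy example as plain strings.

```python
# Toy data from the example; terms are plain strings, not real IRIs.
DATA = [
    (":Lee", "rdf:type", "foaf:Person"),
    (":Lee", ":hairColor", "brown"),
    (":OtherLee", "rdf:type", "foaf:Person"),
    (":OtherLee", ":hairColor", "blond"),
]

def bgp(pattern):
    """Solutions of a single triple pattern against DATA."""
    out = []
    for triple in DATA:
        row = {}
        if all(row.setdefault(t, v) == v if t.startswith("?") else t == v
               for t, v in zip(pattern, triple)):
            out.append(row)
    return out

def compatible(a, b):
    """Two solutions are compatible when shared variables have equal values."""
    return all(b[k] == v for k, v in a.items() if k in b)

def join(left, right):
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

def left_join(left, right, cond):
    """Keep each left row; extend it where a compatible right row passes cond."""
    out = []
    for a in left:
        ext = [{**a, **b} for b in right
               if compatible(a, b) and cond({**a, **b})]
        out.extend(ext or [a])
    return out

PERSON = ("?s", "rdf:type", "foaf:Person")
HAIR = ("?s", ":hairColor", "?color")
is_brown = lambda row: row["?color"] == "brown"
unbound_color = lambda row: "?color" not in row

def query_3a():
    # Person pattern first, then OPTIONAL, then FILTER(!bound(?color)).
    rows = left_join(bgp(PERSON), bgp(HAIR), is_brown)
    return [r for r in rows if unbound_color(r)]

def query_3b():
    # OPTIONAL first: it extends the unit row, binding ?s and ?color to Lee.
    rows = join(left_join([{}], bgp(HAIR), is_brown), bgp(PERSON))
    return [r for r in rows if unbound_color(r)]

print(query_3a())
print(query_3b())
```

In query_3b the OPTIONAL binds ?color before the person pattern is joined in, so no row survives the !bound test.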

OPTIONAL/!BOUND has been out for some time and is in use, so I would not
like to guess how it is or is not used.


> Just for completeness sake, what about:
>
> Query 4A:
>
> SELECT * {
> ?s a foaf:Person .
> MINUS { ?s :hairColor "brown" }
> }
>
> Query 4B:
>
> SELECT * {
> MINUS { ?s :hairColor "brown" }
> ?s a foaf:Person .
> }

-------------
| s         |
=============
| :OtherLee |
-------------

>
> 4A gives - I think - the same answers as 1A, 1B, 2A, and 3A.
>
> What does 4B do? I guess it's equivalent to
>
> identity solution - { { (?s, :Lee) }, { (?s, :OtherLee) } }

Agreed:

-------------
| s         |
=============
| :OtherLee |
| :Lee      |
-------------
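
The MINUS case can be sketched the same way. This assumes the extra condition Lee describes below: a left row is removed only by a compatible right row that shares at least one variable with it. The helper names (bgp, compatible, join, minus) are hypothetical, and the data is the same toy example.

```python
# Toy data from the example; terms are plain strings, not real IRIs.
DATA = [
    (":Lee", "rdf:type", "foaf:Person"),
    (":Lee", ":hairColor", "brown"),
    (":OtherLee", "rdf:type", "foaf:Person"),
    (":OtherLee", ":hairColor", "blond"),
]

def bgp(pattern):
    """Solutions of a single triple pattern against DATA."""
    out = []
    for triple in DATA:
        row = {}
        if all(row.setdefault(t, v) == v if t.startswith("?") else t == v
               for t, v in zip(pattern, triple)):
            out.append(row)
    return out

def compatible(a, b):
    """Two solutions are compatible when shared variables have equal values."""
    return all(b[k] == v for k, v in a.items() if k in b)

def join(left, right):
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

def minus(left, right):
    """Drop a left row only if some compatible right row shares a variable with it."""
    return [a for a in left
            if not any(compatible(a, b) and set(a) & set(b) for b in right)]

PERSON = ("?s", "rdf:type", "foaf:Person")
BROWN = ("?s", ":hairColor", "brown")

def query_4a():
    # Person pattern first: ?s is shared, so :Lee is removed.
    return minus(bgp(PERSON), bgp(BROWN))

def query_4b():
    # MINUS first: the unit row shares no variable with the RHS, so it survives.
    return join(minus([{}], bgp(BROWN)), bgp(PERSON))

print(query_4a())
print(query_4b())
```

Without the shared-variable condition, the unit row in query_4b would be compatible with every right-hand row and would be removed, giving no results instead of two.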

>
> ...because of the extra condition on MINUS, since the identity solution
> has no vars in common with the RHS, this doesn't remove the solution. So
> 4B has a different solution from everything else:
>
> ?s
> --
> :Lee
> :OtherLee
>
> Anyways. Does this help anything? I don't know.
>
> What I've heard is:
>
> AndyS: Wants to support doing the equivalent of Query 3B without
> requiring extra braces to get the scoping right.

and he also wants to provide the convenient syntax form.

We have a way of doing it - why choose only a more verbose form?

	Andy

> SteveH (& me): Doesn't think this is a particularly important issue.
>
> What do other people think? Did I even characterize this correctly?
>
> Lee
Received on Monday, 12 April 2010 12:32:13 GMT