Re: Negation decision : unexpected effects from Lee Feigenbaum on 2010-04-06 (public-rdf-dawg@w3.org from April to June 2010)

From: Lee Feigenbaum <lee@thefigtrees.net>
Date: Tue, 06 Apr 2010 16:15:16 -0400
To: Andy Seaborne <andy.seaborne@talis.com>
CC: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4BBB9654.4090401@thefigtrees.net>

On 4/5/2010 4:42 PM, Andy Seaborne wrote:
>
>
> On 05/04/2010 3:53 AM, Lee Feigenbaum wrote:
>> On 4/2/2010 5:05 PM, Andy Seaborne wrote:
>>> The decision at F2F3 to have just the form of NOT EXISTS in an explicit
>>> FILTER has a limitation. I should have realised at the time but it
>>> didn't occur to me until after the meeting.
>>>
>>> FILTERs get moved to the end of the BGP during translation from syntax
>>> to algebra. The form without the word "FILTER" does not move e.g.
>>>
>>> { ?s rdf:type :T
>>> NOT EXISTS { ?s :p ?v . }
>>> ?s :q ?v
>>> }
>>>
>>> then NOT EXISTS is not moved about by the FILTER placement rules.
>>
>> Yes, this is a difference. In most cases it doesn't matter, though,
>> right? I'd like to understand better the cases in which moving/not moving
>> the NOT EXISTS changes the answers.
>
> I am more concerned that it can change the answers in strange ways.
> Because it's negation, I think the effects will be particularly strange.
>
> I did find an old email:
>
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0468.html
>
> :-)

Wow, the more things change, the more they stay the same, eh? :)

So based on that mail, here is an example of the scoping issue here:

Data:

:Lee a foaf:Person ; :hairColor "brown" .
:OtherLee a foaf:Person ; :hairColor "blond" . # when I was much younger

Query 1A:

SELECT * {
   ?s a foaf:Person .
   FILTER (NOT EXISTS { ?s :hairColor "brown" })
}

Query 1B:

SELECT * {
   ?s a foaf:Person .
   NOT EXISTS { ?s :hairColor "brown" }
}

This has the same answer in both cases:

   { { (?s, :OtherLee) } }

Query 2A:

SELECT * {
   FILTER (NOT EXISTS { ?s :hairColor "brown" })
   ?s a foaf:Person .
}

Query 2B:

SELECT * {
   NOT EXISTS { ?s :hairColor "brown" }
   ?s a foaf:Person .
}

2A is the same as 1A because FILTERs execute at the end of the group.

2B, however, is algebraically something like:

Join(NotExists(BGP(), BGP(?s :hairColor "brown")), BGP(?s a foaf:Person))

BGP() - evaluates to the identity solution set - one row with no bindings:
   { { } }

BGP(?s :hairColor "brown") evaluates to a non-empty solution set. Since 
the solutions in that solution set are compatible with the empty 
solution (the one solution in the identity solution set), the NotExists 
removes that one solution and evaluates to no answers

so you have Join({}, BGP(?s a foaf:Person)) which is empty (no solutions).
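
A minimal sketch in Python of the 1B and 2B evaluations described above. It assumes a toy model in which a solution is a dict and a solution set is a list of dicts, with the two pattern results written out by hand from the Data. The helper names (compatible, join, not_exists) are illustrative only, and not_exists follows the compatibility-based reading used in this mail rather than any engine's implementation.

```python
# Toy model: a solution is a dict (variable -> value), a solution set is a list.
# PERSONS and BROWN are hand-evaluated from the example data (an assumption of
# this sketch, not output from a SPARQL engine).
IDENTITY = [{}]                                   # BGP(): one row, no bindings
PERSONS = [{"?s": ":Lee"}, {"?s": ":OtherLee"}]   # BGP(?s a foaf:Person)
BROWN = [{"?s": ":Lee"}]                          # BGP(?s :hairColor "brown")


def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(a[v] == b[v] for v in a if v in b)


def join(left, right):
    """Merge every compatible pair of solutions."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


def not_exists(left, right):
    """Keep a left solution only if no right solution is compatible with it."""
    return [a for a in left if not any(compatible(a, b) for b in right)]


# Query 1B: NotExists(BGP(?s a foaf:Person), BGP(?s :hairColor "brown"))
q1b = not_exists(PERSONS, BROWN)

# Query 2B: Join(NotExists(BGP(), BGP(?s :hairColor "brown")), BGP(?s a foaf:Person))
q2b = join(not_exists(IDENTITY, BROWN), PERSONS)

print(q1b)  # the single :OtherLee solution
print(q2b)  # empty: the identity row was already removed
```

The empty solution is vacuously compatible with everything, which is why the NotExists on the left of the join wipes out the identity row before ?s is ever bound.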

I believe this difference only occurs in the case of what the Chileans 
call not well-designed queries - a variable occurs inside the NOT EXISTS, 
but not on the left-hand side of the NOT EXISTS, and also occurs 
elsewhere in the query. (Same as with procedural vs. compositional 
OPTIONAL.)

But what Steve was saying on today's TC (and I think I agree with) is 
... so what? Are there any useful cases where we need the behavior of 
Query 2B? I don't know of any.

...

What would this look like with OPTIONAL/!BOUND?

Query 3A:

SELECT * {
   ?s a foaf:Person .
   OPTIONAL { ?s :hairColor ?color . FILTER(?color = "brown") }
   FILTER(!bound(?color))
}

or something like that. This gives the one result from 1A, 1B, and 2A.

Query 3B:

SELECT * {
   OPTIONAL { ?s :hairColor ?color . FILTER(?color = "brown") }
   ?s a foaf:Person .
   FILTER(!bound(?color))
}

...this gives no results, a la Query 2B - but, again, is this useful to 
anyone? I've never seen anyone use OPTIONAL/!bound in this way.
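
A rough Python sketch of why 3A and 3B differ, under the same kind of toy model: solutions are dicts, the pattern results are written out by hand from the Data, and left_join is an illustrative stand-in for the algebra's LeftJoin with the inner FILTER as its condition. None of the names come from a real library.

```python
# Toy model: a solution is a dict (variable -> value), a solution set is a list.
# PERSONS and HAIR are hand-evaluated from the example data (an assumption).
IDENTITY = [{}]
PERSONS = [{"?s": ":Lee"}, {"?s": ":OtherLee"}]    # BGP(?s a foaf:Person)
HAIR = [                                           # BGP(?s :hairColor ?color)
    {"?s": ":Lee", "?color": "brown"},
    {"?s": ":OtherLee", "?color": "blond"},
]


def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(a[v] == b[v] for v in a if v in b)


def join(left, right):
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


def left_join(left, right, cond):
    """Extend each left solution where possible; otherwise keep it unextended."""
    out = []
    for a in left:
        merged = [{**a, **b} for b in right if compatible(a, b)]
        kept = [m for m in merged if cond(m)]
        out.extend(kept or [a])
    return out


def is_brown(mu):
    return mu.get("?color") == "brown"     # FILTER(?color = "brown")


def color_unbound(mu):
    return "?color" not in mu              # FILTER(!bound(?color))


# Query 3A: OPTIONAL applied to the people already matched.
q3a = [m for m in left_join(PERSONS, HAIR, is_brown) if color_unbound(m)]

# Query 3B: OPTIONAL applied to the identity solution, then joined with people.
q3b = [m for m in join(left_join(IDENTITY, HAIR, is_brown), PERSONS)
       if color_unbound(m)]

print(q3a)  # the single :OtherLee solution
print(q3b)  # empty
```

In 3B the OPTIONAL has nothing on its left to constrain ?s, so it binds ?color for :Lee up front and the final !bound filter rejects the only surviving row.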

Just for completeness' sake, what about:

Query 4A:

SELECT * {
   ?s a foaf:Person .
   MINUS { ?s :hairColor "brown" }
}

Query 4B:

SELECT * {
   MINUS { ?s :hairColor "brown" }
   ?s a foaf:Person .
}

4A gives - I think - the same answers as 1A, 1B, 2A, and 3A.

What does 4B do? I guess it's equivalent to

  Join(identity solution - { { (?s, :Lee) } }, BGP(?s a foaf:Person))

...because of the extra condition on MINUS: since the identity solution 
has no vars in common with the RHS, the MINUS doesn't remove it, and the 
join then binds ?s to every foaf:Person. So 4B has a different solution 
from everything else:

?s
--
:Lee
:OtherLee
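
A small Python sketch of the MINUS behaviour discussed for 4A and 4B, assuming the extra condition means "only remove a left solution if it shares at least one variable with a compatible right solution". The solution sets are written out by hand from the Data and the helper names are illustrative.

```python
# Toy model: a solution is a dict (variable -> value), a solution set is a list.
# PERSONS and BROWN are hand-evaluated from the example data (an assumption).
IDENTITY = [{}]
PERSONS = [{"?s": ":Lee"}, {"?s": ":OtherLee"}]   # BGP(?s a foaf:Person)
BROWN = [{"?s": ":Lee"}]                          # BGP(?s :hairColor "brown")


def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(a[v] == b[v] for v in a if v in b)


def join(left, right):
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


def minus(left, right):
    """Drop a left solution only if some right solution is compatible with it
    AND the two share at least one variable (the extra condition on MINUS)."""
    return [
        a for a in left
        if not any(compatible(a, b) and (a.keys() & b.keys()) for b in right)
    ]


# Query 4A: MINUS applied to the people already matched.
q4a = minus(PERSONS, BROWN)

# Query 4B: MINUS applied to the identity solution, then joined with people.
q4b = join(minus(IDENTITY, BROWN), PERSONS)

print(q4a)  # the single :OtherLee solution
print(q4b)  # both :Lee and :OtherLee
```

Without the shared-variable condition, minus would behave like the compatibility-only NOT EXISTS reading and 4B would come out empty, as 2B does.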

Anyways. Does this help anything? I don't know.

What I've heard is:

AndyS: Wants to support doing the equivalent of Query 3B without 
requiring extra braces to get the scoping right.

SteveH (& me): Doesn't think this is a particularly important issue.

What do other people think? Did I even characterize this correctly?

Lee
Received on Tuesday, 6 April 2010 20:15:58 UTC