RE: More on MINUS vs. UNSAID from Seaborne, Andy on 2009-07-11 (public-rdf-dawg@w3.org from July to September 2009)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Sat, 11 Jul 2009 17:41:44 +0000
To: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <B6CF1054FDC8B845BF93A6645D19BEA3646F2BC78C@GVW1118EXC.americas.hpqcorp.net>

ISSUE-29

LeeF asked in the telecon about where NOT EXISTS and MINUS are the same and whether there is a proof for this.

In starting to try to answer this (and this email doesn't answer it) it occurred to me to think about what problem they are solving and how they relate to OPTIONAL/!BOUND.


== NOT EXISTS is negation as failure (NAF) 

I'll write the FILTER explicitly here. The special FILTER scope rules are an issue to consider; they affect OPTIONAL/!BOUND as well, but because that form works across blocks anyway, it does not trip people up much, although it can.

  { ?x :q ?a 
    FILTER(NOT EXISTS{?a :p ?b})
  }

So as a FILTER you can read that as
"""
Given the bindings in the current solution being filtered, does {pattern} match the dataset? If it does, then exclude the current solution; if not keep the current solution.
"""

The NOT EXISTS is NAF using the current solution to restrict the pattern in the NOT EXISTS; it's directly testing that pattern (the second one) against the dataset.

  { some_pattern
    FILTER(NOT EXISTS{ })
  }

has no results: {} always matches, so NOT EXISTS is always false.  This seems natural to me.
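
The filter reading above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not any engine's implementation: the dataset is a list of (s, p, o) tuples, a solution is a dict from variable name to value, and the helper names (match_triple, match, filter_not_exists) and the sample data are made up for illustration.

```python
# Toy model: variables are strings starting with "?"; everything else is a constant.
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match_triple(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against its existing value.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result

def match(patterns, data, binding=None):
    """All solutions of a list of triple patterns, starting from `binding`."""
    solutions = [dict(binding or {})]
    for pattern in patterns:
        solutions = [ext for sol in solutions for triple in data
                     if (ext := match_triple(pattern, triple, sol)) is not None]
    return solutions

def filter_not_exists(solutions, patterns, data):
    """Keep a solution only if `patterns`, restricted by it, fails against `data`."""
    return [sol for sol in solutions if not match(patterns, data, sol)]

# Hypothetical sample data.
data = [(":s1", ":q", ":a1"), (":s2", ":q", ":a2"), (":a1", ":p", ":b1")]

outer = match([("?x", ":q", "?a")], data)                   # { ?x :q ?a }
kept = filter_not_exists(outer, [("?a", ":p", "?b")], data)
print(kept)   # only the row whose ?a has no :p triple survives

# The {} case: the empty pattern always matches, so nothing is kept.
print(filter_not_exists(outer, [], data))   # []
```

Note that the inner pattern is tested against the dataset once per outer solution, with that solution's bindings substituted in; that is the NAF step.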

There is a strong similarity with an ASK subquery.  Testing to see if a triple is present or not in the data is (using EXISTS rather than the double negative "! NOT EXISTS")

  ASK { FILTER(EXISTS{<x> <o> <z>}) }

And the FILTER word is not needed

  ASK { EXISTS{<x> <o> <z>} }

Although 

  ASK { <x> <o> <z> }

is easier still :-) but that's the connection with an ASK-subquery to me.


== MINUS is a result set operation

The MINUS operation does not directly involve the dataset in the removal - it takes the results of two patterns and removes certain rows in the LHS if they are compatible with a row in the RHS.  There isn't a direct NAF test against the dataset.

I'm no longer sure what its intuitive meaning is in the general case.  When one variable is involved, it's removing individuals from a set of possibilities which is fine and here MINUS and NOT EXISTS seem to be the same.
But when there are two or more variables, I don't see an intuitive meaning to the operation other than a manipulation of tables. 
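
As a table manipulation, the operation described above might be sketched like this in Python. It is a minimal sketch that takes the description literally (drop a left-hand row if it is compatible with some right-hand row); it does not model any further conditions that "certain rows" may imply, and the function names and sample rows are invented for illustration.

```python
def compatible(left, right):
    """Two rows are compatible if they agree on every variable they share."""
    return all(left[var] == right[var] for var in left.keys() & right.keys())

def minus(lhs_rows, rhs_rows):
    """Drop each LHS row that is compatible with some RHS row."""
    return [row for row in lhs_rows
            if not any(compatible(row, other) for other in rhs_rows)]

# One variable: removing individuals from a set of possibilities.
one_var = minus([{"?a": ":a1"}, {"?a": ":a2"}],
                [{"?a": ":a1"}])
print(one_var)   # [{'?a': ':a2'}]

# Two or more variables: rows are compared only on the columns they share.
two_var = minus([{"?x": ":s1", "?a": ":a1"}, {"?x": ":s2", "?a": ":a2"}],
                [{"?a": ":a1", "?b": ":b1"}])
print(two_var)   # [{'?x': ':s2', '?a': ':a2'}]
```

The dataset never appears in minus(): both arguments are already-computed result tables, which is the point of contrast with a direct NAF test.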


== OPTIONAL/!BOUND

This is the hard-to-use way.

Because it involves OPTIONAL, the complex cases of doubly nested optionals affect the pattern (these are the cases where evaluation can't proceed by a substitution algorithm).  The complicated cases when used for negation need two or more variables (one must be introduced by the nested optional).

Because it’s a filter, it removes rows of the initial pattern on a row-by-row basis.

The {} case here requires some dancing to introduce a variable: 

  { pattern
    OPTIONAL{ SELECT (1 as ?uniqueVar) {} }  ## Forgive the subquery used to introduce the variable.
    FILTER(!BOUND(?uniqueVar))
  }

Which has no results because 
  OPTIONAL{ SELECT (1 as ?uniqueVar) {} }
has result 
  ?uniqueVar=1
So it's bound, and !BOUND is always false.

The downside here is that if the OPTIONAL matches multiple times for the same initial conditions, then extra rows are introduced.
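
Both points (the {} case and the extra rows) can be seen in a small Python sketch of OPTIONAL as a left join followed by a !BOUND filter. This is an illustrative model only: rows are dicts, an unbound variable is simply a missing key, and the function names and sample rows are assumptions made for the example.

```python
def compatible(left, right):
    """Two rows are compatible if they agree on every variable they share."""
    return all(left[var] == right[var] for var in left.keys() & right.keys())

def optional(lhs_rows, rhs_rows):
    """Left join: extend each LHS row with every compatible RHS row;
    keep the LHS row unchanged if there is none."""
    out = []
    for row in lhs_rows:
        extensions = [{**row, **other} for other in rhs_rows
                      if compatible(row, other)]
        out.extend(extensions or [row])
    return out

def filter_not_bound(rows, var):
    """FILTER(!BOUND(var)): keep rows where `var` has no value."""
    return [row for row in rows if var not in row]

pattern_rows = [{"?x": ":s1"}, {"?x": ":s2"}]

# The {} case: the subquery contributes the single row ?uniqueVar=1,
# which joins with everything, so !BOUND is false for every row.
joined = optional(pattern_rows, [{"?uniqueVar": 1}])
print(filter_not_bound(joined, "?uniqueVar"))   # []

# Multiple OPTIONAL matches for the same initial row multiply that row.
multi = optional([{"?x": ":s1", "?a": ":a1"}],
                 [{"?a": ":a1", "?b": ":b1"}, {"?a": ":a1", "?b": ":b2"}])
print(len(multi))   # 2
```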

 Andy

Received on Saturday, 11 July 2009 17:44:23 UTC