proposal for a mapping-based definition of EXISTS in SPARQL (and a proposal for a definition of pre-binding)

Given that the substitution-based definition of EXISTS has lots of problems
is there the possibility of a different kind of definition?  One possibility
is to use a definition based on solution mappings.  The core of this
definition is the injection of solution mappings into the argument of
EXISTS, basically moving the FILTER solution mapping to the beginning of
this piece of code.

There are several options in a mapping-based definition of EXISTS but I
think that the simplest looks like this.

0/ Fix the scoping rules so that variables in-scope at a FILTER are in-scope
   at the beginning of the pattern argument to any EXISTS in the FILTER
   expression.  This is independent of the change to a mapping-based
   definition but fixes an error that affects EXISTS.

1/ Add a new construct, Initial, to the SPARQL syntax and also to the SPARQL
   algebra.  Initial will be used to set up the initial multiset of solution
   mappings inside an EXISTS.  It will work much like VALUES except that it
   will transfer solution mappings through the EXISTS instead of setting up
   a constant solution mapping.  Initial is a new option for
   GraphPatternNotTriples in the SPARQL syntax but cannot show up in actual
   SPARQL code, just during the SPARQL translation process.  Initial takes a
   single argument that is a token (which can just be an integer).

2/ When collecting FILTER elements replace EXISTS{P} in the filter
   expression with exists(Initial(t),translate( { Initial(t) P' } )) where t
   is a fresh token and similarly for NOT EXISTS{P}.  If P is a SubSelect
   then P' is { P } otherwise P' is just P.

3/ Translate Initial(t) as itself.

4/ Change the definition of the exists function to:

   Let u be the current solution mapping for a filter, t a token, and P a
   graph pattern: The value exists(Initial(t),P) given D(G) is true iff
   eval(D(G),P') is a non-empty sequence, where P' is P with (its sole
   occurence of) Initial(t) replaced by { u }.

   Yes, this is a bit sloppy, but less sloppy than what happens for
   inline data.

That's it!  No more substitute.


Is this the best way to do this starting from scratch?  Probably not, but I
think that this is a way to fix EXISTS without making too many changes to
the definitions underlying SPARQL.

I believe that this means that all subqueries in SPARQL queries can then be
evaluated completely bottom up and independently of anything outside of the
subquery.  In particular, subqueries inside EXISTS can be evaluated before
the solution mappings going into the FILTER are known.


Pre-binding works in a very similar way.  Pre-binding consists of two
operations.  Prepare is an operation on a query and a list of variables that
modifies the WhereClause GroupGraphPattern by adding Initial(0) as in 2/
above and also makes the variables in-scope at the beginning of the pattern.
Execute then takes a prepared query and a multiset of solution bindings
containing only bindings for the variables from the Prepare, replaces the
Initial(0) with this multiset, and then evaluates the result.


It would be possible to treat EXISTS { SELECT ... } as pre-binding.  This
would involve putting the Initial(t) inside the SELECT, where it can affect
expressions, instead of outside, where it does not.  The bindings injected
could even be filtered to remove non-projected variables.  Other, even more
complex schemes are possible.  Of course, when using pre-binding it would no
longer be the case that the SELECT can be evaluated before the solution
mappings going into the FILTER are known.


I'm not claiming that there is anything deep or even new here.  I think that
this treatment has very similar behaviour to that in Fuseki and Blazegraph,
or maybe even exactly the same behaviour as there.

Peter F. Patel-Schneider
Nuance Communications

Received on Saturday, 18 June 2016 00:01:13 UTC