Re: Binding injection proposals from Peter F. Patel-Schneider on 2016-10-30 (public-sparql-exists@w3.org from October 2016)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Sat, 29 Oct 2016 19:07:42 -0700
To: Andy Seaborne <andy@apache.org>, public-sparql-exists@w3.org
Message-ID: <23374600-cb77-3a33-ac23-267562fb581d@gmail.com>

Take two.



Proposal A:

** Intuition/Outline

Inject variable bindings into the top-level group graph pattern.

** Proposal

Translate EXISTS{P} as  exists(t,translate(Initial(t) P'))
  where P' is P fixed up to be syntactically suitable
Translate Initial(t) as Initial(t)
Given variable binding b,
  exists(t,P) is true iff eval(D(G),P') is non-empty
  where P' is P with Initial(t) replaced by b
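
A minimal sketch of Proposal A in Python, assuming the inner pattern P is a
single BGP, so that the top-level group graph pattern is just
Join(Initial(t), BGP).  The names eval_bgp, join and exists_a and the toy
triples are illustrative only; they are not part of the proposal.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def eval_bgp(graph, patterns):
    """Return every solution (a dict) that matches all triple patterns."""
    solutions = [{}]  # the empty BGP has exactly one, empty, solution
    for pattern in patterns:
        extended = []
        for mu in solutions:
            for triple in graph:
                new_mu = dict(mu)
                ok = True
                for p_term, t_term in zip(pattern, triple):
                    if is_var(p_term):
                        # bind the variable, or check an earlier binding
                        if new_mu.setdefault(p_term, t_term) != t_term:
                            ok = False
                            break
                    elif p_term != t_term:
                        ok = False
                        break
                if ok:
                    extended.append(new_mu)
        solutions = extended
    return solutions

def compatible(mu1, mu2):
    return all(mu2[v] == val for v, val in mu1.items() if v in mu2)

def join(omega1, omega2):
    return [{**m1, **m2} for m1 in omega1 for m2 in omega2
            if compatible(m1, m2)]

def exists_a(graph, patterns, row):
    # Initial(t) is replaced by the one-solution multiset [row], then the
    # top-level pattern Join(Initial(t), BGP) is evaluated.
    return len(join([row], eval_bgp(graph, patterns))) > 0

graph = {("alice", "knows", "bob"), ("bob", "knows", "carol")}
inner = [("?x", "knows", "?y")]
print(exists_a(graph, inner, {"?x": "alice"}))  # True
print(exists_a(graph, inner, {"?x": "carol"}))  # False
```

The row is joined in once, at the top level, which is why this behaves much
like a VALUES at the beginning of the pattern.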


Proposal B:

** Intuition/Outline

Ensure that the variables from the row being filtered are available, not
just their values as in the spec with substitution.

This is done by restricting each variable to its value in the row at the
point where the variable is bound, which is in a BGP.  Use of
BIND (e AS ?x) for an ?x in the current row is not allowed: ?x is considered
to be already set.

** Proposal

Rewrite the pattern of the EXISTS filter at algebra evaluation time; do this
in a deep fashion, while respecting scoping rules.

Let V = in-scope variables at the point of the FILTER EXISTS.

For each BGP in the pattern:
  Build a VALUES with the variables of V still in-scope at this BGP.
    << This needs to be specified.
    This includes empty BGPs.
  Replace the BGP with Join(values,bgp)
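
A rough sketch of this rewrite in Python.  The tuple encoding of the algebra
and the function name inject are my own assumptions for illustration.  Since
the "still in-scope" restriction is not yet specified, the sketch simply
treats every variable of the row as in scope, which is an assumption and not
part of the proposal.

```python
def inject(op, row):
    """Copy algebra tree `op`, wrapping every BGP in Join(values, bgp).

    Operators are tuples whose first element names the operator, e.g.
    ("bgp", [triple patterns]) or ("minus", left, right).
    """
    kind = op[0]
    if kind == "bgp":
        # Assumption: all variables of the row are still in scope here.
        values = ("values", dict(row))
        return ("join", values, op)
    # Any other operator: rewrite operand subtrees, keep everything else.
    return (kind,) + tuple(
        inject(child, row) if isinstance(child, tuple) else child
        for child in op[1:]
    )

pattern = ("minus",
           ("bgp", [("?x", "p", "?y")]),
           ("bgp", [("?z", "q", "?w")]))
rewritten = inject(pattern, {"?x": "a"})
print(rewritten)
```

Both sides of the MINUS end up joined with the same VALUES, including any
empty BGP, which gets wrapped in the same way.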


There are two differences here:

1/ Where to push the variable bindings.
2/ How to push them.


The how is the less important difference.  Proposal A pushes the bindings to
the "beginnings" of group graph patterns via the Initial(t) where they
affect each construct in the group graph pattern.  Proposal B pushes the
bindings to each BGP.  This ends up in effect pushing the bindings to the
beginning of each group graph pattern because there is always a BGP (maybe
empty) at the beginning of the translation of each group graph pattern (IF
THE SIMPLIFICATION STEP in 18.2.2.8 IS NOT DONE!).  However, proposal B also
pushes the bindings elsewhere in the translation of the group graph pattern,
which doesn't appear to have any consequences.

I don't see a significant difference between the two proposals in the how.
Proposal A would do a little more at translate time and Proposal B would do
a little more at algebra evaluation time.

I think that if Proposal B were changed to only affect BGPs directly in the
top-level group graph pattern then it would have the same results as
Proposal A.


The where is where the real difference between the two proposals lies.
Proposal A only pushes to the beginning of the top-level group graph pattern
so it makes an EXISTS act very much like a VALUES at the beginning of the
top-level group graph pattern.  Proposal B pushes into every BGP, including
BGPs inside OPTIONAL, MINUS, FILTER, subqueries, etc.  Some of these extra
changes make Proposal B work more like the current EXISTS, but others
depart from the current EXISTS.  For example, a MINUS whose two sides have
no common variables (and so removes nothing) could be changed into a MINUS
with common variables, flipping its meaning.  Proposal B might also have
problems with EXISTS inside of EXISTS.  (I haven't worked all the details
out here so I'm not sure whether there is a problem.)
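
To make the MINUS example concrete, here is a small Python sketch of the
SPARQL Minus operation on solution multisets.  A solution on the left is
removed only if it is compatible with some solution on the right and shares
at least one variable with it.  The sample solutions are made up for
illustration.

```python
def compatible(mu1, mu2):
    return all(mu2[v] == val for v, val in mu1.items() if v in mu2)

def minus(omega1, omega2):
    # Keep mu1 unless some mu2 is compatible with it AND shares a variable.
    return [m1 for m1 in omega1
            if all(not compatible(m1, m2) or not (m1.keys() & m2.keys())
                   for m2 in omega2)]

left = [{"?x": "a"}]

# Right-hand side as written: no variable in common with the left.
right = [{"?z": "b"}]
print(minus(left, right))           # [{'?x': 'a'}]

# Right-hand side after ?x = a has been joined into its BGP.
right_injected = [{"?x": "a", "?z": "b"}]
print(minus(left, right_injected))  # []
```

With disjoint variables the MINUS removes nothing; once ?x appears on both
sides, the same MINUS removes the solution.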

Proposal B also requires a new notion of variable scoping for SPARQL to
control which variables are pushed into sub-queries.

Received on Sunday, 30 October 2016 02:08:19 UTC