- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Sat, 29 Oct 2016 19:07:42 -0700
- To: Andy Seaborne <andy@apache.org>, public-sparql-exists@w3.org
Take two.

Proposal A:

** Intuition/Outline

Inject variable bindings into the top-level group graph pattern.

** Proposal

Translate EXISTS{P} as exists(t,translate(Initial(t) P'))
  where P' is P fixed up to be syntactically suitable

Translate Initial(t) as Initial(t)

Given variable binding b, exists(t,P) is true iff eval(D(G),P') is non-empty
  where P' is P with Initial(t) replaced by b

Proposal B:

** Intuition/Outline

Ensure that the variables from the row being filtered are available, not just their values as in the spec with substitution.

This is done by restricting the variable to the value of the row at the point where the variable is being bound. This is in a BGP.

Use of BIND ( e AS ?x) for ?x in the current row is not allowed : ?x is considered to be already set.

** Proposal

Rewrite the pattern of the EXISTS filter at algebra evaluation time; do this in a deep fashion, while respecting scoping rules.

Let V = in-scope variables at the point of the FILTER EXISTS.

For each BGP in the pattern:
  Build a VALUES with variables of V still in-scope at this BGP.
    << This needs to be specified.
  This includes empty BGPs
  Replace the BGP with Join(values,bgp)

There are two differences here:
1/ Where to push the variable bindings.
2/ How to push them.

The how is the less important difference. Proposal A pushes the bindings to the "beginnings" of group graph patterns via the Initial(t) where they affect each construct on the group graph pattern. Proposal B pushes the bindings to each BGP. This ends up in effect pushing the bindings to the beginning of each group graph pattern because there is always a BGP (maybe empty) at the beginning of the translation of each group graph pattern (IF THE SIMPLIFICATION STEP in 18.2.2.8 IS NOT DONE!). However, proposal B also pushes the bindings elsewhere in the translation of the group graph pattern, which doesn't appear to have any consequences.

I don't see a significant difference between the two proposals in the how.
Proposal A would do a little more at translate time and Proposal B would do a little more at algebra evaluation time. I think that if Proposal B were changed to only affect BGPs directly in the top-level group graph pattern then it would have the same results as Proposal A.

The where is where the real difference between the two proposals lies. Proposal A only pushes to the beginning of the top-level group graph pattern so it makes an EXISTS act very much like a VALUES at the beginning of the top-level group graph pattern. Proposal B pushes into every BGP, including BGPs inside OPTIONAL, MINUS, FILTER, subqueries, etc. Some of these extra changes make Proposal B work more like the current EXISTS, but some of them make changes from the current EXISTS. For example, a MINUS with no common variables could be changed into a MINUS with common variables, flipping its meaning.

Proposal B might also have problems with EXISTS inside of EXISTS. (I haven't worked all the details out here so I'm not sure whether there is a problem.)

Proposal B also requires a new notion of variable scoping for SPARQL to control which variables are pushed into sub-queries.
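
The MINUS case can be made concrete with a minimal sketch in Python. It is an illustration only, not part of either proposal's text: the data, the pattern EXISTS { ?x :p ?z MINUS { ?y :q ?w } }, and the helper names compatible, join and minus are all hypothetical, and it assumes that ?x counts as "still in-scope" at the BGP on the right-hand side of the MINUS, which is exactly the part of Proposal B that is not yet specified.

```python
# Solution mappings are plain dicts from variable name to value.

def compatible(m1, m2):
    """True if the two mappings agree on every variable they share."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())

def join(omega1, omega2):
    """Algebra Join: merge every compatible pair of mappings."""
    return [{**m1, **m2} for m1 in omega1 for m2 in omega2 if compatible(m1, m2)]

def minus(omega1, omega2):
    """Algebra Minus: drop m1 if some m2 is compatible with it AND shares a variable."""
    return [
        m1 for m1 in omega1
        if not any(compatible(m1, m2) and bool(m1.keys() & m2.keys()) for m2 in omega2)
    ]

# Hypothetical solutions of the two BGPs in
#   EXISTS { ?x :p ?z  MINUS { ?y :q ?w } }
left_bgp = [{"x": "a", "z": "1"}]
right_bgp = [{"y": "b", "w": "2"}]

# The row being filtered, as a one-row VALUES table.
values = [{"x": "a"}]

# Bindings pushed only to the start of the top-level group (in the style of Proposal A):
# the two sides of MINUS share no variable, so nothing is removed.
proposal_a = minus(join(values, left_bgp), right_bgp)

# Bindings pushed into every BGP (in the style of Proposal B, under the scoping assumption):
# the right-hand side now carries ?x, so the left solution is removed.
proposal_b = minus(join(values, left_bgp), join(values, right_bgp))

print(bool(proposal_a), bool(proposal_b))
```

Under these assumptions the first result is non-empty and the second is empty, so the same EXISTS would hold in one reading and fail in the other.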
Received on Sunday, 30 October 2016 02:08:19 UTC