Re: EXISTS : ways forward

> Sorry for the silence - other things overtook my time and I couldn't
> find contiguous periods of time to work on the issues of EXISTS.
>
> I took a pass over all the issues and put down a brief suggestion for
> each one.  These are outline proposals, not detailed proposals.  As a
> first step, I want to get general direction for solutions aired.
>
>          Andy

My summary of Andy's proposed solutions is that it ends up forbidding too
many useful constructs.  I think that it is better to come up with a
definition for EXISTS that keeps these useful constructs.


> Outlines:


> **** Problem 1: Some uses of EXISTS are not defined during evaluation
>
> This should be treated as an erratum to SPARQL : anything that can be a
> pattern can appear in EXISTS{}.  This means applying ToMultiSet to get from a
> solution sequence back to a multiset for EXISTS.

Agreed.

> There are suggested text items at the end of:
>
> https://github.com/w3c/sparql-exists/wiki/Problem:-Some-uses-of-EXISTS-are-not-defined-during-evaluation

>>>>>>>>>>>>>>>>>>>>>>>>>>**CHECK THIS OUT**


> **** Problem 2: Substitution happens where definitions are only for
> variables
>
> Suggestion: identify and forbid patterns that involve the substitution of
> certain constructs.
>
> BIND(... AS ?VAR)
> SELECT (... AS ?VAR)
SELECT ... ?VAR ...   <<<<<<<<<<<<<<<<  This one too??
> VALUES ?VAR
> VALUES(...?VAR...)

Presumably these would only be illegal if they are within, roughly, the
connected scope of ?VAR, as the proposal here restricts substitution to,
roughly, the connected scope.

> This can be a static or dynamic test - I prefer a static check which is
> done once after parsing (there are already such checks - e.g. "AS ?VAR"
> the ?VAR must not be an in-scope variable from the expression or pattern).

My view is that if there is a good way of retaining either a reasonable
meaning or what is generally implemented, then that should be the goal
instead of forbidding constructs.  So the useful construction (well, useful
if there is a real query pattern in place of the first VALUES)
SELECT ?book WHERE {
  VALUES ?book { :book1 :book2 }
  FILTER EXISTS {
    VALUES ?book { :book2 } } }
should produce :book2 and not be illegal.

Similarly
  SELECT ?a WHERE {
    ?a :b :c .
    FILTER EXISTS {
      SELECT (?a1 AS ?a) WHERE {
        ?a1 :b :d } } }
should be legal, and produce those nodes that are linked via :b to both
:c and :d, and so should
  SELECT ?a WHERE {
    ?a :b :c .
    FILTER EXISTS {
      SELECT ?a WHERE {
        ?a :b :d } } }

This meaning for EXISTS makes BIND illegal due to scope violation, but I
don't see a way to save BIND and still keep the (silly) rule that BIND has
to introduce a new variable.

> With BOUND, the issue is more about syntax.  We can define
> "BOUND(someValue)" to have a replacement of "true"^^xsd:boolean in the
> algebra.

This would be an improvement for BOUND, I guess.


> **** Problem 3: Blank nodes substituted into BGPs act as variables
>
> The core issue is to treat bnodes from the data (pre-binds) differently
> from bnodes from the surface syntax.
>
> The scoping graph includes all blank nodes passed in to pre-binding.
>
> { ?s ?p ?o } => { ?s ?p ?o' . FILTER (sameterm(?o', _:b))  }
>
> while it is written in syntax above, the process happens in the
> algebra and in the algebra, blank nodes are concrete terms so it is
> meaningful to talk about them in filters etc. They do not behave like
> variables except in basic graph patterns.

> We can explain that this is the same as (implementation, simple
> entailment) treating only blank nodes in the original query, i.e. at the
> time of parsing, as BGP variables and then treating substitution blank
> nodes as constant terms.
>
> { ?s ?p _:b  } where _:b is matched concretely, not as a variable.

This seems rather complex and is only for blank nodes.


> **** Problem 4: Substitution can flip MINUS to its disjoint-domain case
>
> This isn't a problem per-se - it's a possibly unexpected outcome;
> evaluation is still defined.
>
> Proposal: we note the effect and advise against substitutions in the
> right-hand-side of the MINUS (pattern in MINUS { pattern }).
>
> Alternatively, like issue-2 forbid it.

I think that this flipping effect of EXISTS over MINUS is a big source of
potential errors in SPARQL queries.  (I think that the problem really comes
not from EXISTS but from the definition of MINUS, which should not flip its
meaning when there is no shared variable.)
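
A minimal sketch of the flip in Python, modelling solution mappings as dicts
and MINUS as in the SPARQL 1.1 algebra (a left solution is removed only if
some right solution is compatible with it and shares at least one variable
with it).  The data (:a :p :b and :a :q :c), the pattern
{ ?x :p ?y MINUS { ?x :q ?z } }, and the helper names are all assumptions
made up for illustration, and the solution lists are written out by hand
rather than computed from a graph.

```python
def compatible(m1, m2):
    """Two solution mappings are compatible if they agree on every shared variable."""
    return all(m2[v] == t for v, t in m1.items() if v in m2)


def minus(left, right):
    """SPARQL 1.1 Minus: drop a left solution only when some right solution is
    compatible with it AND the two share at least one variable."""
    return [m for m in left
            if not any(compatible(m, r) and (m.keys() & r.keys()) for r in right)]


# Hypothetical data:  :a :p :b .   :a :q :c .
# Argument of EXISTS: { ?x :p ?y MINUS { ?x :q ?z } }, outer solution ?x = :a.

# Without substitution both sides mention ?x, so MINUS removes the solution.
plain = minus([{"x": ":a", "y": ":b"}], [{"x": ":a", "z": ":c"}])

# After substituting :a for ?x, ?x is gone from both sides.  The domains are
# now disjoint, so MINUS removes nothing and EXISTS flips from false to true.
substituted = minus([{"y": ":b"}], [{"z": ":c"}])

print(plain)        # []
print(substituted)  # [{'y': ':b'}]
```

In this toy case the substituted pattern has a solution where the
unsubstituted one has none, which is the unexpected outcome under discussion.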

The proposed solution appears to be to warn against using substitution
variables in the RHS of MINUS but to nonetheless do the substitution there.
This would treat the RHS of MINUS specially, as that is not part of the
connected scope of a substitution variable.

It would be useful to see how different implementations handle this.  I can
think of several implementation strategies that would diverge from the
standard.


> **** Problem 5: Substitution affects disconnected variables
>
> We use the scoping rules and define substitution only for variables in a
> portion that is in-scope of the outermost {}.
>
> Optionally, we can explain this (an implementation note) the same as
> replacing in-scope variables with a different name. If v is not in scope
> on the outer {} of EXISTS {}, then replace with a different, unused name.
>
> 18.2.1 Variable Scope
> https://www.w3.org/TR/sparql11-query/#variableScope

It does seem reasonable that EXISTS doesn't break scoping.  Implementations
differ here, but it seems that most differ from the definition of EXISTS in
this way.  One problem is that scoping is poorly defined and its definition
would need to be upgraded to serve in the definition of EXISTS.



Another way of proceeding, which I have advocated before, is to replace the
substitution definition for EXISTS with one based on setting up initial
solution sequences, i.e., the algebra equivalent of putting a generalized
VALUES at the front of the EXISTS argument.  So
  VALUES ?book { :book1 :book2 }
  FILTER EXISTS {
    VALUES ?book { :book2 } }
would evaluate the analogue of
  VALUES ?book { :book1 :book2 }
  VALUES ?book { :book2 }
for the EXISTS.  Scoping would be augmented so that the variables would be
in-scope at the beginning of the EXISTS argument.

This does what I consider to be the right thing in all cases that a right
thing can be determined.  It also doesn't need any special cases nor does it
need the extended definition for BOUND.  It also doesn't directly depend on
scoping.

This solution has the benefit that EXISTS substitution doesn't cause
flipping of MINUS.  It does, of course, change how EXISTS interacts with
MINUS.
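
A minimal Python sketch of this reading, using a toy model in which solution
mappings are dicts and solution lists are written out by hand.  All helper
names and the data (:a :p :b and :a :q :c) are hypothetical.  The outer
solution is joined in front of the EXISTS argument, like a VALUES block,
instead of being substituted into it.  Note that exists_values takes the rest
of the argument as already evaluated, which is only adequate for simple
arguments like the VALUES example; the MINUS case is spelled out step by step.

```python
def compatible(m1, m2):
    """Two solution mappings are compatible if they agree on every shared variable."""
    return all(m2[v] == t for v, t in m1.items() if v in m2)


def join(left, right):
    """Algebra Join: merge every compatible pair of solutions."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


def minus(left, right):
    """Algebra Minus: needs a compatible right solution that shares a variable."""
    return [m for m in left
            if not any(compatible(m, r) and (m.keys() & r.keys()) for r in right)]


def exists_values(mu, rest):
    """EXISTS read as 'VALUES mu' joined in front of the rest of the argument."""
    return len(join([mu], rest)) > 0


# VALUES ?book { :book1 :book2 }  FILTER EXISTS { VALUES ?book { :book2 } }
outer = [{"book": ":book1"}, {"book": ":book2"}]
inner = [{"book": ":book2"}]
kept = [mu for mu in outer if exists_values(mu, inner)]

# EXISTS { ?x :p ?y MINUS { ?x :q ?z } } with outer solution ?x = :a.
# The front VALUES joins with the left-hand side, so ?x stays in its domain.
left = join([{"x": ":a"}], [{"x": ":a", "y": ":b"}])
no_flip = minus(left, [{"x": ":a", "z": ":c"}])

print(kept)     # [{'book': ':book2'}]
print(no_flip)  # []
```

Here MINUS still sees ?x on both sides, so it behaves as it does for the
pattern evaluated without any outer solution.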


peter

Received on Tuesday, 20 September 2016 07:36:25 UTC