W3C home > Mailing lists > Public > public-sparql-dev@w3.org > April to June 2016

Re: can subqueries be executed first in SPARQL?

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Thu, 16 Jun 2016 16:11:39 -0700
To: Gregory Williams <greg@evilfunhouse.com>
Cc: public-sparql-dev@w3.org
Message-ID: <0a44364e-84bb-881d-6bd8-8585df460164@gmail.com>
On 06/16/2016 03:38 PM, Gregory Williams wrote:
> On Jun 16, 2016, at 2:36 PM, Peter F. Patel-Schneider
<pfpschneider@gmail.com> wrote:
>>
>> https://www.w3.org/TR/sparql11-query/#subqueries says
>>
>> Due to the bottom-up nature of SPARQL query evaluation, the subqueries are
>> evaluated logically first, and the results are projected up to the outer query.
>>
>> I think that this is incorrect.  For example, in
>>
>> SELECT ?x WHERE {
>>  ?x :a :b .
>>  FILTER EXISTS {
>>    SELECT ?x WHERE { ?x :a :b } HAVING ( COUNT(*) = 1 )
>>    }
>>  }
>
> I’m not sure this specific example makes sense. The subquery involves
> aggregation, but attempts to project a variable that is neither aggregated
> nor a part of a GROUP BY clause. The semantics of EXISTS variable
> substitution *might* give an intuitive answer because ?x will be replaced
> with a constant term during evaluation, but I believe the intention of §11.1
> is for this to be a syntactic restriction.

> FWIW, Andy’s SPARQL validator seems happy to produce an algebra for your
> example query, but if you try to validate just the subquery, it’ll
> complain about a syntax error.

OOPS.  In my attempt to make an easy example I forgot that I needed to
worry about which variables can be reported back.  I also should avoid other
known problems with EXISTS (like using an in-scope variable as a SELECT
variable).

One fix is to go to something like
  SELECT ?x WHERE {
    ?x :a ?y .
    FILTER EXISTS {
      SELECT ?z WHERE { ?z ?w ?v . FILTER sameTerm(?x,?y) }
    }
  }
where evaluating the inner SELECT logically first results in no solutions
because ?x and ?y are unbound.

>> The inner select is not even known until the bindings for ?x in the outer
>> query have been determined because EXISTS uses substitution into the inner
>> query.  (Whether that is a good idea or not is a separate issue.)
>>
>> I ran into this issue when reading https://scirate.com/arxiv/1606.01441
>> I believe that the sentence I quote above is the one that the authors indicate
>> that Fuseki and Blazegraph are using to support their implementation of
>> subqueries inside EXISTS.
>>
>>
>> I suggest that there be an erratum removing this sentence.
>
>

> The wording here is a bit awkward, but I believe the subquery evaluation
> is still occurring first *within the context of the evaluating the EXISTS
> pattern*.

There is no such qualifier in the wording.

For example, had your filter clause looked like:
>
> ?x :a :b .
> FILTER EXISTS {
> 	?x ?y ?z
> 	SELECT ?x WHERE { ?x :a :b } HAVING ( COUNT(*) = 1 )
> }
>
> then the sub-query would be evaluated before the 1-triplepattern BGP {?x
> ?y ?z} that it joins with, but after {?x :a :b} is evaluated and the ?x
> variables are substituted in the EXISTS pattern body.
>
> I think calling out the bottom-up semantics here is a good thing, but the
> text might have benefited from discussing how evaluating EXISTS patterns
> is different than evaluating other type of pattern.

I don't consider calling out something that is not true a good thing.

> .greg

peter
Received on Thursday, 16 June 2016 23:12:09 UTC

This archive was generated by hypermail 2.3.1 : Thursday, 16 June 2016 23:12:10 UTC