W3C home > Mailing lists > Public > public-sparql-dev@w3.org > April to June 2016

Re: can subqueries be executed first in SPARQL?

From: Gregory Williams <greg@evilfunhouse.com>
Date: Thu, 16 Jun 2016 15:38:26 -0700
Cc: public-sparql-dev@w3.org
Message-Id: <D9D8D916-24DE-4BA4-9F3B-A6F147B3B432@evilfunhouse.com>
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
On Jun 16, 2016, at 2:36 PM, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> 
> https://www.w3.org/TR/sparql11-query/#subqueries says
> 
> Due to the bottom-up nature of SPARQL query evaluation, the subqueries are
> evaluated logically first, and the results are projected up to the outer query.
> 
> I think that this is incorrect.  For example, in
> 
> SELECT ?x WHERE {
>  ?x :a :b .
>  FILTER EXISTS {
>    SELECT ?x WHERE { ?x :a :b } HAVING ( COUNT(*) = 1 )
>    }
>  }

I’m not sure this specific example makes sense. The subquery involves aggregation, but attempts to project a variable that is neither aggregated nor a part of a GROUP BY clause. The semantics of EXISTS variable substitution *might* give an intuitive answer because ?x will be replaced with a constant term during evaluation, but I believe the intention of §11.1 is for this to be a syntactic restriction.

FWIW, Andy’s SPARQL validator seems happy to produce an algebra for your example query, but if you try to validate just the subquery, it’ll complain about a syntax error.

> The inner select is not even known until the bindings for ?x in the outer
> query have been determined because EXISTS uses substitution into the inner
> query.  (Whether that is a good idea or not is a separate issue.)
> 
> I ran into this issue when reading https://scirate.com/arxiv/1606.01441
> I believe that the sentence I quote above is the one that the authors indicate
> that Fuseki and Blazegraph are using to support their implementation of
> subqueries inside EXISTS.
> 
> 
> I suggest that there be an erratum removing this sentence.


The wording here is a bit awkward, but I believe the subquery evaluation is still occurring first *within the context of the evaluating the EXISTS pattern*. For example, had your filter clause looked like:

?x :a :b .
FILTER EXISTS {
	?x ?y ?z
	SELECT ?x WHERE { ?x :a :b } HAVING ( COUNT(*) = 1 )
}

then the sub-query would be evaluated before the 1-triplepattern BGP {?x ?y ?z} that it joins with, but after {?x :a :b} is evaluated and the ?x variables are substituted in the EXISTS pattern body.

I think calling out the bottom-up semantics here is a good thing, but the text might have benefited from discussing how evaluating EXISTS patterns is different than evaluating other type of pattern.

.greg
Received on Thursday, 16 June 2016 22:38:52 UTC

This archive was generated by hypermail 2.3.1 : Thursday, 16 June 2016 22:38:52 UTC