Re: Subqueries sharing variables with outer

From: Michael Schmidt <m.schmidt00@gmail.com>
Date: Sat, 15 Sep 2018 00:16:22 -0700
Message-ID: <CAFPn7pHxkM23_ZSTsiVYA32aLR5fqBgnjs1fi1THWHKFWm6Vxw@mail.gmail.com>
To: public-sparql-dev@w3.org

Hi Reto,

this is a tough one, see my answers inline.

> With 5 Triples stores I'm getting 3 different results for the following query:
>
> select ?x ?y where {
>   values ?x { 1 2 }
>    OPTIONAL {
>      select ?y where {
>          {
>              values ?y { 5 6  }
>          }  UNION {
>              bind (?x as ?y)
>          }
>      }
>  }
>}

My understanding is as follows:

- The SELECT subquery is evaluated first. In particular, the outer
bindings for ?x are not visible in the bind clause.
- The first part of the UNION therefore returns:
+---+
| y |
+---+
| 5 |
| 6 |
+---+

- The second part of the UNION may be a little disputable. The
bind (?x as ?y) is translated into Extend(G, var, expr), which is
defined as follows:

Extend(μ, var, expr) = μ ∪ { (var,value) | var not in dom(μ) and value =
expr(μ) }

Extend(μ, var, expr) = μ if var not in dom(μ) and expr(μ) is an error

....

The question now becomes what ?x({}) evaluates to -- the two options are
UNDEF vs. error. Section 10 of the standard states that "If the evaluation
of the expression produces an error, the variable remains unbound for that
solution but the query evaluation continues." -- so this pretty much
implies that the second part of the UNION produces the empty binding (case
2 of the Extend definition above, since an unbound variable evaluates to
an error).

- Taking both parts of the UNION together, we would thus get 3 solutions
(empty solution means UNDEF):

+---+
| y |
+---+
| 5 |
| 6 |
|   |
+---+

- Propagating this solution upwards and joining it with

+---+
| x |
+---+
| 1 |
| 2 |
+---+

from outside (a left join, since the subquery sits inside OPTIONAL), we
would end up with the second of your three results.
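
The bottom-up evaluation walked through above can be sketched in a few lines of Python. This is only an illustration under my reading of the spec, not an implementation of any store: solution mappings are modelled as dicts, and the helper names (compatible, extend, project, left_join) are hypothetical stand-ins for the algebra operators.

```python
# Solution mappings are modelled as dicts; a missing key means "unbound".

def compatible(m1, m2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())

def extend(mu, var, expr):
    """Extend(mu, var, expr): bind var, or leave mu unchanged if expr errors."""
    assert var not in mu  # Extend is only defined when var is not in dom(mu)
    try:
        return {**mu, var: expr(mu)}
    except KeyError:  # unbound variable -> error -> var stays unbound
        return dict(mu)

def project(omega, variables):
    """Keep only the listed variables in every mapping."""
    return [{v: m[v] for v in variables if v in m} for m in omega]

def left_join(left, right):
    """LeftJoin without a filter condition, as produced by OPTIONAL."""
    out = []
    for m1 in left:
        matches = [{**m1, **m2} for m2 in right if compatible(m1, m2)]
        out.extend(matches or [m1])
    return out

# Inner subquery, evaluated on its own: no outer ?x is in scope.
first = [{"y": 5}, {"y": 6}]                    # values ?y { 5 6 }
second = [extend({}, "y", lambda mu: mu["x"])]  # bind (?x as ?y) over the empty mapping
subquery = project(first + second, ["y"])       # select ?y

outer = [{"x": 1}, {"x": 2}]                    # values ?x { 1 2 }
result = left_join(outer, subquery)

for row in result:
    print(row.get("x"), row.get("y"))
```

Under these assumptions each ?x is paired with 5, 6 and an unbound ?y, i.e. six rows in total.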


> According to the spec subqueries are evaluated logically first [1], however I noticed implementations behaving differently and a conversation [2] on this list in June 2016 questioning this requirement.

Note that the discussion in [2] was specifically centered around
SELECT subqueries nested inside FILTER (NOT) EXISTS clauses and (in my
understanding) does not really apply here.

> From RDF4J persistent store (in a FluidOps information workbench) I get the following answer:
+---+---+
| x | y |
+---+---+
| 1 | 5 |
| 1 | 6 |
| 1 | 1 |
| 2 | 5 |
| 2 | 6 |
| 2 | 2 |
+---+---+
=> RDF4J does not seem to follow a bottom-up strategy here, looks
wrong to me
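
For comparison, here is a small Python sketch of a non-bottom-up strategy that happens to reproduce the table above: the subquery is re-evaluated once per outer solution, with the outer ?x already bound. This is only one way such a result could arise; I am not claiming this is what RDF4J does internally, and the helper names (extend, subquery) are hypothetical.

```python
# Solution mappings are modelled as dicts; a missing key means "unbound".

def extend(mu, var, expr):
    """Extend(mu, var, expr): bind var, or leave mu unchanged if expr errors."""
    try:
        return {**mu, var: expr(mu)}
    except KeyError:  # unbound variable -> error -> var stays unbound
        return dict(mu)

def subquery(seed):
    """Evaluate the inner SELECT with the outer mapping `seed` pushed in."""
    first = [{**seed, "y": 5}, {**seed, "y": 6}]      # values ?y { 5 6 }
    second = [extend(seed, "y", lambda mu: mu["x"])]  # bind (?x as ?y), ?x now visible
    # select ?y: project away everything except ?y
    return [{"y": m["y"]} if "y" in m else {} for m in first + second]

outer = [{"x": 1}, {"x": 2}]                          # values ?x { 1 2 }

result = []
for m1 in outer:
    rows = [{**m1, **m2} for m2 in subquery(m1)]      # correlated evaluation
    result.extend(rows or [m1])                       # OPTIONAL keeps m1 if nothing matched

for row in result:
    print(row.get("x"), row.get("y"))
```

The only difference from a bottom-up evaluation is that the bind sees the outer ?x, so ?y gets the value of ?x instead of staying unbound.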

> From Fuseki, GraphDB and StarDog I get the following answer:
+---+---+
| x | y |
+---+---+
| 1 | 5 |
| 1 | 6 |
| 1 |   |
| 2 | 5 |
| 2 | 6 |
| 2 |   |
+---+---+
=> as discussed above, this is what I would have expected

> From Virtuoso store (http://dbpedia.org/sparql) I get the following answer:
+---+---+
| x | y |
+---+---+
| 1 |   |
| 2 |   |
+---+---+
=> this is odd, I'm not sure what the rationale behind this answer is,
looks pretty wrong to me

Best,

Michael


> I assume that the majority is right with the second result. Is this assumption correct?
>
>Cheers,
>Reto
> 1. https://www.w3.org/TR/sparql11-query/#subqueries
> 2. https://lists.w3.org/Archives/Public/public-sparql-dev/2016AprJun/0026.html
Received on Saturday, 15 September 2018 20:18:06 UTC
