Re: Discussion for replacement in proposal B (early binding) from Peter F. Patel-Schneider on 2016-11-25 (public-sparql-exists@w3.org from November 2016)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Fri, 25 Nov 2016 12:17:15 -0800
To: Andy Seaborne <andy@apache.org>, public-sparql-exists@w3.org
Message-ID: <4456b7de-a31f-73fc-353e-584c0f7f3a7b@gmail.com>
What I am missing here is the intent behind the proposal.

The intent behind my proposal (proposal A) is to be as simple as possible while
fixing all of the known problems.  If this changes some of the weird
situations then so be it.

The intent behind this proposal appears to be to keep results the same, except
for the problematic cases.  However, this is not stated anywhere that I can see.
It is also the case that this proposal changes some non-problematic results.

So, just what is this proposal supposed to be doing?

peter


On 11/17/2016 05:40 AM, Andy Seaborne wrote:
> Here is an attempt to define the replacement process for proposal B (deep
> binding injection).  This message is the descriptive version, the next message
> is a more formal description.
> 
> Suggestions for improving the definitions very welcome! (I am a little out of
> practice writing definitions...)
> 
> 
> This definition applies to algebra expressions in EXISTS for legal SPARQL
> queries.  The rules for a SPARQL query preclude certain algebra expressions
> like "extend" (i.e. BIND) of a variable which is also in a BGP that "extend"
> applies to.
> 
> The approach is to alter the places where variables get bound by joining the
> bindings for the solution mapping being filtered (the "current row").  This
> restricts the range of values a variable can take to just the value in the
> solution mapping being filtered.  (FILTER sameTerm would do the same thing.)
> 
> Assignment (an "AS") to a variable which is in-scope for the FILTER is not
> allowed. Any potential variables in a "current row" are considered in-scope
> for the EXISTS expression.
> 
> 18.2.1 Variable Scope
> https://www.w3.org/TR/sparql11-query/#variableScope
> 
> 
> Putting all the variables of the current row into the replacement, not just
> the ones in a replaced BGP, makes FILTERs with variables from the current row
> work:
> 
> {
>   ?x :predicate ?y
>   FILTER EXISTS { ?x :predicate ?v . FILTER ( ?v < ?y ) }
> }
> 
> then if the current row is: ?x=<http://example/resource>, ?y=123
> 
>     EXISTS {
>           { ?x :predicate ?v .
>             VALUES(?x ?y) { (<http://example/resource> 123) }
>           }
>           FILTER ( ?v < ?y )
>     }
> 
> except this is done on the algebra not the syntax.
> 
> This process works for compound forms:
> 
> {
>   ?x :predicate ?y
> 
>   FILTER EXISTS {
>       GRAPH <http://example/graph> {
>           ?x :predicate ?v .
>           FILTER ( ?v < ?y )
>       }
>   }
> }
> 
> {
>   ?x :pred ?y
> 
>   FILTER EXISTS {
>       {
>           ?x :predicate1 ?v .
>           FILTER ( ?v < ?y )
>       } UNION {
>           ?x :predicate2 ?v .
>           FILTER ( ?v < ?y )
>       }
>   }
> }
> 
> It must deal with uses of the same variable name inside a sub-SELECT. A
> sub-SELECT introduces projection so it can hide variables.  The name of such
> variables inside the projection does not matter - systematic renaming of any
> non-projected variable does not change the results.
> 
> SELECT * {
>    ?x :predicate ?y
>      { SELECT ?x
>       { ?x :predicate ?v .
>         FILTER(?v < 123)
>       } }
>   }
> 
> the "?v" is not part of the results, nor can it be joined with a variable
> outside the sub-select, because only "?x" is in the projection.
> 
> This has the same results:
> 
> SELECT * {
>    ?x :predicate ?y
>      { SELECT ?x
>       { ?x :predicate ?Z .
>         FILTER(?Z < 123)
>       } }
>   }
> 
> The proposal is to rename these hidden variables so they use names different
> from those in the current row being filtered.
> 
> NB This has a consequence that if a query wishes to filter on a variable from
> the current row, it must be in the projection.
> 
> This could be changed by modifying the definition of renaming in the other
> message.
> 
> 
> Then replace any form "x" which is a BGP, path or "GRAPH ?g" with join(x,
> current row)
> 
> "AS ?var" where ?var comes from any current row (the set of variables in-scope
> by the current scope rules at the FILTER) is not allowed.
> 
> Blank nodes are treated as constants. This is in the algebra where blank nodes
> in solution mappings are not acting as variables.
> 
> 
> Currently unclear:
> 
> (this is an example from SHACL, not that it will necessarily remain this query
> or be required to be defined in SPARQL):
> 
> 4.3.2 sh:maxCount
> 
> SELECT $this
> WHERE {
>     $this $PATH ?value .
> }
> GROUP BY $this
> HAVING (COUNT(DISTINCT ?value) > $maxCount)
> 
> so for an EXISTS filter:
> 
> EXISTS {
>     SELECT $this {
>         $this $PATH ?value .
>     }
>     GROUP BY $this
>     HAVING (COUNT(DISTINCT ?value) > $maxCount)
> }
> 
> 
> (Peter - in proposal A, the initial(t) is outside the HAVING)
> 
> There is no place currently to inject the $maxCount.
> 
> Pretending there is an empty BGP at the start works in this similar form:
> 
> SELECT $this {
>    BIND(1 AS $dummy)
>    { SELECT $this (COUNT(DISTINCT ?value) AS ?X) {
>        $this $PATH ?value .
>       }
>       GROUP BY $this
>    }
>    FILTER(?X > $maxCount)
> }
> 
>     Andy
> 
> 
> 
>
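
The join with the "current row" described in the quoted message can be
sketched in a few lines of Python. This is only an illustration, not part of
either proposal: solution mappings are modelled as plain dicts, term equality
is simplified to Python ==, and the helper names (compatible, join,
exists_with_injection) and the sample data are hypothetical.

```python
# Illustrative sketch only. Solution mappings are dicts from variable name
# to value; the helper names and data below are assumptions made for this
# example and do not come from the SPARQL specification or the proposals.

def compatible(m1, m2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())

def join(omega1, omega2):
    """Merge every compatible pair of mappings from the two multisets."""
    return [{**m1, **m2} for m1 in omega1 for m2 in omega2 if compatible(m1, m2)]

def exists_with_injection(bgp_solutions, current_row, filter_fn):
    """Evaluate EXISTS { bgp FILTER(...) } with the current row joined in."""
    injected = join(bgp_solutions, [current_row])
    return any(filter_fn(m) for m in injected)

# Hypothetical solutions of the inner pattern { ?x :predicate ?v }.
bgp_solutions = [
    {"x": "http://example/resource", "v": 100},
    {"x": "http://example/resource", "v": 200},
    {"x": "http://example/other", "v": 1},
]

# The current row from the quoted example: ?x=<http://example/resource>, ?y=123.
current_row = {"x": "http://example/resource", "y": 123}

def v_less_than_y(m):
    """Stands in for FILTER ( ?v < ?y ); ?y is only bound via the join."""
    return m["v"] < m["y"]

result = exists_with_injection(bgp_solutions, current_row, v_less_than_y)
```

The join drops the mapping for http://example/other because its ?x disagrees
with the current row, and it adds ?y to the remaining mappings, which is what
lets the inner FILTER compare ?v with ?y.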
Received on Friday, 25 November 2016 20:17:49 UTC