Re: a catalog of problems with EXISTS

This was never intended to be a proposal for changes, just a catalog of
problems.  In some cases here I expect that there will be quick agreement on
what changes to make, in other cases I expect that there will be quick
agreement on what results should ensue, but in some cases there may be
discussion needed to determine just what the result should be.

Specifying intended results for some of these examples is, I think, putting
the cart before the horse.


Several of these examples don't directly affect SHACL, the proposed result of
the W3C Data Shapes Working Group.  In any case, the SPARQL queries used by
SHACL can be changed in response to changes that this group recommends.  What
is needed for SHACL is a set of changes to the spec that provide a
well-defined, clear, and unambiguous meaning for EXISTS and that is
implemented.  That said, there are parts of the SPARQL spec that don't work
well with SHACL.   I'll produce a separate message on this.

peter


On 07/06/2016 08:16 AM, james anderson wrote:
> good afternoon;
> 
> please record somewhere the datasets and the intended results for these examples.
> 
> in order to think about a proposal for what the recommendation should say, it
> would help to know what the results should be.
> for this, it would be particularly helpful, to have concrete examples from the
> shapes use cases.
> 
>> On 2016-07-06, at 16:50, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>
>> [This is a revised version of a message I sent to public-sparql-dev@w3.org.  I
>> have added a bit more explanation at the beginning, fixed up one example to
>> make it more on point, tied the examples more closely to the spec, and removed
>> comments about counterintuitiveness.]
>>
>>
>>
>> Here are five separate problems that I see with the definition of EXISTS in
>> https://www.w3.org/TR/2013/REC-sparql11-query-20130321/.
>>
>> The first two problems are situations where the evaluation of EXISTS hits
>> undefined areas.  The second of these has quite a few cases even just from a
>> syntactic viewpoint.  I do not know of any SPARQL implementations that
>> produce an error for any of these undefined situations.
>>
>> The last three problems are situations where the evaluation of EXISTS is
>> well-defined but at least some implementations diverge from the spec.
>>
>>
>>
>> Problem 1: Some uses of EXISTS are not defined during evaluation
>>
>> The evaluation of exists in 18.6 is only defined for graph patterns, but in
>>  SELECT ?x WHERE {
>>    ?x :p :c .
>>    FILTER EXISTS { SELECT ?y { ?y :q :c . } } }
>> the argument to exists ends up being a ToMultiSet, which is not listed under
>> "Graph Pattern" in the table of SPARQL algebra symbols in 18.2.
>>
>> The argument to exists is not explicitly listed as a "Graph Pattern" when
>> the argument to EXISTS is a GroupGraphPattern containing just a SubQuery or
>> just an InlineData.  Here the join is simplified away by section 18.2.2.8
>> leaving a construct in the SPARQL algebra that is not listed as a graph
>> pattern symbol, ToMultiSet or a multiset, respectively.  An example of where
>> this happens (but not at the top level of an EXISTS) is the last example of
>> 18.2.3.
>>
>>
>> Problem 2: Substitution happens where definitions are only for variables
>>
>> In
>>  SELECT ?x WHERE {
>>    BIND ( :d AS ?x )
>>    FILTER EXISTS { BIND ( :e AS ?z ) { SELECT ?x { :b :p :c } } } }
>> the substitution from 18.6 ends up with a non-variable in the second argument
>> to Project
>>  Join ( Extend( BGP(), ?z, :e ) ,
>>         ToMultiSet( Project( ToList( BGP( :b :p :c )), { :d } ) ) )
>> However Project is only defined in 18.5 for variables in its second argument.
>>
>> This also affects Extend, multisets, BOUND, and maybe other constructs.
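A minimal Python sketch of the substitution step in Problem 2, assuming algebra
expressions are modelled as nested tuples of strings. The tuple encoding and
the `substitute` helper are illustrative assumptions, not anything defined in
the spec. It replaces every occurrence of a bound variable, which is how a
non-variable ends up in the variable list of Project.

```python
def substitute(expr, mu):
    """Replace every occurrence of a variable in dom(mu) by its value.

    Strings are terms or operator names; tuples are algebra nodes or lists.
    """
    if isinstance(expr, str):
        return mu.get(expr, expr)
    if isinstance(expr, tuple):
        return tuple(substitute(e, mu) for e in expr)
    return expr


# Algebra for: BIND ( :e AS ?z ) { SELECT ?x { :b :p :c } }
pattern = (
    "Join",
    ("Extend", ("BGP",), "?z", ":e"),
    ("ToMultiSet",
     ("Project", ("ToList", ("BGP", (":b", ":p", ":c"))), ("?x",))),
)

# The outer solution binds ?x to :d.
result = substitute(pattern, {"?x": ":d"})

# Third component of the Project node: its list of projection "variables".
project_vars = result[2][1][2]
print(project_vars)
```

After substitution, `project_vars` holds the IRI :d rather than a variable,
which is the case that the definition of Project in 18.5 does not cover.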
>>
>>
>> Problem 3: Blank nodes substituted into BGPs act as variables
>>
>> In
>>  SELECT ?x WHERE {
>>    ?x :p :d .
>>    FILTER EXISTS { ?x :q :b . } }
>> against the graph { _:c :p :d , :e :q :b }
>> the substitution from 18.6 ends up producing
>>  BGP(_:c :q :b)
>> which then matches against :e :q :b because the _:c can be mapped to :e by
>> the RDF instance mapping that is part of pattern instance mappings in
>> 18.3.1.
>>
>> Some implementations diverge from the spec here.
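A minimal Python sketch of the matching behaviour in Problem 3, assuming a
toy matcher for a single triple pattern in which terms are plain strings,
variables start with "?" and blank nodes start with "_:". The matcher and its
helper names are illustrative assumptions. Blank nodes in the pattern are
allowed to map to any graph term, as the RDF instance mapping permits, but
only variables appear in the resulting solutions.

```python
def is_var(term):
    return term.startswith("?")


def is_bnode(term):
    return term.startswith("_:")


def match_triple_pattern(pattern, graph):
    """Return solution mappings for a one-triple BGP against the graph."""
    solutions = []
    for triple in graph:
        binding = {}
        for p_term, g_term in zip(pattern, triple):
            if is_var(p_term) or is_bnode(p_term):
                # Variables and pattern blank nodes may map to any term,
                # but must map consistently within the triple.
                if binding.setdefault(p_term, g_term) != g_term:
                    break
            elif p_term != g_term:
                break
        else:
            # Keep only the variable bindings; blank node mappings are dropped.
            solutions.append({k: v for k, v in binding.items() if is_var(k)})
    return solutions


graph = [("_:c", ":p", ":d"), (":e", ":q", ":b")]

# Outer pattern ?x :p :d binds ?x to the blank node _:c.
outer = match_triple_pattern(("?x", ":p", ":d"), graph)

# Substituting that binding into ?x :q :b gives _:c :q :b.
substituted = tuple(outer[0].get(t, t) for t in ("?x", ":q", ":b"))

# The substituted blank node behaves like a variable and matches :e :q :b.
inner = match_triple_pattern(substituted, graph)
print(outer, substituted, inner)
```

Here `inner` contains one (empty) solution, so the EXISTS filter succeeds even
though the graph has no :q triple whose subject is the blank node bound to ?x.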
>>
>>
>> Problem 4: Substitution can flip MINUS to its disjoint-domain case
>>
>> In
>>  SELECT ?x WHERE {
>>    ?x :p :c .
>>    FILTER EXISTS { ?x :p :c . MINUS { ?x :p :c . } } }
>> on the graph { :d :p :c }
>> the substitution from 18.6 ends up producing
>>  Minus( BGP( :d :p :c ), BGP( :d :p :c ) )
>> which produces a non-empty result because, with no variables left, the two
>> solution mappings for the Minus both have empty, and hence disjoint, domains,
>> and 18.5 dictates that the result is then not empty.
>>
>> Some implementations diverge from the spec here.
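A minimal Python sketch of the Minus case in Problem 4, assuming solution
mappings are modelled as dicts from variable names to terms. The function
names are illustrative assumptions; the condition follows the definition of
Minus in 18.5, which keeps a solution unless some right-hand solution is both
compatible with it and shares at least one variable with it.

```python
def compatible(mu1, mu2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(mu2[v] == t for v, t in mu1.items() if v in mu2)


def minus(omega1, omega2):
    """Keep mu unless some mu2 is compatible with it and shares a variable."""
    return [
        mu for mu in omega1
        if all(not compatible(mu, mu2) or not (mu.keys() & mu2.keys())
               for mu2 in omega2)
    ]


# After substitution both BGPs are variable-free, so each side evaluates
# to a single solution with an empty domain.
after_substitution = minus([{}], [{}])

# Without substitution both sides would bind ?x to :d.
without_substitution = minus([{"?x": ":d"}], [{"?x": ":d"}])

print(after_substitution, without_substitution)
```

With the substituted pattern the single empty solution survives, so EXISTS is
true; with ?x left as a variable on both sides the solution is removed.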
>>
>>
>> Problem 5: Substitution affects disconnected variables
>>
>> In
>>  SELECT ?x WHERE {
>>    BIND ( :d AS ?x )
>>    FILTER EXISTS { BIND ( :e AS ?z )
>>                    { SELECT ?y WHERE { ?x :p :c } } } }
>> the substitution from 18.6 ends up producing
>>  Join ( Extend( BGP(), ?z, :e ),
>>         ToMultiSet( Project( ToList( BGP( :d :p :c ) ), { ?y } ) ) )
>>
>> Some, but not all, implementations diverge from the spec here.
>>
>>
>>
>> Peter F. Patel-Schneider
>> Nuance Communications
>>
>>
> 
> 
> 
> ---
> james anderson | james@dydra.com | http://dydra.com
> 
> 
> 
> 
> 

Received on Wednesday, 6 July 2016 16:13:56 UTC