- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Wed, 6 Jul 2016 07:50:34 -0700
- To: Andy Seaborne <andy@apache.org>, public-sparql-exists@w3.org
[This is a revised version of a message I sent to public-sparql-dev@w3.org. I have added a bit more explanation at the beginning, fixed up one example to make it more on point, tied the examples more closely to the spec, and removed comments about counterintuitiveness.] Here are five separate problems that I see with the definition of EXISTS in https://www.w3.org/TR/2013/REC-sparql11-query-20130321/. The first two problems are situations where the evaluation of EXISTS hits undefined areas. The second of these has quite a few cases even just from a syntactic viewpoint. I do not know of any SPARQL implementations that produce an error for any of these undefined situations. The last three problems are situations where the evaluation of EXISTS is well-defined but at least some implementations diverge from the spec. Problem 1: Some uses of EXISTS are not defined during evaluation The evalution of exists in 18.6 is only defined for graph patterns, but in SELECT ?x WHERE { ?x :p :c . FILTER EXISTS { SELECT ?y { ?y :q :c . } } } the argument to exists ends up being a ToMultiSet, which is not listed under "Graph Pattern" in the table of SPARQL algebra symbols in 18.2. The argument to exists is not explicitly listed as a "Graph Pattern" when the argument to EXISTS is a GroupGraphPattern containing just a SubQuery or just an InlineData. Here the join is simplified away by section 18.2.2.8 leaving a construct in the SPARQL algebra that is not listed as a graph pattern symbol, ToMultiSet or a multiset, respectively. An example of where this happens (but not at the top level of an EXISTS) is the last example of 18.2.3. Problem 2: Substitution happens where definitions are only for variables In SELECT ?x WHERE { BIND ( :d AS ?x ) FILTER EXISTS { BIND ( :e AS ?z ) { SELECT ?x { :b :p :c } } } } the substitution from 18.6 ends up with a non-variable in the second argument to Project Join ( Extend( BGP(), ?z, :e ) , ToMultiSet( Project( ToList( BGP( :b :p :c )), { :d } ) ) ) However Project is only defined in 18.5 for variables in its second argument. This also affects Extend, multisets, BOUND, and maybe other constructs. Problem 3: Blank nodes substituted into BGPs act as variables In SELECT ?x WHERE { ?x :p :d . FILTER EXISTS { ?x :q :b . } } against the graph { _:c :p :d , :e :q :b } the substitution from 18.6 ends up producing BGP(_:c :q :b) when then matches against :e :q :b because the _:c can be mapped to :e by the RDF instance mapping that is part of pattern instance mappings in 18.3.1. Some implementations diverge from the spec here. Problem 4: Substitution can flip MINUS to its disjoint-domain case In SELECT ?x WHERE { ?x :p :c . FILTER EXISTS { ?x :p :c . MINUS { ?x :p :c . } } } on the graph { :d :p :c } the substitution from 18.6 ends up producing Minus( BGP( :d :p :c ), BGP( :d :p :c ) ) which produces a non-empty result because the two solution mappings for the Minus have disjoint domains and 18.5 dictates that then the result is not empty. Some implementations diverge from the spec here. Problem 5: Substitution affects disconnected variables In SELECT ?x WHERE { BIND ( :d AS ?x ) FILTER EXISTS { BIND ( :e AS ?z ) SELECT ?y WHERE { ?x :p :c } } } the substitution from 18.6 ends up producing Join ( Extend( BGP(), ?z :e ), ToMultiSet( Project( ToList( BGP( :d :p :c ) ), { ?y } ) ) ) Some, but not all, implementations diverge from the spec here. Peter F. Patel-Schneider Nuance Communications
Received on Wednesday, 6 July 2016 14:51:04 UTC