- From: Andy Seaborne <andy@apache.org>
- Date: Thu, 17 Dec 2015 13:27:28 +0000
- To: public-sparql-dev@w3.org
On 16/12/15 13:57, Gary King wrote:
> Hi all,

Hi Gary,

>
> I’ve a question on the model query writers should have when reasoning about their work. I think the naive view would be that adding grouping would _not_ alter the meaning of the query but this is not the case.
>
> A first example comes from OPTIONAL. A query like
>
> select * {
>   ?a :ppp1 ?c .
>   {
>     optional { ?a :ppp2 ?d }
>   }
> }
>
> is technically equivalent to
>
> select * {
>   ?a :ppp1 ?c .
>   ?a :ppp2 ?d .
> }
>
> because the first form looks like
>
> (join (bgp ?a :ppp1 ?c)
>       (left-join identity (bgp ?a :ppp2 ?d)))
>
> and the left-join will discard no solutions at run-time leaving the outer-most join to merge the two patterns and _discard_ any ?a’s that don’t have a :ppp2 triple.

Consider:

Data: one triple:
  :a :ppp1 :c .

Eval:

Inner:
  ?a :ppp1 ?c .             => ?a=:a ?c=:c
  OPTIONAL { ?a :ppp2 ?d }  -> identity

then
  (join [?a=:a ?c=:c] identity) => [?a=:a ?c=:c]

Notice ?d: it is unbound in that row.

select * {
  ?a :ppp1 ?c .
  ?a :ppp2 ?d .
}

will fail to match if there is no :ppp2 triple, because

  {?a :ppp2 ?d .} => join-zero

where join-zero is a table of no rows and

  (join X join-zero) = join-zero

(actually, matching BGPs is not joining, but for simple entailment it is the same).

>
> This follows pretty clearly from the SPARQL definition but also seems surprising from a query writer’s perspective.
>
> A second example occurs with BIND.
>
> select * {
>   ?a :ppp1 ?c .
>   bind( 2 * ?c as ?twiceC )
> }
>
> is _not_ the same as
>
> select * {
>   ?a :ppp1 ?c .
>   {
>     bind( 2 * ?c as ?twiceC )
>   }
> }
>
> The second form will leave ?twiceC unbound as ?c has no binding inside of the BIND.

Yes. BIND is not like FILTER and this can be confusing.

> Note that this also means that you cannot distribute common patterns out of a union because
>
> select * {
>   {
>     ?x a ?p .
>     ?p ex:bar ?foo .
>   }
>   union
>   {
>     ?x a ?p .
>     bind(?p as ?foo)
>   }
> }
>
> is _not_ the same as
>
> select * {
>   ?x a ?p .
>   {
>     ?p ex:bar ?foo .
>   }
>   union
>   {
>     bind(?p as ?foo)
>   }
> }
>
>
> If my examples are correct, then SPARQL seems more difficult to reason about than it should be. I’d welcome comments and thoughts.
>

Agreed - the syntax evolved, and needs to satisfy very different communities. Maybe the syntax should have been explicitly functional style but that does not speak well to a different part of the user community.

BIND didn't exist in SPARQL 1.0.

It is tempting to read it from top to bottom, which was the evaluation model for the first version of SPARQL 1.0. The community decided it wanted a relational algebra style, so that it could leverage all the work on relational algebra optimization.

Ironically, some engines now execute top down, at least in part, which is effectively an index join when the scoping of variables does not get in the way, which is most of the time. It is space efficient.

  Andy

> thanks,
> --
> Gary Warren King, metabang.com
>
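
The evaluation traces in this message can be replayed with a small model of the algebra. What follows is a minimal sketch in Python, not the code of any SPARQL engine: the names `bgp`, `join`, `left_join`, `extend` and `IDENTITY` are made up for illustration, solutions are plain dicts, RDF terms are plain strings or ints, and the sample triples (the value 21, the `ex:bar` triple) are assumptions added so that each query has something to match.

```python
# Toy model of SPARQL algebra evaluation. All names are illustrative only.
IDENTITY = [{}]  # the unit table: one row with no bindings


def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def bgp(data, *patterns):
    """Match triple patterns against a set of triples; returns solution dicts."""
    solutions = [{}]
    for pattern in patterns:
        matched = []
        for mu in solutions:
            for triple in data:
                candidate = dict(mu)
                ok = True
                for p, t in zip(pattern, triple):
                    if is_var(p):
                        # Bind the variable, or check an existing binding.
                        if candidate.setdefault(p, t) != t:
                            ok = False
                            break
                    elif p != t:
                        ok = False
                        break
                if ok:
                    matched.append(candidate)
        solutions = matched
    return solutions


def compatible(m1, m2):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def join(left, right):
    return [{**m1, **m2} for m1 in left for m2 in right if compatible(m1, m2)]


def left_join(left, right):
    out = []
    for m1 in left:
        merged = [{**m1, **m2} for m2 in right if compatible(m1, m2)]
        out.extend(merged or [m1])  # keep the left row when nothing matches
    return out


def extend(solutions, var, expr):
    """BIND: add var = expr(mu); if expr fails, keep mu with var unbound."""
    out = []
    for mu in solutions:
        try:
            out.append({**mu, var: expr(mu)})
        except KeyError:  # expr used a variable that is not in scope
            out.append(dict(mu))
    return out


# OPTIONAL inside its own group vs. a flat two-pattern BGP.
data1 = {(":a", ":ppp1", ":c")}
outer = bgp(data1, ("?a", ":ppp1", "?c"))
grouped = join(outer, left_join(IDENTITY, bgp(data1, ("?a", ":ppp2", "?d"))))
flat = bgp(data1, ("?a", ":ppp1", "?c"), ("?a", ":ppp2", "?d"))

# BIND in the same group as the BGP vs. BIND inside a nested group.
data2 = {(":a", ":ppp1", 21)}
rows = bgp(data2, ("?a", ":ppp1", "?c"))
twice = lambda mu: 2 * mu["?c"]
bind_flat = extend(rows, "?twiceC", twice)
bind_grouped = join(rows, extend(IDENTITY, "?twiceC", twice))

# UNION with the common pattern repeated vs. hoisted out of the union.
data3 = {(":x", "a", ":T"), (":T", "ex:bar", ":f")}
common = bgp(data3, ("?x", "a", "?p"))
copy_p = lambda mu: mu["?p"]
union_orig = (bgp(data3, ("?x", "a", "?p"), ("?p", "ex:bar", "?foo"))
              + extend(common, "?foo", copy_p))
union_hoisted = join(common, bgp(data3, ("?p", "ex:bar", "?foo"))
                     + extend(IDENTITY, "?foo", copy_p))

for name in ("grouped", "flat", "bind_flat", "bind_grouped",
             "union_orig", "union_hoisted"):
    print(name, globals()[name])
```

In this model `grouped` keeps the row with ?d unbound while `flat` is empty, `bind_grouped` has no ?twiceC, and one row of `union_hoisted` lacks ?foo. The point of the sketch is only that each `{ ... }` group is evaluated on its own, starting from the unit table, before being joined with its surroundings.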
Received on Thursday, 17 December 2015 13:27:58 UTC