- From: Steve Harris <steve.harris@garlik.com>
- Date: Thu, 28 Jun 2007 15:53:35 +0100
- To: andy.seaborne@hp.com
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
On 28 Jun 2007, at 14:10, Seaborne, Andy wrote: > > > Steve Harris wrote: >>> This is not the same case as the next example because of teh >>> extra {} below: This is: >>> >>> (leftjoin >>> (BGP [triple ?person :department ?dept]) >>> (BGP [triple ?person :project ?proj]) >>> (= ?dept <http://example/technology>)) >> Yeah, Lee explained that to me on IRC, I meant to post something >> saying what I'd missed last night, but forgot. Sorry. >> I have {} blindness apparently, my mental parser just chops them out. >> I do regard it as odd (and undesirable) that >> :x :p ?v . { :x :q ?w . OPTIONAL { :x :p ?v2 } FILTER (...) } >> is different to >> :x :p ?v . :x :q ?w . OPTIONAL { :x :p ?v2 FILTER (...) } >> but that's maybe just down to my defective mental parser :) > > It's a consequence of bottom-up evaluation. The {} have scope > implications. Yes, but only in some cases it seems. > Consider evaluating some function, which has two arguments each of > which use a variable x (could be a relational algebra but the point > is general evaluation) > > function(A(x), B(x)) > > Does B(x) see the value of x from A(x)? Bottom up, the answer is > "no": the evaluation of A(x) and B(x) are isolated from each other. Well, it depends what you mean by "see" surely? It's an implementation decision whether you pass values of x on, you can do A (x1), b(x2), x = x1 U x2 if you like. SPARQL is still supposed to be a declarative language AFAICT. >>> ?dept is in scope because it becomes part of the left join >>> condition. >>> >>> ?person :project ?proj is included when (?dept = :technology). >>> >>>> Data: >>>> @prefix : <http://example/> . >>>> :x :p 1 . >>>> :x :p 2 . >>>> :x :p 3 . >>>> :x :p 4 . >>>> :x :q 1 . >>>> :x :q 2 . >>>> :x :q 3 . >>>> Query: >>>> PREFIX : <http://example/> >>>> SELECT * >>>> { >>>> :x :p ?v . >>>> { :x :q ?w >>>> OPTIONAL { :x :p ?v2 FILTER(?v = 1) } >>>> } >>>> } >>> Because of the extra {} this is a different structure to the >>> first example: >>> >>> (join >>> (BGP [triple :x :p ?v]) >>> (leftjoin >>> (BGP [triple :x :q ?w]) >>> (BGP [triple :x :p ?v2]) >>> (= ?v 1)) >>> ) >> Or in slightly more steve-friendly notation: >> (triple :x :p ?v) [X] ( (triple :x :q ?w) ]X] (triple :x :p ?v2) s >> (?v = 1) ) > > :-) > > But note that SPARQL LeftJoin has three arguments, left and right > table,s and the leftjoin condition (left theta join). The > condition can access teh variables in both left and right tables. SPARQL OPTIONALs are not generally theta joins, but I'll let that one pass :) I think (hope) it's true that OPTIONAL(A, B, F) = (A ]X] (B s(F))), modulo the scope thing. >> but actually I think it must be equivalent to >> (triple :x :p ?v) [X] ( (triple :x :q ?w) ]X] ( (triple :x :p ?v2) >> s(? v = 1) ) ) >> Otherwise there would be no bindings >>>> What I would expect: (excuse the use of NULL) >>> I'm having difficulty at this point : Could you explain why you >>> expect 48 (4*3*4) results all of which have ?v2 null? Even old >>> style, that wouldn't happen (i.e. get 4 matches when ?v = 1 and >>> one otherwise). >> It's to do with my execution method. I do all the joins, then >> apply the FILTERs afterwards, this used to be fine, and was a >> valid optimisation under the old logic, but isn't anymore. >> [ it might seem like an odd optimisation, but it's possible for me >> to perform many algebraic joins in a very parallel way, but not >> FILTERs, so the FILTERs would end up serialising everything ] > > Not odd at all - I know SQL optimizers that do exactly the same for > exactly the same reasons. OK, well in any case I'll have to do inline FILTER evaluation in some cases, in order to get the cardinality correct. I think in the absence of FILTERs inside OPTIONAL/UNIONs I'm still OK to apply FILTERs at the end. >>> It seems to have include { :x :p ?v2 } matches multiple times >>> even when ?v != 1 And for whatever evaluation we have, why is ? >>> v2 null when ?v = 1? >> Well, I don't do enough inspection to know that ?v = 1 cannot >> possibly be true, so I evaluate the join, then later, by >> inspection I can tell that the OPTIONAL branch cannot succeed, >> as ?v = 1 must always fail. But I've already done the join at >> that point. >> I now get that this optimisation cannot be applied in the general >> case anymore (with REDUCED, if I make things DISTINCT internally >> I think it's still valid), I'll have to add some code that >> determines when it can be applied and when I have to halt and >> inline the FILTER evaluation. >> Unfortunately the current version of Dave's parser flattens out >> plain {} expressions, which made query processing easier, but >> there's not enough information left to tell the difference. >> Changing that is probably going to break my execution engine, but >> it seems like it needs a radical overhaul anyway. >>> In evaluation terms, the extra nesting given by the {} introduce >>> a JOIN. >> There are the same number of triple patterns, so the same number >> of joins, surely? > > Same number of joins - different scoping. Basic graph patterns are > not merely a join of the triple patterns until filters have been > moved around. Nor if it's not simple entailment. > >> I also don't get why it's desirable to have {}s affect the scope >> of variables, especially when OPTIONAL{} and GRAPH{} don't. > > They do act in the same way. > > { pattern1 } join { pattern2 } > > the evaluation of operator lets variables in the left and right > equate. OPTIONAL and GRAPH are no different really. Well, they behave differently. (1) ?x :p ?y { ?x :r ?w } OPTIONAL { ?x :q ?z } and (2) ?x :p ?y OPTIONAL { ?x :r ?w } OPTIONAL{ ?x :q ?z } are different as far as I can tell. IIUC the righthand optional only shares the middle ?x in (1) and both in (2)? It seems to me that the whole language would be much easier to understand and implement if there were no scope. It's my feeling that the CR document does not explain them adequately in any case. I've read every section that mentions "scope" (mostly concerning only bNodes AFAICT) twice, and the whole document once through, and I still don't see where it's defined, or see an illuminating example. - Steve
Received on Thursday, 28 June 2007 14:53:46 UTC