- From: Steve Harris <steve.harris@garlik.com>
- Date: Thu, 28 Jun 2007 11:25:08 +0100
- To: andy.seaborne@hp.com
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
On 27 Jun 2007, at 17:52, Seaborne, Andy wrote: > > > Steve Harris wrote: >> I've a question regarding the filter-scope-1 test, I would have >> expected the results below, which would be equivalent to FILTER >> (0) after http://www.w3.org/TR/rdf-sparql-query/ >> ยง 6.2, but instead I see something equivalent to if the OPTIONAL >> block were missing entirely. >> I can't find anything in the spec which suggests that evaluating >> a FILTER() expression with out of scope variables should cause >> the entire enclosing block to be disregarded. Though, I may have >> just missed it. >> I'm also not sure I like this scoping rule, it rules out some >> potentially useful query forms, like: >> SELECT * >> WHERE { >> ?person :department ?dept . >> OPTIONAL { >> ?person :project ?proj . >> FILTER(?dept = :technology) >> } >> } > > This is not the same case as the next example because of teh extra > {} below: This is: > > (leftjoin > (BGP [triple ?person :department ?dept]) > (BGP [triple ?person :project ?proj]) > (= ?dept <http://example/technology>)) Yeah, Lee explained that to me on IRC, I meant to post something saying what I'd missed last night, but forgot. Sorry. I have {} blindness apparently, my mental parser just chops them out. I do regard it as odd (and undesirable) that :x :p ?v . { :x :q ?w . OPTIONAL { :x :p ?v2 } FILTER (...) } is different to :x :p ?v . :x :q ?w . OPTIONAL { :x :p ?v2 FILTER (...) } but that's maybe just down to my defective mental parser :) > ?dept is in scope because it becomes part of the left join condition. > > ?person :project ?proj is included when (?dept = :technology). > >> Data: >> @prefix : <http://example/> . >> :x :p 1 . >> :x :p 2 . >> :x :p 3 . >> :x :p 4 . >> :x :q 1 . >> :x :q 2 . >> :x :q 3 . >> Query: >> PREFIX : <http://example/> >> SELECT * >> { >> :x :p ?v . >> { :x :q ?w >> OPTIONAL { :x :p ?v2 FILTER(?v = 1) } >> } >> } > > Because of the extra {} this is a different structure to the first > example: > > (join > (BGP [triple :x :p ?v]) > (leftjoin > (BGP [triple :x :q ?w]) > (BGP [triple :x :p ?v2]) > (= ?v 1)) > ) Or in slightly more steve-friendly notation: (triple :x :p ?v) [X] ( (triple :x :q ?w) ]X] (triple :x :p ?v2) s(?v = 1) ) but actually I think it must be equivalent to (triple :x :p ?v) [X] ( (triple :x :q ?w) ]X] ( (triple :x :p ?v2) s(? v = 1) ) ) Otherwise there would be no bindings >> What I would expect: (excuse the use of NULL) > > I'm having difficulty at this point : Could you explain why you > expect 48 (4*3*4) results all of which have ?v2 null? Even old > style, that wouldn't happen (i.e. get 4 matches when ?v = 1 and one > otherwise). It's to do with my execution method. I do all the joins, then apply the FILTERs afterwards, this used to be fine, and was a valid optimisation under the old logic, but isn't anymore. [ it might seem like an odd optimisation, but it's possible for me to perform many algebraic joins in a very parallel way, but not FILTERs, so the FILTERs would end up serialising everything ] > It seems to have include { :x :p ?v2 } matches multiple times even > when ?v != 1 And for whatever evaluation we have, why is ?v2 null > when ?v = 1? Well, I don't do enough inspection to know that ?v = 1 cannot possibly be true, so I evaluate the join, then later, by inspection I can tell that the OPTIONAL branch cannot succeed, as ?v = 1 must always fail. But I've already done the join at that point. I now get that this optimisation cannot be applied in the general case anymore (with REDUCED, if I make things DISTINCT internally I think it's still valid), I'll have to add some code that determines when it can be applied and when I have to halt and inline the FILTER evaluation. Unfortunately the current version of Dave's parser flattens out plain {} expressions, which made query processing easier, but there's not enough information left to tell the difference. Changing that is probably going to break my execution engine, but it seems like it needs a radical overhaul anyway. > In evaluation terms, the extra nesting given by the {} introduce a > JOIN. There are the same number of triple patterns, so the same number of joins, surely? I also don't get why it's desirable to have {}s affect the scope of variables, especially when OPTIONAL{} and GRAPH{} don't. Lee said it's something to do with having different FILTER expressions in different branches of a UNION, but that doesn't seem like a good enough reason to add such an apparently arbitrary scoping rule. There's probably some underlying logic, but it's not clear to me what it is. - Steve -- Steve Harris Garlik Limited Gainsborough House 2 Sheen Road Richmond TW9 1AE T +44(0)20 8973 2465 F +44(0)20 8973 2301 www.garlik.com Registered in England and Wales 535 7233 VAT # 849 0517 11 Registered office: 42-46 High Street, Esher, Surrey KT10 9QY
Received on Thursday, 28 June 2007 10:25:44 UTC