- From: Lee Feigenbaum <feigenbl@us.ibm.com>
- Date: Mon, 28 Aug 2006 16:22:56 -0400
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Hi everyone, I know that we've discussed this -- or similar -- before, but my searching hasn't found anything conclusive/authoritative. I've been working through some situations involving FILTERs in my implementation and cross-referencing with ARQ's behavior in SPARQLer (http://www.sparql.org) and also trying to work through the "proper" behavior according to the spec. ... And I'm having a lot of difficulties :) First, I find the spec (I'm working off of rq24 but don't believe this is any different for rq23 or the published CR) unclear as to what the scope of a FILTER is. The text in section 11 reads: """ Specifically, FILTERs eliminate any solutions that, when substituted into the expression, either result in an effective boolean value of false or produce an error. """ It does not explicitly state what solutions a FILTER is tested against, so I see two possibilities: 1) a FILTER operates globally, regardless of where it occurs in a query (across UNIONs, GRAPHs, groups, OPTIONALs, BGPs, etc.) 2) a FILTER is limited to restricting the solutions of the FilteredBasicGraphPattern in which it occurs I suspect #2 is the intended meaning, but because it is tied so closely to the grammar and because this part of the grammar is defined recursively ( FilteredBasicGraphPattern ::= BlockOfTriples? ( Constraint '.'? FilteredBasicGraphPattern )? ), this seems to make the behavior of FILTERs order-dependent. >From some experiments with SPARQLer, I'm still at a bit of a loss: Data (available at: http://thefigtrees.net/lee/sw/data/g.n3 ) @prefix : <http://example.org/> . :s1 :p1 :o1 . :s2 :p2 :o2 . And consider the query (query #1): PREFIX : <http://example.org/> SELECT * FROM <http://thefigtrees.net/lee/sw/data/g.n3> { ?s :p1 :o1 . ?t :p2 :o2 . } which has solution: ?s ?t -- -- :s1 :s2 Now, we add a FILTER (query #2): PREFIX : <http://example.org/> SELECT * FROM <http://thefigtrees.net/lee/sw/data/g.n3> { ?s :p1 :o1 . ?t :p2 :o2 . FILTER (?t = :s2) . } and we get the same single solution: ?s ?t -- -- :s1 :s2 Now, we change the position of the FILTER (query #3): PREFIX : <http://example.org/> SELECT * FROM <http://thefigtrees.net/lee/sw/data/g.n3> { FILTER (?t = :s2) . ?s :p1 :o1 . ?t :p2 :o2 . } and now we get *zero* solutions. So my first questions are: is this behavior what people expect from SPARQL? is this behavior somehow explained by the current spec text? >From the SPARQL grammar's point of view, I believe the first of these is: FilteredBasicGraphPattern(BlockOfTriples(...), Constraint(...), FilteredBasicGraphPattern()) while the second is FilteredBasicGraphPattern(Constraint(...), FilteredBasicGraphPattern(BlockOfTriples(...))) I would think in both cases that the two triples occur in the same BGP, and so the solution {?s/:s1, ?t/:s2} should be FILTERed, and not eliminated. The observed behavior of SPARQLer in query #3 could only occur (I think) if the FILTER is applied individually to the solutions from each triple pattern before those solutions are joined together. Things get more confusing (for me, at least). Consider: http://www.w3.org/2001/sw/DataAccess/tests/#dawg-bound-query-001 Data: @prefix : <http://example.org/ns#> . :a1 :b :c1 . :c1 :d :e . :a2 :b :c2 . :c2 :b :f . Query (query #4): PREFIX : <http://example.org/ns#> SELECT ?a ?c WHERE { ?a :b ?c . OPTIONAL { ?c :d ?e } . FILTER (! bound(?e)) } Expected results: ?c ?a -- -- <http://example.org/ns#c2> <http://example.org/ns#a2> <http://example.org/ns#f> <http://example.org/ns#c2> These results rely on the three solutions of the OPTIONAL binary-operator being FILTERED by the filter. But the OPTIONAL does not appear in the same FilteredBasicGraphPattern as the FILTER constraint (of course, that's not physically possible). The parse tree for this is something akin to: Group( Optional(BGP(?a :b ?c), BGP(?c :d ?e)), FilteredBGP(BGP(), Constraint(!bound(?e))) ) It would require a reading of the spec that takes filters that appear as siblings within a group and applies them to all other siblings in a group to get the behavior expected by this test (SPARQLer conforms to the behavior expected by this test). But I have a very hard time seeing how any explanation for the behavior expected by this test can also have the diverging behavior that I see from SPARQLer in query #2 and query #3. ... I could go on (e.g., everyone probably agrees that FILTERs don't affect solutions in other branches of a UNION), but I don't think I'll enlighten myself any further at this point and it would probably just make this message more rambling, so I'll stop for now and hope to be corrected as to what I'm missing or at least pointed to discussions of this which I'm pretty sure have occurred before. thanks, Lee
Received on Monday, 28 August 2006 20:23:08 UTC