- From: Lee Feigenbaum <feigenbl@us.ibm.com>
- Date: Mon, 28 Aug 2006 16:22:56 -0400
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Hi everyone,
I know that we've discussed this -- or similar -- before, but my searching
hasn't found anything conclusive/authoritative. I've been working through
some situations involving FILTERs in my implementation and
cross-referencing with ARQ's behavior in SPARQLer (http://www.sparql.org)
and also trying to work through the "proper" behavior according to the
spec.
... And I'm having a lot of difficulties :)
First, I find the spec (I'm working off of rq24 but don't believe this is
any different for rq23 or the published CR) unclear as to what the scope
of a FILTER is. The text in section 11 reads:
"""
Specifically, FILTERs eliminate any solutions that, when substituted into
the expression, either result in an effective boolean value of false or
produce an error.
"""
It does not explicitly state what solutions a FILTER is tested against, so
I see two possibilities:
1) a FILTER operates globally, regardless of where it occurs in a query
(across UNIONs, GRAPHs, groups, OPTIONALs, BGPs, etc.)
2) a FILTER is limited to restricting the solutions of the
FilteredBasicGraphPattern in which it occurs
I suspect #2 is the intended meaning, but because it is tied so closely to
the grammar and because this part of the grammar is defined recursively (
FilteredBasicGraphPattern ::= BlockOfTriples? ( Constraint '.'?
FilteredBasicGraphPattern )? ), this seems to make the behavior of
FILTERs order-dependent.
From some experiments with SPARQLer, I'm still at a bit of a loss:
Data (available at: http://thefigtrees.net/lee/sw/data/g.n3 )
@prefix : <http://example.org/> .
:s1 :p1 :o1 .
:s2 :p2 :o2 .
And consider the query (query #1):
PREFIX : <http://example.org/>
SELECT *
FROM <http://thefigtrees.net/lee/sw/data/g.n3>
{
?s :p1 :o1 .
?t :p2 :o2 .
}
which has solution:
?s ?t
-- --
:s1 :s2
Now, we add a FILTER (query #2):
PREFIX : <http://example.org/>
SELECT *
FROM <http://thefigtrees.net/lee/sw/data/g.n3>
{
?s :p1 :o1 .
?t :p2 :o2 .
FILTER (?t = :s2) .
}
and we get the same single solution:
?s ?t
-- --
:s1 :s2
Now, we change the position of the FILTER (query #3):
PREFIX : <http://example.org/>
SELECT *
FROM <http://thefigtrees.net/lee/sw/data/g.n3>
{
FILTER (?t = :s2) .
?s :p1 :o1 .
?t :p2 :o2 .
}
and now we get *zero* solutions.
So my first questions are: Is this behavior what people expect from
SPARQL? Is it somehow explained by the current spec text?
From the SPARQL grammar's point of view, I believe query #2 parses as:
FilteredBasicGraphPattern(BlockOfTriples(...), Constraint(...),
FilteredBasicGraphPattern())
while query #3 parses as:
FilteredBasicGraphPattern(Constraint(...),
FilteredBasicGraphPattern(BlockOfTriples(...)))
I would think in both cases that the two triples occur in the same BGP,
and so the solution {?s/:s1, ?t/:s2} should be tested against the FILTER
and kept, not eliminated. The observed behavior of SPARQLer in query #3 could only occur
(I think) if the FILTER is applied individually to the solutions from each
triple pattern before those solutions are joined together.
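The difference between the two readings can be sketched in a few lines of Python. This is only a toy model of the data and queries above, not ARQ's actual algorithm; the helper names (match, join, keep) are hypothetical, and it assumes that a FILTER over an unbound variable raises an error and so drops the solution.

```python
# Toy model of two possible FILTER readings; not ARQ's actual algorithm.
TRIPLES = [(":s1", ":p1", ":o1"), (":s2", ":p2", ":o2")]

def match(var, pred, obj):
    """Solutions for the triple pattern `?var pred obj`."""
    return [{var: s} for (s, p, o) in TRIPLES if (p, o) == (pred, obj)]

def join(left, right):
    """Join two solution lists, keeping only compatible pairs."""
    out = []
    for a in left:
        for b in right:
            if all(a[k] == b[k] for k in a.keys() & b.keys()):
                out.append({**a, **b})
    return out

def keep(solutions, var, value):
    """FILTER (?var = value). An unbound ?var is treated as an error,
    so that solution is dropped (assumption stated in the lead-in)."""
    return [s for s in solutions if s.get(var) == value]

s_solutions = match("?s", ":p1", ":o1")
t_solutions = match("?t", ":p2", ":o2")

# Reading A: the FILTER is tested against the joined solutions of the BGP.
whole_bgp = keep(join(s_solutions, t_solutions), "?t", ":s2")

# Reading B: the FILTER is tested against each triple pattern's solutions
# before they are joined. ?t is unbound in the ?s solutions, so nothing survives.
per_pattern = join(keep(s_solutions, "?t", ":s2"),
                   keep(t_solutions, "?t", ":s2"))

print(whole_bgp)
print(per_pattern)
```

Under these assumptions, reading A yields the single solution seen for query #2, and reading B yields the empty result seen for query #3.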
Things get more confusing (for me, at least). Consider:
http://www.w3.org/2001/sw/DataAccess/tests/#dawg-bound-query-001
Data:
@prefix : <http://example.org/ns#> .
:a1 :b :c1 .
:c1 :d :e .
:a2 :b :c2 .
:c2 :b :f .
Query (query #4):
PREFIX : <http://example.org/ns#>
SELECT ?a ?c
WHERE {
?a :b ?c .
OPTIONAL { ?c :d ?e } .
FILTER (! bound(?e))
}
Expected results:
?c ?a
-- --
<http://example.org/ns#c2> <http://example.org/ns#a2>
<http://example.org/ns#f> <http://example.org/ns#c2>
These results rely on the three solutions of the OPTIONAL binary operator
being tested by the FILTER. But the OPTIONAL does not appear in the same
FilteredBasicGraphPattern as the FILTER constraint (of course, that's not
physically possible). The parse tree for this is something akin to:
Group( Optional(BGP(?a :b ?c), BGP(?c :d ?e)), FilteredBGP(BGP(),
Constraint(!bound(?e))) )
It would require a reading of the spec that takes filters that appear as
siblings within a group and applies them to all other siblings in a group
to get the behavior expected by this test (SPARQLer conforms to the
behavior expected by this test). But I have a very hard time seeing how
any explanation for the behavior expected by this test can also have the
diverging behavior that I see from SPARQLer in query #2 and query #3.
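For comparison, here is a minimal Python sketch of the group-level reading that the test seems to assume: evaluate the OPTIONAL as a left join first, then apply the FILTER to every solution of the enclosing group. The helper names (match, compatible, left_join) are hypothetical, and this models only the test data above, not any particular implementation.

```python
# Toy model of "FILTER applies to the whole group" for query #4.
TRIPLES = [
    (":a1", ":b", ":c1"),
    (":c1", ":d", ":e"),
    (":a2", ":b", ":c2"),
    (":c2", ":b", ":f"),
]

def match(s_var, pred, o_var):
    """Solutions for the triple pattern `?s_var pred ?o_var`."""
    return [{s_var: s, o_var: o} for (s, p, o) in TRIPLES if p == pred]

def compatible(a, b):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())

def left_join(left, right):
    """OPTIONAL: extend each left solution where possible, else keep it as is."""
    out = []
    for a in left:
        extended = [{**a, **b} for b in right if compatible(a, b)]
        out.extend(extended or [a])
    return out

required = match("?a", ":b", "?c")      # ?a :b ?c
optional = match("?c", ":d", "?e")      # OPTIONAL { ?c :d ?e }
group = left_join(required, optional)   # all solutions of the group

# FILTER (! bound(?e)) tested against every solution of the group.
result = [(s["?a"], s["?c"]) for s in group if "?e" not in s]
print(result)
```

In this model the left join produces three solutions, and the FILTER removes only the one in which ?e is bound, which matches the expected results of the test.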
...
I could go on (e.g., everyone probably agrees that FILTERs don't affect
solutions in other branches of a UNION), but I don't think I'll enlighten
myself any further at this point and it would probably just make this
message more rambling, so I'll stop for now and hope to be corrected as to
what I'm missing or at least pointed to discussions of this which I'm
pretty sure have occurred before.
thanks,
Lee
Received on Monday, 28 August 2006 20:23:08 UTC