Re: Filter scope and optionals from Jorge Pérez on 2006-10-31 (public-rdf-dawg-comments@w3.org from October 2006)

From: Jorge Pérez <jperez@ing.puc.cl>
Date: Tue, 31 Oct 2006 11:50:14 -0300 (CLST)
To: andy.seaborne@hp.com
Cc: public-rdf-dawg-comments@w3.org
Message-ID: <57725.146.155.4.12.1162306214.squirrel@mail.ing.puc.cl>
Seaborne, Andy wrote:
>
>
>
> Jorge Pérez wrote:
>> Hi Andy,
>>
>>> If the proposed filter scoping rule is taken naively, there isn't a way
>>> to
>>> get
>>> value testing conditions into the left join conditions for optional
>>> (unlike
>>> the join conditions for a group).
>>>
>>> Here's an example:
>>>
>>> ---- Data
>>> @prefix   :         <http://example/> .
>>>
>>> :x1 :p 1 .
>>> :x2 :p 2 .
>>>
>>> :x3 :q 3 .
>>> :x3 :q 4 .
>>> ---- Data
>>>
>>>
>>> ---- Query
>>> PREFIX :    <http://example/>
>>>
>>> SELECT *
>>> {
>>>    ?x :p ?v .
>>>    OPTIONAL
>>>    {
>>>      ?y :q ?w .
>>>      FILTER(?v=2)
>>>    }
>>> }
>>> ---- Query
>>
>> I personally think that this kind of queries are interesting from the
>> user
>> point of view, but because they need a *dynamic scope* evaluation (I
>> really don't know how to call it, but I think you understand) the
>> language
>> looses the compositionality. What I mean is that the *expression*
>>
>> {
>>   ?y :q ?w .
>>   FILTER(?v=2)
>> }
>>
>> has different meanings depending on where it appears on the query. Here
>> you are discussing the appearance of it in the right hand side of an
>> OPTIONAL, but it would appear in a parallel group, in a UNION, inside a
>> GRAPH, etc. Also here the expression in the filter is very simple but I
>> think that with bound\!bound things could be more complex.
>>
>>>
>>> ---- Results 1
>>> ---------------------
>>> | x   | v | y   | w |
>>> ====================>> | :x2 | 2 | :x3 | 4 |
>>> | :x2 | 2 | :x3 | 3 |
>>> | :x1 | 1 |     |   |
>>> ---------------------
>>>
>>> ---- Results 2
>>> -------------------
>>> | x   | v | y | w |
>>> ==================>> | :x2 | 2 |   |   |
>>> | :x1 | 1 |   |   |
>>> -------------------
>>>
>>> Results 1 is what the current doc (declarative form) gets - and would
>>> be
>>> the
>>> same if the FILTER were part of the left outer join conditions.
>>>
>>> Results 2 is what a purely filter-scoped-to-group gets because the
>>> optional
>>> part never matches because ?v is an unbound variable.
>>>
>>>
>>> I can't see any way of getting filter into the scope of the left outer
>>> join.
>>> The only way I can see of rewriting this query is to use a UNION and
>>> repeat
>>> the fixed pattern on each side of the union, one side including the
>>> filter
>>> and
>>>   optional triple pattern. That query is going to give different
>>> answers
>>> and
>>> the app will have to post-process the results to find out what went on.
>>
>> And the following query?
>>
>> SELECT *
>> {
>>    ?x :p ?v .
>>    OPTIONAL
>>    {
>>      ?x :p ?v .
>>      ?y :q ?w .
>>      FILTER(?v=2)
>>    }
>> }
>>
>> I think (but I do not have a proof of it), that giving a dinamic scope
>> for
>> filters in optionals does not increase the expressiveness of the
>> language
>> compared with the compositional-algebraic-operations approach, and one
>> can
>> always obtain the same result repeating the pattern inside the optional
>> (as in the above query).
>
> It may not change the expressivity but it isn't convenient for the
> application
> writer.  The example here has one triple pattern - in a realistic query,
> it
> would be 10's of triple patterns.  Repeating it would be a significant
> burden
> and potentially error prone, for example if the repeated pattern needs
> changing.
>
> It does suggest an approach though, which is to define optional, not
> directly
> as a left join operation but to do that repetition:
>
> OPT(A,B) = LeftJoin(A, Join(A, B))
>
> but filters need to be handled so that itself does not work.
>
> I haven't worked though properly but an idea:
>
> Write Group(P,F) where P a pattern, F is a filter restriction,
> scope filters over the evaluation of the pattern:
> then Group(P,F) = F(P) = [[ [[P]] FILTER F ]]
>
> so:
>
> OPT(Group(P1, F1), Group(P2,F2))      LeftJoin(F1(P1), F2( Join(F1(P1),
> P2) ))
>
> and F2 can mention variables in P1
>
> which is  F1(P1) LeftJoin (F1(P1) Join P2) ON (F2 and matching vars)
> (in a vaguely SQL-ish notation)
>
> which looks like it might be  F1(P1) LeftJoin P2 ON (F2 and matching vars)
>
> If F2 only mentions variables from P2 and not P1, then it can be moved in
> to
> apply directly to P2.
>
> If F2 mentions a variable x from P1 and P2, then x must have the same
> value in
> each so its the same as applying F2 to P2 alone.
>
> (Caveat - design-by-email.  Further work and testing needed!)
>
>
>>
>>> Intuitively, the scope rule needs to be modified to put the filter just
>>> outside the group, allowing it to be part of the left outer join, and
>>> not
>>> just
>>> inside the group.
>>
>> I think it is not so simple, because you can have patterns like
>>
>> { ?x :p ?v
>>   OPTIONAL {
>>      ?y :p ?w
>>      OPTIONAL {
>>         ?z :q ?u .
>>         FILTER (?v=2) } } }
>>
>> and then the problem affect the semantics of nested optionals too.
>
> I wasn't thinking of extending the scoping to allow this.  I don't see it
> happening for real nearly as much as the one level form above.
>
> By allowing the scope to be just outside the containing optional part, the
> FILTER in the innermost optional does not have access to the outermost "?x
> :p
> ?v" and ?v is unbound for the restriction.

That is why I think it would be strange, why the scope is only the
containing optional part, and not all the *path of optionals*? and what
happend with parallel optionals, with UNION, with groups? an example

{
  { ?x :p ?v .
      { ?y :p ?w . FILTER (?v = 2) }
  }
}

I think it would be a bad design to change the scope of filter depending
on where and how they are used. I think it would be strange for users and
for developers (even in the case that it just change for one level of
OPTIONAL).

What I've called *dynamic* scope for filter is interesting, intuitivelly
the scope of the filter is inherited from the containing groups. The
problem is that it is incompatible with a pure algebraic approach.

- jorge

>
>
>>
>>> Any alternatives?
>>> How do we formalize this?
>>
>> One approach is to formalize OPTIONAL + FILTER in a navigational way (as
>> the current spec).
>> Another approach to formalize the dinamic scoping (without loosing
>> compositionality) is to think that the filter is *generating bindings*,
>> for example from
>>
>> {
>>   ?y :q ?w .
>>   FILTER(?v=2)
>> }
>>
>> results the solutions for ?y :q ?w defined also in variable ?v but with
>> fixed value 2 for ?v. What I do not like of this approach is that FILTER
>> is not filtering, it is *forcing* some bindings, and formal definitions
>> would become complex in the case of restrictions like ?x = ?y, or
>> restrictions with negations.
>
> I don't like this either (FILTER(?v < 3) for example) but it does show the
> asymmetry here: variables mentioned in the pattern play a part in the
> left-join condition but variables mentioned in the filter don't.
>
>>
>> I'm personally in favor of the Results 2 above (filter-scoped-to-group).
>>
>> - jorge
>>
>>
>>> 	Andy
>
> Thanks for the prompt comments
>
> 	Andy
>
>>>
>>>
>>
>>
>
>
>
Received on Tuesday, 31 October 2006 15:15:31 UTC