Re: from Seaborne, Andy on 2006-12-07 (public-rdf-dawg-comments@w3.org from December 2006)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Thu, 07 Dec 2006 17:02:10 +0000
To: jperez@ing.puc.cl
CC: public-rdf-dawg-comments@w3.org
Message-ID: <45784912.1040804@hp.com>
Jorge Pérez wrote:
 >
 > Only to state my point again:
 >
 > I think that the non-standard definition of Diff and LeftJoin having an
 > expression as one of the arguments, only to provide users with a *one
 > level* conditional OPT, is a bad design decision, and will raise a lot of
 > problems in understanding and implementations in the future. How you
 > explain users that they can make a *one level* conditional OPT and cannot
 > make a *two level* conditional OPT? and cannot make something similar with
 > other operators?
 >
 > http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2006Oct/0032.html
 >
 > - jorge

Firstly, let's note that we are talking about queries you characterised in the
"well-designed" terminology as being not "well-designed".

In compositional semantics, doubly nested optionals with a variable not
mentioned in between are treated as not allowing the inner use of the
variable to be influenced by the outer use.  So we already have a one-level
relationship with just graph patterns.

[I've never seen doubly nested queries except for examples in these
discussions so changing their semantics is acceptable.  I've never seen the
non-filter case arise either.]


I view the nesting of FILTERs as the same situation.

{ ?x :q ?v . OPTIONAL { ?x :p ?w } }

{ ?x :q ?v . OPTIONAL { ?y :p ?w FILTER(?x = ?y) } }


A one-level optional in a nested OPTIONAL expression is what compositional 
semantics gets you for graph patterns, and having much the same for filters is 
therefore what I see as natural.  It is already being proposed to change the 
semantics for the non-filter case there to be compositional; the  FILTER 
situation is analogous - if the variable isn't mentioned in the intermediate 
level it isn't in-scope to the filter.  I would find it most strange if the 
rules for FILTER and mentioned variables did not correspond.
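
A minimal sketch in Python of the second query above, assuming a toy triple
list and hypothetical helper names (match, left_join, same_term and so on are
illustrative, not taken from any SPARQL implementation).  It contrasts
evaluating the FILTER as part of the LeftJoin with evaluating it only over the
inner pattern, where ?x is not in scope:

```python
# Toy data: three triples (subject, predicate, object). Assumed for illustration.
TRIPLES = [("a", "q", 1), ("a", "p", 10), ("b", "p", 20)]

def match(pattern, triples):
    """Solutions for one triple pattern; terms starting with '?' are variables."""
    solutions = []
    for triple in triples:
        mu = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if mu.setdefault(term, value) != value:
                    break  # same variable bound to two different values
            elif term != value:
                break      # constant term does not match
        else:
            solutions.append(mu)
    return solutions

def compatible(m1, m2):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(m2[k] == v for k, v in m1.items() if k in m2)

def same_term(a, b):
    """FILTER(a = b); an unbound variable is an error, treated here as false."""
    return lambda mu: a in mu and b in mu and mu[a] == mu[b]

def filter_solutions(omega, expr):
    return [mu for mu in omega if expr(mu)]

def left_join(left, right, expr=lambda mu: True):
    """Keep each left solution, extended where a compatible right solution
    passes expr, and unextended where none does."""
    result = []
    for m1 in left:
        merged = [{**m1, **m2} for m2 in right if compatible(m1, m2)]
        kept = [mu for mu in merged if expr(mu)]
        result.extend(kept or [m1])
    return result

A = match(("?x", "q", "?v"), TRIPLES)   # ?x :q ?v
B = match(("?y", "p", "?w"), TRIPLES)   # ?y :p ?w
cond = same_term("?x", "?y")            # FILTER(?x = ?y)

with_expr = left_join(A, B, cond)                      # filter sees ?x
inside_only = left_join(A, filter_solutions(B, cond))  # ?x unbound in B
print(with_expr)    # [{'?x': 'a', '?v': 1, '?y': 'a', '?w': 10}]
print(inside_only)  # [{'?x': 'a', '?v': 1}]
```

With the expression carried by the LeftJoin, the optional part binds ?y and
?w; with the filter applied to the inner pattern alone, ?x is unbound, the
filter rejects everything, and the optional part never matches.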


There are 3 classes of approach:

1/ Define LeftJoin to involve an expression
2/ Define A OPT B as LeftJoin(A, Join(A,B)) done as a top-down transformation.
3/ Don't allow it and require the application to repeat the pattern if 
necessary as you described (2006Oct/0030.html)

(2) is a class of approaches that copies the left side down the right side, so
it gives the effect of the original declarative semantics with compositional
evaluation.  There has been little support for this, which I take to be a
willingness to change the "doubly-nested OPTIONALs without all variables
mentioned in each level" case.  For queries that do not involve 2+ nesting, or
that do mention the variables at each intermediate level, the evaluation could
be done as LeftJoin(A, B), relying on the query processor to spot this case as
an optimization.

[I haven't proved case 2 - I have just worked through some examples - because
there was no enthusiasm for it as an approach, whether it worked or not, so it
wasn't worth pursuing.]
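
A small sketch of the case 2 idea in Python, under the assumption that
solutions are plain dicts and with hypothetical helper names (join, left_join,
x_equals_y).  It checks a single example and is in no way a proof: copying the
left side down, filtering there, and then doing a plain LeftJoin gives the
same answer here as a LeftJoin that carries the expression.

```python
def compatible(m1, m2):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(m2[k] == v for k, v in m1.items() if k in m2)

def join(left, right):
    return [{**m1, **m2} for m1 in left for m2 in right if compatible(m1, m2)]

def left_join(left, right, expr=lambda mu: True):
    result = []
    for m1 in left:
        kept = [mu for mu in join([m1], right) if expr(mu)]
        result.extend(kept or [m1])
    return result

def x_equals_y(mu):
    """FILTER(?x = ?y); unbound variables make the filter false."""
    return "?x" in mu and "?y" in mu and mu["?x"] == mu["?y"]

# Assumed solution sets for A = { ?x :q ?v } and B = { ?y :p ?w }.
A = [{"?x": "a", "?v": 1}, {"?x": "c", "?v": 2}]
B = [{"?y": "a", "?w": 10}, {"?y": "b", "?w": 20}]

# Case 1: the expression is an argument of LeftJoin.
with_expr = left_join(A, B, x_equals_y)

# Case 2: copy A down the right side, filter there, then a plain LeftJoin.
copied_down = left_join(A, [mu for mu in join(A, B) if x_equals_y(mu)])

print(with_expr == copied_down)
```

In this example both forms keep the unmatched solution for ?x = "c" and
extend the one for ?x = "a".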

For case 1, expressions make the common use case of one optional work out
naturally, they accord with the pattern rules (in my opinion) and they are what
implementations are currently doing.  The fact that it is somewhat analogous to
SQL's LEFT OUTER JOIN with ON is a bonus.
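
To make the SQL analogy concrete, here is a rough sketch using Python's
standard sqlite3 module.  The tables q(x, v) and p(y, w) and their rows are
assumptions made up for illustration; they stand in for the solutions of
{ ?x :q ?v } and { ?y :p ?w }, with the ON clause playing the role of the
expression in LeftJoin.

```python
import sqlite3

# In-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE q (x TEXT, v INTEGER);
    CREATE TABLE p (y TEXT, w INTEGER);
    INSERT INTO q VALUES ('a', 1), ('c', 2);
    INSERT INTO p VALUES ('a', 10), ('b', 20);
""")

# The join condition lives in ON, so rows of q with no partner are kept
# and padded with NULLs rather than dropped.
rows = conn.execute("""
    SELECT q.x, q.v, p.y, p.w
    FROM q LEFT OUTER JOIN p ON q.x = p.y
    ORDER BY q.x
""").fetchall()
print(rows)  # [('a', 1, 'a', 10), ('c', 2, None, None)]
conn.close()
```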

I don't agree with your statement about costs in understanding and
implementing; I see it as treating variables in the FILTER on an equal footing
with variables in patterns.  If FILTERs were constructive, not restrictive,
there would be no issue: putting the FILTER condition into the LeftJoin
operation is the same as having the natural join relationship as the join
condition.  It's like a theta-join. (If FILTERs were purely constructive we
would get infinities of solutions appearing and have to then restrict that in
some way.  And various other issues.)

For consistency of style, I did consider using a theta-join style throughout
but decided it was too complex for any benefit of the consistency.

For case 3, I view pushing the burden on to the application writer as
undesirable.  Patterns will not just be single triple patterns.  These simple
query forms (one optional, filter in LHS mentioning a variable from the RHS -
this is not about nesting) are occurring and application writers do find them
usable.  Case 2 is a way of moving the burden off the application writer
because it performs the transformation always, but redundantly much of the time.

	Andy
Received on Thursday, 7 December 2006 17:02:31 UTC