the LHS of OPTIONAL

(I promised Andy a note on my concerns about OPTIONAL, so here it is. 
Opinions sought.)

Conventions:

  Ti : one or more consecutive triple patterns
  Fi : a FILTER
  Group(T1, T2, ..., F) : a join of the Ti triple patterns constrained by 
the filter expression F


Some examples:

{
  T1 .
  F1 .
  T2 .
  F2 .
}

is identical to:

{
  T1 .
  T2 .
  F1 .
  F2 .
}

and also identical to:

{
  F1 .
  F2 .
  T1 .
  T2 .
}

That is, (inner) join is independent of order. When we add OPTIONAL in, 
things become a bit more complicated.

{
  T1 .
  OPTIONAL { T2 } .
}

parses as Group( Optional(T1, Group(T2)) ).
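
For readers less familiar with the notation, Optional(A, B) can be read as 
a left outer join: every solution for A is kept, and is extended with 
bindings from B wherever a compatible one exists. A minimal Python sketch 
of that reading follows. It assumes solutions are plain dicts from variable 
name to value; the function names and the sample data are made up for 
illustration and are not taken from the SPARQL drafts or from any 
implementation.

```python
def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(b[k] == v for k, v in a.items() if k in b)


def join(left, right):
    """Inner join: keep only the merged pairs of compatible solutions."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


def optional(left, right):
    """Left outer join: every left solution survives, extended where possible."""
    out = []
    for a in left:
        matches = [{**a, **b} for b in right if compatible(a, b)]
        out.extend(matches or [a])  # no match: keep the left solution as-is
    return out


# Hypothetical solutions for T1 and T2.
t1 = [{"x": "alice"}, {"x": "bob"}]
t2 = [{"x": "alice", "mbox": "mailto:alice@example.org"}]

print(join(t1, t2))      # only alice survives
print(optional(t1, t2))  # alice with an mbox, bob without one
```

The sketch ignores filters entirely; it is only meant to show why the 
extent of the LHS matters, since the LHS decides which solutions are 
guaranteed to survive.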

{
  T1 .
  F1 .
  OPTIONAL { T2 } .
}

parses as Group( Optional(Group(T1, F1), Group(T2)) ). Effectively, 
OPTIONAL wraps everything that comes before it in an implicit Group scope 
(as if that content were enclosed in curly braces). So:

{
  T1 .
  F1 .
  OPTIONAL { T3 } .
  T2 .
  F2 .
}

parses as Group( Optional(Group(T1, F1), Group(T3)), T2, F2 ) and 
consequently is not the same as the reordered:

{
  T1 .
  F1 .
  F2 .
  OPTIONAL { T3 } .
  T2 .
}

which parses as Group( Optional( Group( T1, F1 && F2 ), Group(T3) ), T2 ). 
Note in particular that F2 has a different scope in the two scenarios, due 
to the extra group implicitly created to hold the LHS of the OPTIONAL. To 
me, this is very unnatural and unintuitive. Given the LHS-of-OPTIONAL rule 
of "slurp up everything before OPTIONAL", I'm also unclear as to the 
correct parse of:

{
  { T1 } UNION { T2 } .
  T3 .
  OPTIONAL { T4 . }
}

Is this:
  Group( Optional( Group( Union(T1, T2), T3 ), Group(T4) ) )
or
  Group( Union(T1, T2), Optional( Group(T3), Group(T4) ) )

That is, does OPTIONAL pull in everything before it (but in the same 
group) to form the implicit LHS group, or just the nearest bunch of triple 
patterns and filters? I suspect the former, but find this confusing.
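
The two readings can be made concrete with a rough Python sketch. This is 
not the SPARQL grammar and is not taken from any implementation: 
parse_group, show, and the greedy flag are hypothetical names, and the 
input is a hand-built list of elements rather than real query text. Unlike 
the notation above, the sketch always wraps the LHS in Group and lists 
filters separately instead of writing F1 && F2.

```python
def show(node):
    """Render a node in the Group(...)/Optional(...) notation of this note."""
    if isinstance(node, str):
        return node
    kind, *args = node
    return f"{kind}({', '.join(show(a) for a in args)})"


def parse_group(elements, greedy=True):
    """Fold a flat list of group elements into a nested tuple tree.

    elements: strings such as "T1" or "F1", ("Union", a, b) tuples, or
    ("OPTIONAL", [inner elements]) markers.
    greedy=True : OPTIONAL takes everything before it in the group as LHS.
    greedy=False: OPTIONAL takes only the trailing run of Ti/Fi strings.
    """
    acc = []
    for el in elements:
        if isinstance(el, tuple) and el[0] == "OPTIONAL":
            rhs = parse_group(el[1], greedy)
            if greedy:
                split = 0
            else:
                split = len(acc)
                while split > 0 and isinstance(acc[split - 1], str):
                    split -= 1
            lhs = ("Group", *acc[split:])
            acc = acc[:split] + [("Optional", lhs, rhs)]
        else:
            acc.append(el)
    return ("Group", *acc)


opt = lambda *inner: ("OPTIONAL", list(inner))

# Reordering F2 across the OPTIONAL changes which group it lands in.
print(show(parse_group(["T1", "F1", opt("T3"), "T2", "F2"])))
# Group(Optional(Group(T1, F1), Group(T3)), T2, F2)
print(show(parse_group(["T1", "F1", "F2", opt("T3"), "T2"])))
# Group(Optional(Group(T1, F1, F2), Group(T3)), T2)

# The UNION example under each candidate reading.
union_query = [("Union", "T1", "T2"), "T3", opt("T4")]
print(show(parse_group(union_query, greedy=True)))
# Group(Optional(Group(Union(T1, T2), T3), Group(T4)))
print(show(parse_group(union_query, greedy=False)))
# Group(Union(T1, T2), Optional(Group(T3), Group(T4)))
```

Under these assumptions the greedy flag is the entire difference between 
the two candidate parses, which is exactly the choice the question above 
asks about.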

Fred Z. and I have both advanced the idea that perhaps SPARQL should 
require that the LHS of the OPTIONAL be explicitly demarcated by mandatory 
curly braces. In our opinion (OK, in my opinion, but I'm guessing Fred 
shares it), this would alleviate any confusion and make complex queries 
more readable.

From conversations with Andy, I believe that he disagrees that mandatory 
curlies would make queries more readable (believing instead that the extra 
curlies would make reading queries more difficult) and he is also worried 
about the significant number of existing SPARQL queries that would become 
invalid.

I share this latter concern and have not yet decided whether it should be 
a show-stopper at this point for this potential design change. In the 
meantime, I wanted to share my concerns, and I'd like to solicit the 
opinions of other members of the WG.

thanks,
Lee

Received on Monday, 20 November 2006 17:12:34 UTC