Re: Issues with evaluating optional: ...

Hi all!

In http://lists.w3.org/Archives/Public/public-rdf-dawg/2006OctDec/0064.html

Fred Zemke wrotes:
> I would welcome comments from the Chileans, since their
> work sparked this thread.

Because I'm one of the Chileans, I'm responding to this ;-)

First, in SCS we follow a more standard algebraic syntax which simplifies
a lot of things (for example we have no problem in identifying what is the
first operand of OPTIONAL!). The official spec relies on implicit
association and precedence of operators (I say implicit because one can
only knows them following the grammar). Then when converting

> Q1: WHERE { ?X :n :p . ?Y :n :g OPTIONAL { ?X :c ?Z } }

it results in

P1: ( (?X :n :p) AND (?Y :n :g) )   OPTIONAL   ( ?X :c ?Z) )

because '.' has precedence over OPTIONAL in the grammar. And then

> Q2: WHERE { ?Y :n :g OPTIONAL { ?X :e ?Z } ?X :n :p }

results in

P2: ( (?Y :n :g) OPTIONAL (?X :e ?Z) )   AND   (?X :n :p )

because OPTIONAL *brakes* the query in two groups. Following the
compositional and operational semantics in SCS the two queries above are
not equivalent as (I think) all of us expect.

> SQL users do not expect to be able to permute
> JOIN and LEFT OUTER JOIN as shown between Q1 and Q2, because
> LEFT OUTER JOIN is known not to have such behavior.

exactly as they must not expect to be able to permute AND and OPTIONAL as
I show you above.

> I think that AND is an unfortunate name for
> this operator.

may be you are right.... we name it AND just to be consistent with what we
understand of the specification (the idea of graph matching), but we
actually define its semantics similarly to a JOIN in relational algebra.
(Just a note: the most important difference and the one that makes things
complex contrasting it with relational algebra, is that AND must be
defined as a *never null rejecting JOIN* wich is not the default case in
SQL. This can be seen in the extensive use of COALESCE and IS NULL in the
work of Fred and in the work by Richard Cyganiak
http://www.hpl.hp.com/techreports/2005/HPL-2005-170.html)

About the examples in Bijan's previous message, they are different to the
above queries, they are something like

P1': (?X :n :p)    AND     ( (?Y :n :g) OPTIONAL ( ?X :c ?Z) )
P2': ( (?Y :n :g) OPTIONAL ( ?X :c ?Z) )     AND    (?X :n :p)

which translated to SPARQL original syntax result in

Q1': { { ?X :n :p } . { ?Y :n :g OPTIONAL { ?X :c ?Z }} }
Q2': { { ?Y :n :g OPTIONAL { ?X :c ?Z }} . { ?X :n :p } }

The question then is if the above two queries must be equivalent or not.
Our compositional semantics makes them equivalent (I think that they are
also equivalent in Fred's translation to SQL). But the alternative
operational semantics that we present in SCS does not, and, as we report
in SCS, the two queries give different solutions in ARQ too (indeed we
define our operational semantics to follow closely ARQ's evaluation
algorithm).

I personally think that { A . B } must be equivalent to { B . A }, no
matter what A and B are. I also think that DAWG already agrees with this,
and it is clear in the definition of solutions for group graph patterns
(it is clear for me, am I right?). So, commutativity of AND (or JOIN, or
whatever is used to define semantics of groups) is not a big problematic
issue.

> SPARQL queries have the advantage over SQL in compactness,
> but I think the operator denoted by dot (.) or concatenation
> of patterns will mislead our users when they throw OPTIONAL
> into the mix.  Look at the efforts I have expended in trying
> to specify what the first operand of OPTIONAL is.
>
> Andy Seaborne's recent grammar changes
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006OctDec/0055.html
> are a step forward, but I think the real solution is to
> make OPTIONAL into a bona fide binary operator, with a production
> such as
>
> OptionalPattern ::= GroupGraphPattern OPTIONAL GroupGraphPattern
>
> The existing syntax as a postfix operator whose first argument
> must be inferred just does not provide the clarity of a true
> binary operator in the BNF.

I totally agree with you Fred. But first, DAWG must decide if OPTIONAL is
or is not a *binary operator*, I think that this is a really important
issue. In SCS our compositional semantics defines it as binary operator
following a left outer join (the same note for AND/JOIN applies here about
*never null
rejecting*), but in the operational semantics the evaluation is
navigational following the intention of Andy and the ARQ evaluation
algorithm for OPTIONALs. For details see the thread starting in
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2006Apr/0000.html

In SCS we formalized the two approaches reporting good/bad characteristic
of the two evaluation methods. DAWG must make the desgin decision about
OPTIONALs, and this decision must be stated clearly in the specification.

Cheers,
- jorge (the Chilean)

Received on Saturday, 14 October 2006 23:10:52 UTC