W3C home > Mailing lists > Public > public-rdf-dawg@w3.org > October to December 2006

Re: Issues with evaluating optional: Commutativity of AND

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Sat, 14 Oct 2006 11:22:48 +0100
Message-Id: <5EAEC391-A584-41C1-8B7E-C608A022ADD8@cs.man.ac.uk>
Cc: public-rdf-dawg@w3.org
To: Fred Zemke <fred.zemke@oracle.com>

On Oct 13, 2006, at 10:41 PM, Fred Zemke wrote:

> Bijan Parsia wrote:
>
>> This may correspond with some of Fred's issues and I apologize  
>> for  not reconciling with his texts and email.
>
> No problem, I have trouble keeping up with all the email myself.
>
> I am not replicating Bijan's message so that I can get to what
> his message means to me.
>
> His message mentions various operators, with uppercase names,
> among which is AND.  I think that AND is an unfortunate name for
> this operator.

I'm following SCS.

> Bijan also uses the term "join" in lowercase, which
> I think is preferable.

AND is defined to be a(n inner) join in SCS, and that's how I'm using  
it. Inner joins are commutative as well. As I understand it, to make  
it not do so requires not just the presences of blanks, but this  
"current evaluation" table around, i.e., so that the evaluatoin is  
explicitly backward looking.

[snip in a bit of a hurry]
> Q1: WHERE { ?X :n :p . ?Y :n :g OPTIONAL { ?X :c ?Z } }
> Q2: WHERE { ?Y :n :g OPTIONAL { ?X :e ?Z } ?X :n :p }
>
> These involve three triple patterns:
>
> T1: ?X :n :p
> T2: ?Y :n :g
> T3: ?X :c ?Z
>
> If we think of the connective in the group graph pattern as AND,
> then one might expect that Q1 is equivalent to Q2.  All the user
> has done is permute T1 from the front of the query to the back.
> However, if we think of the connective
> as JOIN, then the equivalence falls apart, at least from my
> experience in SQL.  SQL users do not expect to be able to permute
> JOIN and LEFT OUTER JOIN as shown between Q1 and Q2, because
> LEFT OUTER JOIN is known not to have such behavior.

I'm no SQL expert or at all experienced with it, but I don't see why  
this is the case. The top level operator is
the inner join. Inner joins commute. Do you mean that SQL JOIN isn't  
a inner join?

I will find that surprising should I ever encounter it :)

> This point is seen by translating Q1 and Q2 into SQL.
> I will use a simplification of my translation algorithm attached to
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006OctDec/ 
> 0058.html
> Let Graph be the table containing the default graph, with columns
> called Subject, Verb and Object.  The three triples can be
> expressed by
>
> WITH T1 AS ( SELECT Subject AS X
>             FROM Graph
>             WHERE Verb = ':n' AND Object = ':p' ),
>     T2 AS ( SELECT Subject AS Y
>             FROM Graph
>             WHERE Verb = ':n' AND Object = ':g' ),
>     T3 AS ( SELECT Subject AS X, Object AS Z
>             FROM Graph
>             WHERE Verb = ':c' )

So far, great.

> then the translation of Q1 will finish as follows:
>
> SELECT T1.X, T2.Y, T3.Z
> FROM (T1 JOIN T2 ON 1=1) LEFT OUTER JOIN T3 ON (T1.X = T3.X)
>
> and the translation of Q2 will finish with:
>
> SELECT T1.Y, T2.X, T2.Z
> FROM (T2 LEFT OUTER JOIN T3 ON (1=1)) JOIN T1 ON (T1.X = T3.X)
>
> and nobody expects the algebra of JOIN and LEFT OUTER JOIN
> to make such an equivalentce

Could you please send me a pointer to where this is defined? Just  
looking around, I couldn't find it. Tutorials like:
	<http://www.databasejournal.com/features/oracle/article.php/3527921>

don't warn against it (natural join, inner join, and join are taken  
as synonyms).

> So, as I said, changing our operator name from AND to JOIN
> is a step in the right direction.  That will bring us some
> clarity, but what about our users?  Let's look at the
> queries again:
>
> Q1: WHERE { ?X :n :p . ?Y :n :g OPTIONAL { ?X :c ?Z } }
> Q2: WHERE { ?Y :n :g OPTIONAL { ?X :e ?Z } ?X :n :p }
>
> SPARQL queries have the advantage over SQL in compactness,
> but I think the operator denoted by dot (.) or concatenation
> of patterns will mislead our users when they throw OPTIONAL
> into the mix.  Look at the efforts I have expended in trying
> to specify what the first operand of OPTIONAL is.
> Andy Seaborne's recent grammar changes
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006OctDec/ 
> 0055.html
> are a step forward, but I think the real solution is to
> make OPTIONAL into a bona fide binary operator, with a production
> such as
>
> OptionalPattern ::= GroupGraphPattern OPTIONAL GroupGraphPattern
>
> The existing syntax as a postfix operator whose first argument
> must be inferred just does not provide the clarity of a true
> binary operator in the BNF.
>
> If we did this, then the two queries would be written
>
> Q1: WHERE { { ?X :n :p . ?Y :n :g } OPTIONAL { ?X :c ?Z } }
> Q2: WHERE { { ?Y :n :g } OPTIONAL { ?X :e ?Z } ?X :n :p }

Except that's not the transformation in my examples. I agree that  
such distribution is weird. But, using your syntax, wouldn't you expect:

Q1': WHERE { { ?X :n :p . [[[[{?Y :n :g } OPTIONAL { ?X :c ?Z }]]] }
Q2': WHERE { [[[{ ?Y :n :g } OPTIONAL { ?X :e ?Z }]]] ?X :n :p }

(extra brackets for clarity) to be equivalent? Your permutation seems  
to change the arguments to the optional. Am I right to read that  
first argument to optional in Q1 is "{ ?X :n :p . ?Y :n :g }" and in  
the second "{ ?Y :n :g }"???

> The revised SPARQL syntax clearly indicates the desired join order,

Not yet to me, since I don't understand your transformation.

> and I don't think our users will think that they can move
> T1 out of the first operand in Q1 as shown in Q2.

Of course not, but that's not the example! We're talking about  
different things.

Under the "navigational" eval strategy, Q1' and Q2' give *different  
answers* even though you've not *changed* the operands, merely  
commuted them!

I take this as actually giving support for the SCS position.

> I would welcome comments from the Chileans, since their
> work sparked this thread.
>
> While I am bitching about the SPARQL syntax, I will add
> that the SELECT ... FROM ... WHERE ... syntax, borrowed
> from SQL, is also misleading.  In SQL, rows are built
> in the FROM clause and pruned in the WHERE clause.  In SPARQL,
> solutions are built in the WHERE clause, and pruned
> in the FILTER clause.  I think this difference will be a source
> of ongoing confusion as users, many of whom will have an SQL
> background, encounter SPARQL.  I would be happy if we changed
> the keywords, especially WHERE, to something else. Maybe change  
> WHERE to MATCH.

I just care that the definitions of what they are are clear and that  
reasonable transformations (e.g., proposition 1) are preserved.

Cheers,
Bijan.
Received on Saturday, 14 October 2006 10:29:40 GMT

This archive was generated by hypermail 2.3.1 : Tuesday, 26 March 2013 16:15:27 GMT