
Re: Questions about OPTIONAL

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 28 Feb 2005 11:06:17 +0000
Message-ID: <4222FB29.30304@hp.com>
To: Geoff Chappell <geoff@sover.net>
Cc: public-rdf-dawg-comments@w3.org



Geoff Chappell wrote:
> 
>>-----Original Message-----
>>From: Seaborne, Andy [mailto:andy.seaborne@hp.com]
>>Sent: Friday, February 25, 2005 6:54 AM
>>To: Geoff Chappell
>>Cc: public-rdf-dawg-comments@w3.org
>>Subject: Re: Questions about OPTIONAL
>>
> 
> [...]
> 
>>Just to be quite sure here - by "triple", you mean the triple pattern?
>>Not a
>>triple from the data.  That reading seems consistent with your examples
>>below.
> 
> 
> Yes.
>  
> [...]
> 
>>Optional can't reduce the number of input solutions, though it can
>>increase them
> 
> 
> Are you saying this is part of the definition of OPTIONAL or are you
> describing its behavior in your implementation?

It comes from the definition of OPTIONAL.
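
A minimal sketch of that behaviour, assuming the "pass-through" reading of OPTIONAL discussed in this thread: each input solution is either extended by every compatible match of the optional pattern or, if there is none, passed through unchanged. The helper names (match_pattern, optional) and the sample data are hypothetical and are not taken from the spec or from any implementation.

```python
def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; None on conflict."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with the existing binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant that does not match the data.
            return None
    return result


def optional(solutions, pattern, data):
    """Pass-through OPTIONAL: never drops an input solution."""
    out = []
    for sol in solutions:
        extended = [
            b for t in data
            if (b := match_pattern(pattern, t, sol)) is not None
        ]
        # No match: keep the input solution as it was.
        out.extend(extended if extended else [sol])
    return out


data = [
    ("alice", "name", "Alice"),
    ("alice", "mbox", "alice@example.org"),
    ("alice", "mbox", "alice@work.example"),
    ("bob", "name", "Bob"),
]
inputs = [{"?x": "alice"}, {"?x": "bob"}]
results = optional(inputs, ("?x", "mbox", "?mbox"), data)
print(len(inputs), "->", len(results))
```

Here alice has two mailboxes and bob has none, so two input solutions become three: the count grows for alice and bob's solution survives without ?mbox bound.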

> 
> If the former is the case, I think you'll have a hard time coming up with
> order-independent semantics for OPTIONAL.

I agree - I don't think the user task of adding in some extra information 
but not failing the rest of the match can be met in an order-independent 
way without either extra (to the user) solutions being generated (the form 
where OPTIONAL always passes through its initial input) or some form of 
distinguished value, defaulting matches to NULL with NULL matching NULL (not 
true for SQL).  In the example we have been discussing, your 
NULL (non-rebinding) form is the same as removing the optional because the 
variable ?mbox is used after the optional.

It's a trade-off - making the user task most natural vs design purity.

> I imagine the only choice you'll
> have is to require an evaluation order such as Jeen Broekstra described
> where all OPTIONALs are evaluated last.

This is certainly one approach - to define the correct order and let 
implementations either (loosely) canonicalise a query or reject a query 
which violates that order rule.  It could be loosened to work in terms of 
the variables used.

Informally, if a variable can be bound by both a fixed pattern and an 
optional pattern, the query must be evaluated so that the fixed pattern is 
matched first.

This would extend to:

WHERE AND ?o < 5 (?s ?p ?o)

which, if blindly executed with "?o < 5" first is unlikely to be what was meant.
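
A loose canonicalisation along these lines could be sketched as follows. This assumes the simplest form of the rule (fixed patterns first, then optionals, then constraints) and not the looser variable-based form. The clause representation and the names RANK and canonicalise are hypothetical.

```python
# Evaluation rank per clause kind: lower ranks are evaluated first.
RANK = {"pattern": 0, "optional": 1, "constraint": 2}


def canonicalise(clauses):
    """Reorder clauses by kind, keeping the written order within each kind.

    sorted() is stable, so clauses of the same kind are not shuffled.
    """
    return sorted(clauses, key=lambda clause: RANK[clause[0]])


# WHERE AND ?o < 5 (?s ?p ?o), as written by the user.
query = [
    ("constraint", "?o < 5"),
    ("pattern", ("?s", "?p", "?o")),
]
ordered = canonicalise(query)
for kind, body in ordered:
    print(kind, body)
```

After reordering, the triple pattern binds ?o before the constraint "?o < 5" is tested. An implementation that prefers to reject such queries could instead compare the written order with the canonical one.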

I note that some systems (e.g. cwm) already do something like this in their 
evaluation.

> The biggest problem I see with that
> approach is that certain queries just won't work - e.g.:
> 
> 	(?x ?p ?a) optional(?x ?y ?z) optional (?a ?b ?y)

Indeed - this is already noted in the spec.
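
To illustrate why the order matters for that query, here is a small sketch under the same pass-through reading of OPTIONAL. Both optionals can bind ?y, so whichever runs first fixes its value. The helpers, the starting solution and the two-triple data set are all hypothetical, and this is not the wording used in the spec.

```python
def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; None on conflict."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def optional(solutions, pattern, data):
    """Pass-through OPTIONAL: unmatched input solutions are kept unchanged."""
    out = []
    for sol in solutions:
        extended = [
            b for t in data
            if (b := match_pattern(pattern, t, sol)) is not None
        ]
        out.extend(extended if extended else [sol])
    return out


data = [("a", "p", "b"), ("b", "r", "s")]
# One solution of the fixed pattern (?x ?p ?a).
start = [{"?x": "a", "?p": "p", "?a": "b"}]
opt1 = ("?x", "?y", "?z")
opt2 = ("?a", "?b", "?y")

first = optional(optional(start, opt1, data), opt2, data)
second = optional(optional(start, opt2, data), opt1, data)
print(first)
print(second)
```

With opt1 first, ?y is bound to "p" and opt2 then finds no match. With opt2 first, ?y is bound to "s" and opt1 finds no match. The two orders give different solutions from the same data.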

> 
> Not to mention that you've strayed from a purely declarative model which
> among other things makes it more difficult to map on to other systems (for
> example, I can no longer do a simple sparql->rdfql rewrite for rdf gateway.
> Instead, I'll have to build a separate sparql query parser and planner
> rather than using the existing execution planner).

Noted - also noted is that SQL queries can have order dependencies.  The 
correct order is then what the application programmer wrote.

> 
> -Geoff
> 
	Andy
Received on Monday, 28 February 2005 11:06:47 GMT
