Re: Questions about OPTIONAL

Geoff Chappell wrote:
> A few questions about OPTIONAL...
> 
> 1) If I had a query like this:
> 
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> SELECT ?x ?name ?y
> WHERE  ( ?x foaf:name  ?name )
>        OPTIONAL ( ?x foaf:mbox ?mbox )
>        ( ?y foaf:mbox ?mbox )
> 
> with data:
> 
> @prefix foaf:       <http://xmlns.com/foaf/0.1/> .
> 
> _:a  foaf:name       "Alice" .
> _:b  foaf:name       "Bob" .
> _:b  foaf:mbox       <mailto:bob@work.example> .
> _:c  foaf:mbox       <mailto:noname@work.example.org> .
> 
> 
> should I get:
> 
> x   name    y
> === ======= ===
> _:b "Bob"   _:b
> 
> or:
> 
> x   name    y
> === ======= ===
> _:a "Alice" _:b   
> _:a "Alice" _:c   
> _:b "Bob"   _:b 

Geoff - thank you very much for including a concrete example.

The current working draft is inadequate in its treatment of the implications
of execution order - this is something that has to be fixed.


Executing your example purely in the order given, I get:

-------------------------
| x    | name    | y    |
=========================
| _:b0 | "Alice" | _:b1 |
| _:b0 | "Alice" | _:b2 |
| _:b2 | "Bob"   | _:b2 |
-------------------------

but with the query rewritten as:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# Add ?mbox to select for clarity
SELECT ?x ?name ?y ?mbox
WHERE  ( ?x foaf:name  ?name )
# Reverse these next two lines.
         ( ?y foaf:mbox ?mbox )
         OPTIONAL ( ?x foaf:mbox ?mbox )


I get:

------------------------------------------------------------
| x    | name    | y    | mbox                             |
============================================================
| _:b0 | "Bob"   | _:b0 | <mailto:bob@work.example>        |
| _:b0 | "Bob"   | _:b1 | <mailto:noname@work.example.org> |
| _:b2 | "Alice" | _:b0 | <mailto:bob@work.example>        |
| _:b2 | "Alice" | _:b1 | <mailto:noname@work.example.org> |
------------------------------------------------------------

because the first two triple patterns are unconnected so the number of results
is 2 * 2 (each matches twice), and the optional adds nothing because ?x and
?mbox were already defined by earlier patterns.
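
The order dependence can be sketched with a toy model in Python. This is an
illustration only, not the engine that produced the tables above: the helper
names (match, join, optional) are hypothetical, and OPTIONAL is assumed to
pass a solution through unchanged only when its pattern finds no match.

```python
# Toy model of left-to-right evaluation; all helper names are hypothetical.
DATA = [
    ("_:a", "foaf:name", '"Alice"'),
    ("_:b", "foaf:name", '"Bob"'),
    ("_:b", "foaf:mbox", "<mailto:bob@work.example>"),
    ("_:c", "foaf:mbox", "<mailto:noname@work.example.org>"),
]

def match(pattern, binding):
    """Yield one extension of `binding` per triple that fits `pattern`."""
    for triple in DATA:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its existing value.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new

def join(solutions, pattern):
    """Required pattern: solutions with no match are dropped."""
    return [b for s in solutions for b in match(pattern, s)]

def optional(solutions, pattern):
    """Optional pattern: a solution with no match passes through unchanged."""
    out = []
    for s in solutions:
        extended = list(match(pattern, s))
        out.extend(extended if extended else [s])
    return out

NAME = ("?x", "foaf:name", "?name")
X_MBOX = ("?x", "foaf:mbox", "?mbox")
Y_MBOX = ("?y", "foaf:mbox", "?mbox")

# Order as written: name, OPTIONAL, then the ?y pattern.
in_order = join(optional(join([{}], NAME), X_MBOX), Y_MBOX)
# Reordered: name, the ?y pattern, then OPTIONAL.
reordered = optional(join(join([{}], NAME), Y_MBOX), X_MBOX)

rows_in_order = [(s["?x"], s["?name"], s["?y"]) for s in in_order]
rows_reordered = [(s["?x"], s["?name"], s["?y"]) for s in reordered]
print(len(rows_in_order), len(rows_reordered))
```

In this model the first ordering leaves ?mbox unbound for Alice, so the later
?y pattern is free to bind it twice; in the second ordering ?mbox is already
bound when the optional runs, so the optional can only confirm or pass.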


> 
> In other words, is a variable bound to a value such as NULL by an optional
> when there it is not otherwise bound by the block's conditions or does it
> truly escape unbound? If it's bound to a NULL, does NULL==NULL?
> 
> 2) In section 5.3 you say: "If a new variable is mentioned in an optional
> block (as mbox and hpage are mentioned in the previous example), that
> variable can be mentioned in that block and can not be mentioned in a
> subsequent block." 
> 
> 	a. Does a subsequent _block_ refer to a subsequent _OPTIONAL block_
> or does block refer to any logical factor (e.g. the triple after the
> optional in my previous example)? If the latter, I think you need to make
> that clearer (i.e. define block somewhere).
> 
> 	b. Does this mean subsequent relative to the order the query was
> written or relative to a possible re-ordering by a query processor? Similar
> question for section 5.6 - is the shape of the tree determined by the order
> as written or the processing order?
> 
> 3) I'll note that in my opinion this would all be clearer if you spelled out
> that A and OPTIONAL B is just shorthand for A and (B or not B) -- or for A
> and (B or (not B and v1..n=NULL)). That would lay a firmer foundation for
> considering questions such as: "Is OPTIONAL B and A the same as A and
> OPTIONAL B?" and generally present the concept in a language people are more
> familiar with.

Having OPTIONAL always pass the "no match" case, even when there are other
matched results, does give a stable execution order, but it results in extra
solutions that add nothing except complications for applications.

For example:

@prefix foaf:       <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name       "Alice" .
_:a  foaf:mbox       <mailto:alice@work.example> .

Query:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE
      ( ?x foaf:mbox ?mbox )
      OPTIONAL ( ?x foaf:name  ?name )


gives:

-----------------------------------------
| name    | mbox                        |
=========================================
| "Alice" | <mailto:alice@work.example> |
-----------------------------------------


or:

-----------------------------------------
| name    | mbox                        |
=========================================
|         | <mailto:alice@work.example> |
| "Alice" | <mailto:alice@work.example> |
-----------------------------------------

Given the idea behind optional of "add extra information to a solution if
available, but don't fail if not there", the first result seems the more
natural one.
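
The two candidate readings of OPTIONAL can be compared side by side in a
short Python sketch. It is a toy model under assumed names (match,
optional_extend, optional_always are hypothetical), not a description of any
real implementation. optional_always corresponds to the "A and (B or not B)"
reading.

```python
# Toy comparison of two OPTIONAL semantics; helper names are hypothetical.
DATA = [
    ("_:a", "foaf:name", '"Alice"'),
    ("_:a", "foaf:mbox", "<mailto:alice@work.example>"),
]

def match(pattern, binding):
    """Yield one extension of `binding` per triple that fits `pattern`."""
    for triple in DATA:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new

def optional_extend(solutions, pattern):
    """Pass the unextended solution only when nothing matches."""
    out = []
    for s in solutions:
        extended = list(match(pattern, s))
        out.extend(extended if extended else [s])
    return out

def optional_always(solutions, pattern):
    """Always pass the unextended solution, then add any matches."""
    out = []
    for s in solutions:
        out.append(s)
        out.extend(match(pattern, s))
    return out

base = list(match(("?x", "foaf:mbox", "?mbox"), {}))
name_pattern = ("?x", "foaf:name", "?name")

# Project to (?name, ?mbox); None stands for an unbound variable.
extend_rows = [(s.get("?name"), s["?mbox"]) for s in optional_extend(base, name_pattern)]
always_rows = [(s.get("?name"), s["?mbox"]) for s in optional_always(base, name_pattern)]
print(extend_rows)
print(always_rows)
```

The first reading yields the single "Alice" row; the second yields that row
plus one with ?name unbound, which is the redundant solution at issue.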

In simple cases, it may be possible to filter the output to remove this
redundancy but in more complicated queries (for example, ?name is used elsewhere
as well, multiple optionals, sharing variables), I didn't manage to find a way
that kept the streaming requirement for results.
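
For the simple case, one possible output filter is sketched below in Python.
It is an assumption for illustration, not something from the draft: the
function name drop_subsumed is hypothetical, and solutions are modelled as
dicts from variable name to value. It drops any solution whose bindings are a
proper subset of another solution's bindings.

```python
def drop_subsumed(solutions):
    """Remove solutions that another solution strictly extends."""
    kept = []
    for s in solutions:
        # s is subsumed if some different solution t contains all of s's bindings.
        subsumed = any(s != t and s.items() <= t.items() for t in solutions)
        if not subsumed:
            kept.append(s)
    return kept

mbox = "<mailto:alice@work.example>"
solutions = [
    {"?mbox": mbox},                      # the "no match" solution
    {"?name": '"Alice"', "?mbox": mbox},  # the extended solution
]
filtered = drop_subsumed(solutions)
print(filtered)
```

This sketch compares every solution against the complete list, so it cannot
emit anything until all results are in hand; as written it does not stream.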

If the application is displaying information for people, then getting two
answers back for what is one person is a less useful paradigm.  It is a tradeoff
of convenience.

I did find that (same results as the last example):

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE
      ( ?x foaf:mbox ?mbox )
      { {} UNION ( ?x foaf:name  ?name ) }

is illegal in the current grammar (a form of "A and (B or not B)" for
optionals). Maybe it shouldn't be.  Thoughts?

> 
> -Geoff

	Thanks for the example and commentary,
	Andy

Received on Wednesday, 23 February 2005 15:32:06 UTC