W3C home > Mailing lists > Public > public-rdf-dawg@w3.org > January to March 2005

Re: Optionals and execution order

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Fri, 04 Feb 2005 22:12:08 +0000
Message-ID: <4203F338.4080608@hp.com>
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>

A couple of people have said they found the formatting of the query results 
confusing.  I have redone them as text tables and inserted the equivalent 
inline below in case that helps.

	Andy

Seaborne, Andy wrote:
> 
> Re: ACTION AndyS: to propose test case re optionals and ordering
> 
> Before I get to test cases, I'd like to describe the problem.
> 
> 
> ==== Data:
> 
> @prefix :        <http://example.org/ns#> .
> 
> :z    :q          "z-q-1" .
> 
> :x    :p          "x-p-1" ;
>        :q          "x-q-1" ;
>        :r          "x-r-1" .
> 
> :y    :p          "y-p-1" .
> 
> == Query 1:
> 
> ---------------------------
> 
> PREFIX  : <http://example.org/ns#>
> SELECT ?a ?v ?w
> WHERE
>     ( ?a :p ?v )
>     OPTIONAL ( ?a :q ?w )
> 
>    1 ( ?a = :y) ( ?v = "y-p-1")
>    2 ( ?a = :x) ( ?v = "x-p-1") ( ?w = "x-q-1")

That is two solutions, one per line.  As a text table:

--------------------------
| a  | v       | w       |
==========================
| :y | "y-p-1" |         |
| :x | "x-p-1" | "x-q-1" |
--------------------------

A blank cell entry means the variable is not bound in that solution.


> 
> The first triple pattern has solutions
>   ( ?a = :y) ( ?v = "y-p-1")
>   ( ?a = :x) ( ?v = "x-p-1")

Executing just the first triple pattern gets:

PREFIX  : <http://example.org/ns#>
SELECT ?a ?v ?w
WHERE
     ( ?a :p ?v )


-----------------------
| a  | v       | w    |
=======================
| :y | "y-p-1" |      |
| :x | "x-p-1" |      |
-----------------------

The blank column for "w" is simply because it appears in the SELECT but not 
in the query pattern.

> and the optional adds  ( ?w = "x-q-1") to the second.

It adds the entry in the "w" column.

> 
> 
> Optional, but also including passing through the input solution of
> from the first triple pattern:
> 
>    1 ( ?a = :y) ( ?v = "y-p-1")
>    2 ( ?a = :x) ( ?v = "x-p-1")
>    3 ( ?a = :x) ( ?v = "x-p-1") ( ?w = "x-q-1")
> 
> adding the solution at line 2 which is a subset of line 3.

--------------------------
| a  | v       | w       |
==========================
| :y | "y-p-1" |         |
| :x | "x-p-1" |         |
| :x | "x-p-1" | "x-q-1" |
--------------------------

> 
> ---------------------------
> 
> == Query 2:
> As Query 1 but with the lines in the WHERE clause reversed.
> 
> PREFIX  : <http://example.org/ns#>
> SELECT ?a ?v ?w
> WHERE
>      OPTIONAL ( ?a :q ?w )
>      ( ?a :p ?v )
> 
> Optional, not passing through input if there is a match:
> Different to query 1:
> 
>    1 ( ?a = :x) ( ?v = "x-p-1") ( ?w = "x-q-1")

--------------------------
| a  | v       | w       |
==========================
| :x | "x-p-1" | "x-q-1" |
--------------------------

> 
> The first pattern, the optional, creates two partial solutions:
>     ( ?a = :z) ( ?w = "z-q-1")
>     ( ?a = :x) ( ?w = "x-q-1")
> the first is masked by the base pattern following.  It's the same as
> if there were no optional.

The query with just the optional expression:

PREFIX  : <http://example.org/ns#>
SELECT ?a ?v ?w
WHERE
     OPTIONAL ( ?a :q ?w )

-----------------------
| a  | v    | w       |
=======================
| :z |      | "z-q-1" |
| :x |      | "x-q-1" |
-----------------------

A blank column because ?v is metioned in the SELECT but not used in query 
pattern.

> 
> 
> Including passing through the input solutions we get the same as query 1 
> when it
> passed through the input solutions:
> 
>    1 ( ?a = :y) ( ?v = "y-p-1")
>    2 ( ?a = :x) ( ?v = "x-p-1")
>    3 ( ?a = :x) ( ?v = "x-p-1") ( ?w = "x-q-1")
> 
> Again, solution 2 is a subset of solution 3.

--------------------------
| a  | v       | w       |
==========================
| :y | "y-p-1" |         |
| :x | "x-p-1" |         |
| :x | "x-p-1" | "x-q-1" |
--------------------------

> 
> --------
> 
> Call the form of optional including passing through the initial solution
> "stable" as it always gets the same answer regardless of order.
> 
> But there are always "unhelpful" solutions.  For any optional, there is the
> original and extended solutions when the intention was more to be adding in
> additional information if available (the "useful" form).
> 
> 
> I have not found an algorithm to filter the results of the "stable" 
> optional form to get the "useful" form which does not involve sorting 
> the result set in some way.  Even then, I am not convinced that it is 
> right to do it by direct sorting.
> 
> I consider a requirement to sort as undesirable as it breaks streaming 
> and is hard to do for large results (larger than available main memory).
> 
> A way to resolve this is to generalise the optional variable rule to say 
> that
> the execution order must be as if any variables that can be bound by a 
> fixed
> pattern are done before an optional they are used in.  Another way of 
> saying this: if a variable is in an optional, it must not have been used 
> in later pattern if it had not been used in an earlier one.  That makes 
> two optionals, sharing a variable illegal.
> 
> 
> Much the same goes for constraints:
> 
>     AND ?x < 3
>     (?a :p ?x)
> compared to:
>     AND ?x < 3
>     (?a :p ?x)
> 
> 
> That's an informal description - I'm working on a formal definition.
> 
> There is a decision point as to whether the syntactic query is "wrong" 
> if variables are out of order or whether the query is "right" and the 
> execution engine has to execute in an appropriate order, but exactly the 
> syntactic query order.
> 
>     Andy
> 
> 
> 
Received on Friday, 4 February 2005 22:12:48 GMT

This archive was generated by hypermail 2.3.1 : Tuesday, 26 March 2013 16:15:22 GMT