W3C home > Mailing lists > Public > public-rdf-dawg-comments@w3.org > October 2006

Re: SPARQL OPTIONAL vs UNION treatment of sub-graph matching

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Fri, 27 Oct 2006 14:35:17 +0100
Message-ID: <45420B15.6080205@hp.com>
To: Patrick Shironoshita <Patrick.Shironoshita@infotechsoft.com>
CC: public-rdf-dawg-comments@w3.org, 'Mansur Kabuka' <kabuka@infotechsoft.com>

Patrick,

Thank you for the comments which I've forwarded to the working group.

Some semantics under discussion within the working group equate optional with 
left-outer-join:

http://lists.w3.org/Archives/Public/public-rdf-dawg/2006OctDec/0080.html
http://lists.w3.org/Archives/Public/public-rdf-dawg/2006OctDec/att-0102/sparql_semantics.pdf

How does this compare with the the semantics you're developing?

	Andy

Patrick Shironoshita wrote:
> 
> As part of our work in implementing ontology-based querying, we have 
> developed an algebra for SPARQL which we have recently submitted for 
> publication. During the development of this query algebra, we have found 
> that the treatment of graph matching in OPTIONAL and UNION graph 
> patterns is, in our opinion, inconsistent, in particular with respect of 
> the issue of sub-graph matching - that is, in the issue of whether 
> solutions to a graph pattern can be sub-graphs of other solutions.
>  
> Consider, for example, the following OPTIONAL query (from the SPARQL spec):
>  
> Data:
> 
>  @prefix foaf:       <http://xmlns.com/foaf/0.1/> .
> @prefix rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#> ..
> 
> _:a  rdf:type        foaf:Person .
> _:a  foaf:name       "Alice" .
> _:a  foaf:mbox       <mailto:alice@example.com> .
> _:a  foaf:mbox       <mailto:alice@work.example> .
> 
> _:b  rdf:type        foaf:Person .
> _:b  foaf:name       "Bob" . 
> 
>  OPTIONAL Query: 
> 
>  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> SELECT ?name ?mbox
> WHERE  { ?x foaf:name  ?name .
>          OPTIONAL { ?x  foaf:mbox  ?mbox }
>        } 
> 
>   OPTIONAL query Result:     
> name 	mbox
> "Alice" 	<mailto:alice@example.com>
> "Alice" 	<mailto:alice@work.example>
> "Bob" 	(unbound)
> 
>  
> 
>  
> Now, consider, over the same data set, the following UNION query:
> 
>  UNION Query: 
> 
>  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> SELECT ?name ?mbox
> WHERE  {    { ?x foaf:name  ?name }
>             UNION  
> 
>              { ?x foaf:name  ?name . ?x  foaf:mbox  ?mbox }
>        } 
> 
> UNION Query Result, as per (our understanding of the) current SPARQL 
> working draft:
>  
>  
> name 	mbox
> "Alice" 	(unbound)
> "Bob" 	(unbound)
> "Alice" 	<mailto:alice@example.com>
> "Alice" 	<mailto:alice@work.example>
> 
> where the first two rows match the first part of the UNION pattern, and 
> the second two rows match the second part.
> 
> As can be seen, the results from the OPTIONAL and UNION queries are 
> different only in that the UNION query allows a sub-graph of another 
> solution, while OPTIONAL explicitly does not. W hile there is no 
> requirement in SPARQL that the two queries presented above produce the 
> same results, we argue that the implementation of query processors and 
> optimizers for SPARQL would be made simpler if either OPTIONAL or UNION 
> is redefined so that both the queries above yield the same result - and, 
> therefore, so that OPTIONAL can be defined in terms of UNION as follows:
> 
> P1 OPTIONAL P2  = P1 UNION { P1 . P2}
> 
> where both P1 and P2 are graph patterns, and where P1 . P2 is the 
> conjunction of P1 and P2.
> 
> In our opinion, the better solution would be to require that UNION graph 
> patterns state the requirement that sub-graphs of other solutions for 
> the graph pattern should be eliminated (much in the same way as OPTIONAL 
> states this requirement). As can be seen from above, the row that 
> matches "Alice" to ?name and leaves ?mbox unbound is at the very least 
> useless, and potentially dangerous (implying, as it does, that there is 
> no mailbox for "Alice").
> 
> Alternatively, both UNION and OPTIONAL could keep sub-graphs of other 
> graphs that match the pattern. In this second case, we would strongly 
> suggest that either a solution modifier or a graph pattern modifier be 
> introduced so that sub-graphs get eliminated from the final result at 
> the query writer's option.
> 
> Patrick Shironoshita
> Ph: +1(305)670-5111 ext. 11
> Fax: +1(305)670-7722
> www.infotechsoft.com
> 
>  
Received on Friday, 27 October 2006 13:35:57 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 8 January 2008 14:14:50 GMT