OPTIONALs and shared variables

In the telecon yesterday I mentioned that allowing shared variables
introduced in OPTIONAL (and UNION) blocks prevents you from
implementing SPARQL in a conventional relational algebra engine. Example:

:a dc10:title "thing" .
:a dc10:creator "John Smith" .
:a dc11:creator _:jb .
_:jb rdf:value "Joe Bloggs" .

SELECT ?name
WHERE {
   ?a dc10:title "thing" .
   OPTIONAL { ?a dc10:creator ?name }
   OPTIONAL { ?a dc11:creator ?c .
              ?c rdf:value ?name }
}

If there are no shared variables you can map this to:
(excuse syntax, ASCII RA is difficult and it's been a while)

1)  T SELECT pred=dc10:title AND obj=thing RENAME subj,a
2)  LJOIN T a=subj RENAME obj,name
3)  LJOIN T a=subj RENAME obj,c
4)  LJOIN T c=subj AND obj=name

where T is the table of triples, but if you do that you get:

1)  a    c     name
     :a   NULL  NULL

2)  a    c     name
     :a   NULL  "John Smith"

3)  a    c     name
     :a   NULL  "John Smith"
     :a   _:jb  NULL

4)  a    c     name
     :a   NULL  "John Smith"
     # obj = "John Smith" and obj = NULL both fail
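
To make the failure concrete, here is a minimal sketch of the same chain of left joins run against SQLite from Python. It assumes a three-column triples table T(subj, pred, obj); the predicate filters that the RA steps above leave implicit are written out, and the names TRIPLES, NAIVE_QUERY and run_naive are hypothetical. SQLite carries everything in one row, so the intermediate layout is not identical to the tables above, but the last join fails in the same way.

```python
import sqlite3

# The example data from this mail, as (subj, pred, obj) rows.
TRIPLES = [
    (":a", "dc10:title", "thing"),
    (":a", "dc10:creator", "John Smith"),
    (":a", "dc11:creator", "_:jb"),
    ("_:jb", "rdf:value", "Joe Bloggs"),
]

# Steps 1-4 as one chain of left joins. The shared variable ?name
# becomes the condition t4.obj = t2.obj on the last join.
NAIVE_QUERY = """
SELECT t1.subj AS a, t3.obj AS c, t2.obj AS name
FROM T AS t1
LEFT JOIN T AS t2
       ON t2.subj = t1.subj AND t2.pred = 'dc10:creator'
LEFT JOIN T AS t3
       ON t3.subj = t1.subj AND t3.pred = 'dc11:creator'
LEFT JOIN T AS t4
       ON t4.subj = t3.obj AND t4.pred = 'rdf:value'
      AND t4.obj = t2.obj
WHERE t1.pred = 'dc10:title' AND t1.obj = 'thing'
"""


def run_naive(triples=TRIPLES):
    """Load the triples into an in-memory table and run the naive mapping."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (subj TEXT, pred TEXT, obj TEXT)")
    conn.executemany("INSERT INTO T VALUES (?, ?, ?)", triples)
    rows = conn.execute(NAIVE_QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # obj = "John Smith" fails against "Joe Bloggs"
    print(run_naive())
    # With no dc10:creator triple, obj = NULL fails as well
    print(run_naive([t for t in TRIPLES if t[1] != "dc10:creator"]))
```

With the full data ?name comes back as "John Smith" only, and with the dc10:creator triple removed it comes back NULL, so "Joe Bloggs" is never reached in either case.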

What you have to do (as Souri? pointed out) is namespace the values
according to where they're bound, which is tricky, steps outside
relational algebra and makes SPARQL significantly harder to implement
than necessary.

In this simple example it is probably possible to implement it with  
sub-projections or variable mapping and using a big RENAME on the  
end, but in the general case, it's not.

My impression is that this scoped shared variable thing is mostly a
hack to get round the fact that we don't have SQL's COALESCE() and
SELECT expressions, but I think it's a bad choice to make the core
language much more complex to get round it.

The previous form of:

SELECT ?name1 ?name2
WHERE {
   ?a dc10:title "thing" .
   OPTIONAL { ?a dc10:creator ?name1 }
   OPTIONAL { ?a dc11:creator ?c .
              ?c rdf:value ?name2 }
}

didn't have these problems of course, and as a bonus cured the order
dependency + exponential complexity problem Eric described. The only
downside was that the end user had to do the coalescing themselves.
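
As a rough sketch of how the two-variable form maps onto plain left joins, again using SQLite from Python: each OPTIONAL becomes its own left join with no condition linking it to the other, and the caller does the coalescing afterwards. The table layout T(subj, pred, obj) and the names SEPARATE_QUERY, run_separate and coalesce are assumptions for illustration only.

```python
import sqlite3

# The example data from this mail, as (subj, pred, obj) rows.
TRIPLES = [
    (":a", "dc10:title", "thing"),
    (":a", "dc10:creator", "John Smith"),
    (":a", "dc11:creator", "_:jb"),
    ("_:jb", "rdf:value", "Joe Bloggs"),
]

# Each OPTIONAL is an independent left join against the ?a binding.
# The second OPTIONAL has two triple patterns, so it is a subquery.
SEPARATE_QUERY = """
SELECT o1.obj AS name1, o2.name2 AS name2
FROM T AS t1
LEFT JOIN T AS o1
       ON o1.subj = t1.subj AND o1.pred = 'dc10:creator'
LEFT JOIN (
    SELECT t3.subj AS a, t4.obj AS name2
    FROM T AS t3
    JOIN T AS t4 ON t4.subj = t3.obj
    WHERE t3.pred = 'dc11:creator' AND t4.pred = 'rdf:value'
) AS o2 ON o2.a = t1.subj
WHERE t1.pred = 'dc10:title' AND t1.obj = 'thing'
"""


def run_separate(triples=TRIPLES):
    """Return (name1, name2) rows for the two-variable query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (subj TEXT, pred TEXT, obj TEXT)")
    conn.executemany("INSERT INTO T VALUES (?, ?, ?)", triples)
    rows = conn.execute(SEPARATE_QUERY).fetchall()
    conn.close()
    return rows


def coalesce(*values):
    """Client-side stand-in for COALESCE(): first non-NULL value, else None."""
    for value in values:
        if value is not None:
            return value
    return None


if __name__ == "__main__":
    for name1, name2 in run_separate():
        print(name1, name2, coalesce(name1, name2))
```

Here both names survive to the result, and the choice of which one to prefer is left to the caller rather than to the order of the OPTIONAL blocks.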

- Steve

Received on Thursday, 8 June 2006 10:12:47 UTC