- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Wed, 23 Mar 2005 14:29:54 +0000
- To: Geoff Chappell <geoff@sover.net>
- CC: public-rdf-dawg-comments@w3.org
Geoff Chappell wrote:
>
>
>>-----Original Message-----
>>From: Arjohn Kampman [mailto:arjohn.kampman@aduna.biz]
>>Sent: Wednesday, March 23, 2005 5:52 AM
>>To: Geoff Chappell
>>Cc: public-rdf-dawg-comments@w3.org
>>Subject: Re: Unbound vars as blank nodes
>>
>
> [...]
>
>>>SELECT ?x ?y
>>> WHERE { ?book dc10:title ?x }
>>>
>>>Logically speaking projection vars are existentially quantified, right? And
>>>that's what a blank node is, so it seems logically correct to return:
>>>
>>> ?x="Moby Dick", ?y=_:b1.
>>>
>>>I.e. the sentence:
>>> There exists ?x,?y such that ?x is the title of something
>>>essentially becomes:
>>> There exists ?y such that "Moby Dick" is the title of BK1
>>
>>Yikes! Apart from the fact that the above query should be flagged as
>>illegal (see my previous posting to this list), generating new bnodes
>>for unbound variables will make the QL even more complex than it already
>>is. Developers have learned to live with NULL values in the context of
>>SQL, so why would this be problematic for SPARQL?
>
>
> I'm not sure I buy the complexity argument... do you mean complex for the
> implementor or complex for the user? Either way, it doesn't strike me as too
> much of a burden. And I think the SQL/SPARQL analogies only get you so far.
> E.g. wrt to this issue, RDF has a built-in way to represent variables in
> results, SQL doesn't. Plus, NULLs carry their own load of controversy and
> confusion in the SQL world.
>
> That's not to say NULLs won't work. I think a perfectly workable solution is
> to require that all vars mentioned in a pattern element are bound to
> something by that pattern element -- if not to an actual value, then to NULL
> -- and that NULL != NULL. IMHO that would resolve the current execution
> ordering mess (I've heard statements to the contrary but I've never seen a
> counter example).
Geoff,
I extracted this from our previous discussion: could you check I've got the
example right?
Data::
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a foaf:name "Alice" .
_:b foaf:name "Bob" .
_:b foaf:mbox <mailto:bob@work.example> .
_:c foaf:mbox <mailto:noname@work.example.org> .
Query::
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE
  { ?x foaf:name ?name .
    OPTIONAL { ?x foaf:mbox ?mbox } .
    ?y foaf:mbox ?mbox .
  }
The first two lines of the pattern on their own give:
---------------------------------------
| name    | mbox                      |
=======================================
| "Alice" |                           |  <<---- ????
| "Bob"   | <mailto:bob@work.example> |
---------------------------------------
If ?mbox in the first solution is NULL (whether NULL = NULL or NULL != NULL)
then the third pattern
{ ?y foaf:mbox ?mbox }
does not match either of the triples
_:c foaf:mbox <mailto:noname@work.example.org>
_:b foaf:mbox <mailto:bob@work.example>
because ?mbox is bound to NULL (unless matching for NULL is special - I'm
treating it just as an assignable value, which might be my mistake). I find this
strange because the first and third triple patterns are independent.
The other partial solution matches
_:b foaf:mbox <mailto:bob@work.example>
so there is one solution
---------------------------------------
| name    | mbox                      |
=======================================
| "Bob"   | <mailto:bob@work.example> |
---------------------------------------
If ?mbox is unbound then
?name = "Alice" ?mbox = <mailto:noname@work.example.org>
and
?name = "Alice" ?mbox = <mailto:bob@work.example>
are solutions.
----------------------------------------------
| name    | mbox                             |
==============================================
| "Alice" | <mailto:bob@work.example>        |
| "Alice" | <mailto:noname@work.example.org> |
| "Bob"   | <mailto:bob@work.example>        |
----------------------------------------------
Reversing the lines:
?x foaf:name ?name .
?y foaf:mbox ?mbox .
OPTIONAL { ?x foaf:mbox ?mbox } .
after the first two (independent : cross product)
----------------------------------------------
| name    | mbox                             |
==============================================
| "Bob"   | <mailto:noname@work.example.org> |
| "Bob"   | <mailto:bob@work.example>        |
| "Alice" | <mailto:noname@work.example.org> |
| "Alice" | <mailto:bob@work.example>        |
----------------------------------------------
and the OPTIONAL does nothing (for either NULL or unbound models).
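The walk-through above can be replayed with a small script. This is only a sketch, assuming strict left-to-right evaluation and a NULL that is an ordinary assignable value equal to no RDF term. The names match, required, optional, run and the NULL marker are invented for illustration and do not come from any SPARQL implementation.

```python
# Sketch only: a toy left-to-right evaluator, not a SPARQL engine.
DATA = [
    ("_:a", "foaf:name", '"Alice"'),
    ("_:b", "foaf:name", '"Bob"'),
    ("_:b", "foaf:mbox", "<mailto:bob@work.example>"),
    ("_:c", "foaf:mbox", "<mailto:noname@work.example.org>"),
]

NULL = object()  # hypothetical marker: an assignable value equal to no RDF term


def match(pattern, binding):
    """Yield each extension of `binding` that makes `pattern` match a triple."""
    for triple in DATA:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind an unbound variable; a bound one (NULL included) must agree.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new


def required(pattern, solutions):
    """Fixed pattern: keep only the solutions that can be extended."""
    return [ext for s in solutions for ext in match(pattern, s)]


def optional(pattern, solutions, null_model):
    """Optional pattern: extend where possible, otherwise keep the solution."""
    out = []
    for s in solutions:
        extensions = list(match(pattern, s))
        if extensions:
            out.extend(extensions)
            continue
        kept = dict(s)
        if null_model:
            # NULL model: variables the optional failed to bind become NULL.
            for term in pattern:
                if term.startswith("?"):
                    kept.setdefault(term, NULL)
        out.append(kept)
    return out


NAME = ("?x", "foaf:name", "?name")
OPT_MBOX = ("?x", "foaf:mbox", "?mbox")
ANY_MBOX = ("?y", "foaf:mbox", "?mbox")


def run(order, null_model):
    """Evaluate the pattern elements in the given order and project ?name ?mbox."""
    solutions = [{}]
    for kind, pattern in order:
        if kind == "optional":
            solutions = optional(pattern, solutions, null_model)
        else:
            solutions = required(pattern, solutions)
    return [(s["?name"], s["?mbox"]) for s in solutions]


ORIGINAL = [("required", NAME), ("optional", OPT_MBOX), ("required", ANY_MBOX)]
REVERSED = [("required", NAME), ("required", ANY_MBOX), ("optional", OPT_MBOX)]

for label, order in (("original", ORIGINAL), ("reversed", REVERSED)):
    for null_model in (True, False):
        model = "NULL" if null_model else "unbound"
        print(label, model, run(order, null_model))
```

Under these assumptions the original order gives one row with the NULL model and three rows with the unbound model, while the reversed order gives the same four rows under both.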
What we have to do is find a way of saying "do inner joins first - don't have
two variables in optionals without being in fixed pattern". This can be via a
syntactic restriction (which may remove some OK queries as well, and which needs a
non-syntactic rule about variable usage across optionals, c.f. outer joins) or a
general restriction on the query structure.
A nearby issue arises for:
?v < 3 .
?x :p ?v .
SQL has a clear syntactic distinction but it forces more separation than
necessary and does not address:
?v math:lessThan 3 .
?x :p ?v .
(I'm ignoring the subjects-as-literals issue).
It is obvious what the application intended but a system can't naively ignore order.
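A rough sketch of why written order matters here, assuming elements are evaluated strictly in sequence and that a constraint over an unbound variable simply fails. That failure rule, the sample triples and all function names are assumptions made for illustration, not anything the draft specifies.

```python
# Sketch only: hypothetical in-order evaluation of a constraint and a pattern.
DATA = [("_:s1", ":p", 1), ("_:s2", ":p", 5)]  # invented sample triples


def pattern_x_p_v(solutions):
    """?x :p ?v, extending each incoming solution."""
    out = []
    for s in solutions:
        for subj, pred, obj in DATA:
            if pred != ":p":
                continue
            # An already bound ?x or ?v must agree with the triple.
            if s.get("?x", subj) == subj and s.get("?v", obj) == obj:
                out.append({**s, "?x": subj, "?v": obj})
    return out


def constraint_v_lt_3(solutions):
    """?v < 3; assumption: an unbound ?v makes the constraint fail."""
    return [s for s in solutions if "?v" in s and s["?v"] < 3]


def evaluate(steps):
    """Apply each step to the solutions produced by the previous one."""
    solutions = [{}]
    for step in steps:
        solutions = step(solutions)
    return solutions


as_written = evaluate([constraint_v_lt_3, pattern_x_p_v])
constraint_last = evaluate([pattern_x_p_v, constraint_v_lt_3])
print(as_written)       # constraint ran before ?v was bound
print(constraint_last)  # constraint ran once ?v was bound
```

With this failure rule the order as written yields no solutions, while running the constraint after the pattern yields the single solution with ?v = 1.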
My other worry about a syntactic restriction is more about large queries.
Forcing a gap means that the application writer has to associate one part of the
query with another - like not putting the condition on a variable next to the
variable.
Andy
ref:
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Feb/0012.html
> The current approach to specifying preferred execution
> orders is just too fragile. It will be a major obstruction to future
> versions of the language - e.g. good luck using sparql (squint and construct
> looks like a rule construction) as any sort of a rule language with all of
> these ordering dependencies built-in.
>
>
>>[...]
>>
>>>Now for optionals....
>>>
>>>SELECT ?x ?y
>>> WHERE { ?book dc10:title ?x. OPTIONAL ?book ex:author ?y.}
>>>
>>>The result:
>>>
>>> ?x="Moby Dick" ?y=_:b1
>>>
>>>seems reasonable - i.e. we know the book has an author (that's what we've
>>>implied by using optional) we just don't know what it is.
>>
>>This is not true: the optional implies that the book can have an author,
>>not that it actually has one. From a developer POV, it's important to
>>make this distinction. Returning bnodes for unbound variables suggests
>>that it actually was bound.
>
>
> Well, I guess I'd say that optional implies whatever optional is specified
> to imply. But I'll agree it's a weakness of this approach that a user
> couldn't necessarily distinguish between a "real" and an "artificial" value.
>
>
> - Geoff
>
>
>
>
>
Received on Wednesday, 23 March 2005 14:57:24 UTC