- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Wed, 23 Mar 2005 14:29:54 +0000
- To: Geoff Chappell <geoff@sover.net>
- CC: public-rdf-dawg-comments@w3.org
Geoff Chappell wrote:
>
>>-----Original Message-----
>>From: Arjohn Kampman [mailto:arjohn.kampman@aduna.biz]
>>Sent: Wednesday, March 23, 2005 5:52 AM
>>To: Geoff Chappell
>>Cc: public-rdf-dawg-comments@w3.org
>>Subject: Re: Unbound vars as blank nodes
>>
>
> [...]
>
>>>SELECT ?x ?y
>>> WHERE { ?book dc10:title ?x }
>>>
>>>Logically speaking projection vars are existentially quantified, right? And
>>>that's what a blank node is, so it seems logically correct to return:
>>>
>>> ?x="Moby Dick", ?y=_:b1.
>>>
>>>I.e. the sentence:
>>> There exists ?x,?y such that ?x is the title of something
>>>essentially becomes:
>>> There exists ?y such that "Moby Dick" is the title of BK1
>>
>>Yikes! Apart from the fact that the above query should be flagged as
>>illegal (see my previous posting to this list), generating new bnodes
>>for unbound variables will make the QL even more complex than it already
>>is. Developers have learned to live with NULL values in the context of
>>SQL, so why would this be problematic for SPARQL?
>
>
> I'm not sure I buy the complexity argument... do you mean complex for the
> implementor or complex for the user? Either way, it doesn't strike me as too
> much of a burden. And I think the SQL/SPARQL analogies only get you so far.
> E.g. wrt to this issue, RDF has a built-in way to represent variables in
> results, SQL doesn't. Plus, NULLs carry their own load of controversy and
> confusion in the SQL world.
>
> That's not to say NULLs won't work. I think a perfectly workable solution is
> to require that all vars mentioned in a pattern element are bound to
> something by that pattern element -- if not to an actual value, then to NULL
> -- and that NULL != NULL. IMHO that would resolve the current execution
> ordering mess (I've heard statements to the contrary but I've never seen a
> counter example).

Geoff, I extracted this from our previous discussion: could you check I've
got the example right?

Data::

  @prefix foaf: <http://xmlns.com/foaf/0.1/> .

  _:a foaf:name "Alice" .
  _:b foaf:name "Bob" .
  _:b foaf:mbox <mailto:bob@work.example> .
  _:c foaf:mbox <mailto:noname@work.example.org> .

Query::

  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  SELECT ?name ?mbox
  WHERE {
    ?x foaf:name ?name .
    OPTIONAL { ?x foaf:mbox ?mbox } .
    ?y foaf:mbox ?mbox .
  }

The first two lines of the pattern on their own give:

---------------------------------------
| name    | mbox                      |
=======================================
| "Alice" |                           |  <<---- ????
| "Bob"   | <mailto:bob@work.example> |
---------------------------------------

If ?mbox in the first solution is NULL (whether NULL = NULL or NULL != NULL)
then the third pattern { ?y foaf:mbox ?mbox } does not match either of the
triples

  _:c foaf:mbox <mailto:noname@work.example.org>
  _:b foaf:mbox <mailto:bob@work.example>

because ?mbox is bound to NULL (unless matching for NULL is special - I'm
treating it just as an assignable value, which might be my mistake). I find
this strange because the first and third triple patterns are independent.

The other partial solution matches

  _:b foaf:mbox <mailto:bob@work.example>

so there is one solution

---------------------------------------
| name    | mbox                      |
=======================================
| "Bob"   | <mailto:bob@work.example> |
---------------------------------------

If ?mbox is unbound then

  ?name = "Alice" ?mbox = <mailto:noname@work.example.org>

and

  ?name = "Alice" ?mbox = <mailto:bob@work.example>

are solutions.

----------------------------------------------
| name    | mbox                             |
==============================================
| "Alice" | <mailto:bob@work.example>        |
| "Alice" | <mailto:noname@work.example.org> |
| "Bob"   | <mailto:bob@work.example>        |
----------------------------------------------

Reversing the lines:

  ?x foaf:name ?name .
  ?y foaf:mbox ?mbox .
  OPTIONAL { ?x foaf:mbox ?mbox } .

after the first two (independent : cross product)

----------------------------------------------
| name    | mbox                             |
==============================================
| "Bob"   | <mailto:noname@work.example.org> |
| "Bob"   | <mailto:bob@work.example>        |
| "Alice" | <mailto:noname@work.example.org> |
| "Alice" | <mailto:bob@work.example>        |
----------------------------------------------

and the OPTIONAL does nothing (for either NULL or unbound models).

What we have to do is find a way of saying "do inner joins first - don't
have two variables in optionals without being in fixed pattern". This can be
via syntactic restriction (which may remove some OK queries as well) (and it
needs a non-syntactic rule about variable usage across optionals, c.f. outer
joins) or a general restriction on the query structure.

A nearby issue arises for:

  ?v < 3 .
  ?x :p ?v .

SQL has a clear syntactic distinction but it forces more separation than
necessary and does not address:

  ?v math:lessThan 3 .
  ?x :p ?v .

(I'm ignoring the subjects-as-literals issue). It is obvious what the
application intended but a system can't naively ignore order.

My other worry about a syntactic restriction is more about large queries.
Forcing a gap means that the application writer has to associate one part of
the query with another - like not putting the condition on a variable next
to the variable.

	Andy

ref:
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Feb/0012.html

> The current approach to specifying preferred execution
> orders is just too fragile. It will be a major obstruction to future
> versions of the language - e.g. good luck using sparql (squint and construct
> looks like a rule construction) as any sort of a rule language with all of
> these ordering dependencies built-in.
>
>
>>[...]
>>
>>>Now for optionals....
>>>
>>>SELECT ?x ?y
>>> WHERE { ?book dc10:title ?x. OPTIONAL ?book ex:author ?y.}
>>>
>>>The result:
>>>
>>> ?x="Moby Dick" ?y=_:b1
>>>
>>>seems reasonable - i.e. we know the book has an author (that's what we've
>>>implied by using optional) we just don't know what it is.
>>
>>This is not true: the optional implies that the book can have an author,
>>not that it actually has one. From a developer POV, it's important to
>>make this distinction. Returning bnodes for unbound variables suggests
>>that it actually was bound.
>
>
> Well, I guess I'd say that optional implies whatever optional is specified
> to imply. But I'll agree it's a weakness of this approach that a user
> couldn't necessarily distinguish between a "real" and an "artificial" value.
>
>
> - Geoff
>
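
The foaf ordering example in Andy's reply can be replayed with a small Python sketch. It is only an illustration under assumptions that are not in the message: patterns are evaluated strictly left to right, a failed OPTIONAL either binds its free variables to a NULL marker or leaves them unbound, and NULL is treated as an ordinary assignable value that never equals a data term. The helper names (`match`, `basic`, `optional`, `evaluate`) and the string encoding of terms are hypothetical.

```python
NULL = object()  # hypothetical NULL marker; never equal to any data term

# The four triples from the example, with terms encoded as plain strings.
DATA = [
    ("_:a", "foaf:name", "Alice"),
    ("_:b", "foaf:name", "Bob"),
    ("_:b", "foaf:mbox", "mailto:bob@work.example"),
    ("_:c", "foaf:mbox", "mailto:noname@work.example.org"),
]

def match(pattern, solution):
    """Return every extension of `solution` that matches one triple in DATA."""
    results = []
    for triple in DATA:
        candidate = dict(solution)
        for term, value in zip(pattern, triple):
            if not term.startswith("?"):
                if term != value:           # constant term must match exactly
                    break
            elif term not in candidate:
                candidate[term] = value     # unbound variable: bind it
            elif candidate[term] != value:  # bound variable (or NULL) must agree
                break
        else:
            results.append(candidate)
    return results

def basic(pattern):
    """Mandatory pattern: solutions with no match are dropped."""
    def step(solutions, use_null):
        return [ext for s in solutions for ext in match(pattern, s)]
    return step

def optional(pattern):
    """Optional pattern: solutions with no match are kept."""
    def step(solutions, use_null):
        out = []
        for s in solutions:
            extensions = match(pattern, s)
            if extensions:
                out.extend(extensions)
                continue
            kept = dict(s)
            if use_null:
                # NULL model: bind the pattern's still-free variables to NULL.
                for term in pattern:
                    if term.startswith("?"):
                        kept.setdefault(term, NULL)
            out.append(kept)
        return out
    return step

def evaluate(steps, use_null):
    """Apply the steps in order and project onto (?name, ?mbox)."""
    solutions = [{}]
    for step in steps:
        solutions = step(solutions, use_null)
    return {(s["?name"], s.get("?mbox")) for s in solutions}

NAME = ("?x", "foaf:name", "?name")
X_MBOX = ("?x", "foaf:mbox", "?mbox")
Y_MBOX = ("?y", "foaf:mbox", "?mbox")

original_order = [basic(NAME), optional(X_MBOX), basic(Y_MBOX)]
reversed_order = [basic(NAME), basic(Y_MBOX), optional(X_MBOX)]
```

Under these assumptions, `evaluate(original_order, use_null=True)` keeps only the "Bob" row, `evaluate(original_order, use_null=False)` gives the three-row table, and `reversed_order` gives the four-row cross product under either model, so the result depends on both the NULL/unbound choice and the pattern order.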
Received on Wednesday, 23 March 2005 14:57:24 UTC