- From: Bob MacGregor <bmacgregor@siderean.com>
- Date: Thu, 24 Mar 2005 00:08:57 -0800
- To: andy.seaborne@hp.com
- CC: public-rdf-dawg-comments@w3.org, Pat Hayes <phayes@ihmc.us>
Andy, Thanks for working out a detailed example. It nicely illustrates how the construct I'm proposing for optional works, yielding exactly the solution that you deem correct. Possibly adding in a one-to-many relationship would make it even clearer to some folks, but this should suffice ... Seaborne, Andy wrote: > > RDF is semi-structured. The use task that motivates OPTIONAL is > accessing an RDF graph where some of the data is missing. Experience > has shown that this is a real issue for application writers. I agree completely. That why our implementation of an RDF query language has OPTIONAL (as well as conventional disjunction). > > Example: > > "Get the FOAF name and nick" from a database of people which uses > FOAF. foaf:name and foaf:nick do not necessarily exist for each > person. Assuming the well-formedness of the FOAF RDF there is an mbox > (which may be unknown - we'll assume it is in the graph for now). > > -------- > @prefix foaf: <http://xmlns.com/foaf/0.1/> . > > _:a foaf:mbox <mailto:alice@example.net> . > _:a foaf:name "Alice" . > _:a foaf:nick "WhoMe?" . > > _:b foaf:mbox <mailto:bert@example.net> . > _:b foaf:name "Bert" . > > _:e foaf:mbox <mailto:eve@example.net> . > _:e foaf:nick "DuckSoup" . > -------- > PREFIX foaf: <http://xmlns.com/foaf/0.1/> > > SELECT ?mbox ?name ?nick > { > ?x foaf:mbox ?mbox . > OPTIONAL { ?x foaf:name ?name } . > OPTIONAL { ?x foaf:nick ?nick } . > } > -------- > > giving something like: > > ----------------------------------------------------- > | mbox | name | nick | > ===================================================== > | <mailto:alice@example.net> | "Alice" | "WhoMe?" | > | <mailto:bert@example.net> | "Bert" | | > | <mailto:eve@example.net> | | "DuckSoup" | > ----------------------------------------------------- > > Defining OPTIONAL as "{(?a my:bar ?c) OR TRUE}" > -------- > PREFIX foaf: <http://xmlns.com/foaf/0.1/> > > SELECT ?mbox ?name ?nick > { > ?x foaf:mbox ?mbox . > { ?x foaf:name ?name } OR {} . > { ?x foaf:nick ?nick } OR {} . > } > -------- > > [[ > {} is "TRUE "- it is a graph pattern which imposes no limitation on > the query solutions. SPARQL patterns are limitations on the space of > all possible solutions. > ]] > > Would I be right in saying that this gives something like: > > ----------------------------------------------------- > | mbox | name | nick | > ===================================================== > | <mailto:alice@example.net> | "Alice" | "WhoMe?" | > | <mailto:alice@example.net> | "Alice" | | > | <mailto:alice@example.net> | | "WhoMe?" | > | <mailto:alice@example.net> | | | > | <mailto:bert@example.net> | "Bert" | | > | <mailto:bert@example.net> | | | > | <mailto:eve@example.net> | | "DuckSoup" | > | <mailto:eve@example.net> | | | > ----------------------------------------------------- > > that is, for example, 4 rows for alice because there are 4 ways to > match the pattern against the graph. > > If "OR" only evaluates the RHS is the LHS is false (programming > language style) we are back in order dependent territory. Even if the > two subpatterns are variable compatible: {:x :p ?v } OR { :x :q ?v } > it seems that pattern matching should return all possibilities. > > Defining "OPTIONAL P" as "P OR true" may evaluate as a boolean > expression to be the same but the solutions are different. It is the > a trade-off of application task and computational model, including the > support cost of explaining to people how to achieve their application > task. > > It seems rather hard in to turn the multiple rows form back into the > first form by adding rules about variable usage and doing a sort. > That would seem to be little different to where we are with OPTIONAL. > > What am I missing? I see how you got the result you did, but your evaluation was flawed. Assume that we are defining OPTIONAL as "{(?a my:bar ?c) OR TRUE}" (i.e., using my expansion): We expand out the evaluation of the above query into clausal form (using some abbreviations), yielding: {?mbox =<alice> and (?name="alice" or true) and (?nick="who" or true)} or {?mbox =<bert> and (?name="bert" or true) and (false or true)} or {?mbox =<eve> and (false or true) and (?nick="duck" or true)} This simplifies to {?mbox =<alice> and (?name="alice") and (?nick="who"} or {?mbox =<bert> and (?name="bert") } or {?mbox =<eve> and (?nick="duck")} Which yields exactly the set of bindings that you indicated that you wish to see. If your or someone else's reasoner processes the (... OR true) construct differently on this example, then there is probably something wrong with the way it is implementing disjunction. Next, let me point out how you may have gotten the wrong interpretation. The answer that you suggested my construct would get looks as follows, expanded into clausal form: {?mbox =<alice> and (?name="alice") and (?nick="who"} or {?mbox =<alice> and (?name="alice") } or {?mbox =<alice> and (?nick="who"} or {?mbox =<bert> } or {?mbox =<bert> and (?name="bert") } or {?mbox =<bert> } or {?mbox =<eve> and (?nick="duck")} or {?mbox =<bert> } In fact, this last answer is logically identical to the one just above it. However, some of the disjuncts are subsumed by others, so a simplifier would eliminate them. In other words, from a logic standpoint, these two different solutions are in fact identical. What does that tell us? One thing that it tells us is that the 3 row table and the 8 row table that you listed above contain exactly the same information, when using a standard logical interpretation. However, suppose we decide to treat "UNBOUND" as a value, rather as the absence of a binding. Then our mapping into logic breaks down. This appears to be a mistake that several of the committee members and/or outside participants are making. Repeatedly, I am observing semantics being argued by reference to how a processor might generate an answer. That's not how a logical semantics get defined. Semantics should be argued by an appeal to or mapping to standard logic, as I've just demonstrated. After you have a logical interpretation worked out, then you should be able to safely introduce a value for UNBOUND, since its obvious that something (not a bnode) needs to be returned to occupy the corresponding position in a SELECT clause. I have yet to see any of the discussants make reference to "recursive fixed-point" semantics. Instead, discussions of "order of evaluation" keep popping up (including in the SPARQL document). Recursive fixed-point semantics is, I believe, only one approach to defining a declarative language, but its one that works, whereas appeals to ordering don't have any place in a declarative language framework. You do have a prominent committee member who can provide you with better guidance than I can on how to apply fixed-point semantics to SPARQL. Cheers, Bob
Received on Thursday, 24 March 2005 08:09:54 UTC