- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Thu, 05 Jan 2006 13:57:21 +0000
- To: Pat Hayes <phayes@ihmc.us>
- CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>, Dan Connolly <connolly@w3.org>, Enrico Franconi <franconi@inf.unibz.it>, Sergio Tessaris <tessaris@inf.unibz.it>
Pat Hayes wrote: > Guys, a thought occurs to me regarding the > 'basic' definitions and how to simplify them. > > Now we have got rid of told bnodes, the only > reason for introducing any complexity into the > 'naive definition' of answer binding B, viz > > G (simply) entails B(Q) > > is to get rid of the inevitable redundancy that > this produces when it is read as an 'iff' > definition of answer binding, arising from the > fact that if B is an answer then any C which is > like B but inserts a different bnode instead of > some binding in B, is also an answer, by this > naive criterion. So, if ?x/foo is an answer then > we get infinitely many answers of the form > ?x/_:1, ?x/_:2, etc.. So, we put in this > condition that only bnodes from G may occur, > which itself kind of smells bad because it > requires answer-binding bnodes to be meaningful > outside their scope: but then we still get a > large finite amount of redundancy, as noted by > Sergio > http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0001.html, > so we have to tweak the definition with the (G > entails (G union...)) trick, and then we get all > embroiled in exactly how to specify the union, > which is subtle and delicate, and so on. Sigh. > > All of this definitional twisting arises because > we feel a need to somehow get rid of silly > redundancies in the answers. So, here's an > alternative proposal. We don't get rid of them by > tweaking the definition: we leave the definition > naive, think of it as a necessary condition on > answer bindings, and we add a remark about answer > sets, that servers are not obliged to deliver > 'redundant' answers. (Not that they are required > to be non-redundant, but that they are free to > omit redundancy.) And we say explicitly that if A > and B are two answers such that A(Q) is an > instance of B(Q) (in the traditional RDF sense of > instance, ie mapping bnodes to terms) then B is > redundant. So if there are answers ?x/foo and > ?x/_:75, then the second is redundant: and if > there are answers ?x/_:22 and ?x/_:47, then one > of them is redundant provided that the other is > given as an answer. What's more, the test case format defines a test to pass by forming the result set graph and using logical equivalence. http://www.w3.org/2001/sw/DataAccess/tests/README.html """ A test passes if the graph from the action is logically equivalent to the graph named in the result. "Logical equivalence " can be tested by eliminating redundant bNodes in both graphs and testing if the graphs are isomorphic: same shape, same labels. """ Andy > > This completely cuts through the definitional > maze, since we can keep the direct, naive, > unqualified definition of 'answer binding', and > smile happily at the observation that there are > infinitely many answers: yes, there are, but all > but finitely many of them are redundant. It also > means that we can say with a straight face that > bnodes in answers are completely scoped to the > answer, and have no connection to bnodeIDs in the > graph. It also means that redundancy of answers > can be locally checked just by looking at the > answer bindings themselves, not by checking the > entire graph topology (as with the "G entails (G > union...)" trick.) It also, BTW, allows future > extensions of SPARQL which use non-simple > entailment to also use stronger notions of > 'redundant', e.g. to decide that, say, RDFS > tautologies are also redundant, and to omit those > bindings from answer sets. Or not: but the point > is that the flexibility is there, without needing > to mess around with issues like whether or not > this is 'real' RDFS entailment. We have a way to > separate the bounds on answers into lower bounds > which are always some kind of entailment and > upper bounds, which are some kind of > redundancy-avoidance criterion. Within those > bounds, servers must deliver all the answers. > > Now, this does ALLOW a dumb server to be highly > redundant in the answers it gives, like in > Sergio's > > _:a :p _:b > _:b :url <http://example.org> > > example in > http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0001.html. > But OK, so it does: redundant answers are not > actually wrong, after all, only redundant: and we > have already considered the idea of requiring > non-redundancy, and rejected it as too high a > standard. I suggest that it might be good to > allow a very simple-minded conforming SPARQL > server to be messily redundant, if such a thing > could be implemented very easily. Normal > commercial pressures will ensure that smarter > SPARQL engines will get written, but we don't > need to mandate that dumb ones should not be > conformant. > > If y'all like this idea, I can do a run through > http://www.w3.org/2001/sw/DataAccess/rq23/ to > tweak the wordings to fit it. It should, if > anything, get simpler. > > Pat > >
Received on Thursday, 5 January 2006 13:57:37 UTC