- From: Pat Hayes <phayes@ihmc.us>
- Date: Tue, 3 Jan 2006 11:32:34 -0600
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
- Cc: Dan Connolly <connolly@w3.org>, Enrico Franconi <franconi@inf.unibz.it>, Sergio Tessaris <tessaris@inf.unibz.it>
Guys, a thought occurs to me regarding the 'basic' definitions and how to simplify them. Now we have got rid of told bnodes, the only reason for introducing any complexity into the 'naive definition' of answer binding B, viz G (simply) entails B(Q) is to get rid of the inevitable redundancy that this produces when it is read as an 'iff' definition of answer binding, arising from the fact that if B is an answer then any C which is like B but inserts a different bnode instead of some binding in B, is also an answer, by this naive criterion. So, if ?x/foo is an answer then we get infinitely many answers of the form ?x/_:1, ?x/_:2, etc.. So, we put in this condition that only bnodes from G may occur, which itself kind of smells bad because it requires answer-binding bnodes to be meaningful outside their scope: but then we still get a large finite amount of redundancy, as noted by Sergio http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0001.html, so we have to tweak the definition with the (G entails (G union...)) trick, and then we get all embroiled in exactly how to specify the union, which is subtle and delicate, and so on. Sigh. All of this definitional twisting arises because we feel a need to somehow get rid of silly redundancies in the answers. So, here's an alternative proposal. We don't get rid of them by tweaking the definition: we leave the definition naive, think of it as a necessary condition on answer bindings, and we add a remark about answer sets, that servers are not obliged to deliver 'redundant' answers. (Not that they are required to be non-redundant, but that they are free to omit redundancy.) And we say explicitly that if A and B are two answers such that A(Q) is an instance of B(Q) (in the traditional RDF sense of instance, ie mapping bnodes to terms) then B is redundant. So if there are answers ?x/foo and ?x/_:75, then the second is redundant: and if there are answers ?x/_:22 and ?x/_:47, then one of them is redundant provided that the other is given as an answer. This completely cuts through the definitional maze, since we can keep the direct, naive, unqualified definition of 'answer binding', and smile happily at the observation that there are infinitely many answers: yes, there are, but all but finitely many of them are redundant. It also means that we can say with a straight face that bnodes in answers are completely scoped to the answer, and have no connection to bnodeIDs in the graph. It also means that redundancy of answers can be locally checked just by looking at the answer bindings themselves, not by checking the entire graph topology (as with the "G entails (G union...)" trick.) It also, BTW, allows future extensions of SPARQL which use non-simple entailment to also use stronger notions of 'redundant', e.g. to decide that, say, RDFS tautologies are also redundant, and to omit those bindings from answer sets. Or not: but the point is that the flexibility is there, without needing to mess around with issues like whether or not this is 'real' RDFS entailment. We have a way to separate the bounds on answers into lower bounds which are always some kind of entailment and upper bounds, which are some kind of redundancy-avoidance criterion. Within those bounds, servers must deliver all the answers. Now, this does ALLOW a dumb server to be highly redundant in the answers it gives, like in Sergio's _:a :p _:b _:b :url <http://example.org> example in http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0001.html. But OK, so it does: redundant answers are not actually wrong, after all, only redundant: and we have already considered the idea of requiring non-redundancy, and rejected it as too high a standard. I suggest that it might be good to allow a very simple-minded conforming SPARQL server to be messily redundant, if such a thing could be implemented very easily. Normal commercial pressures will ensure that smarter SPARQL engines will get written, but we don't need to mandate that dumb ones should not be conformant. If y'all like this idea, I can do a run through http://www.w3.org/2001/sw/DataAccess/rq23/ to tweak the wordings to fit it. It should, if anything, get simpler. Pat -- --------------------------------------------------------------------- IHMC (850)434 8903 or (650)494 3973 home 40 South Alcaniz St. (850)202 4416 office Pensacola (850)202 4440 fax FL 32502 (850)291 0667 cell phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
Received on Wednesday, 4 January 2006 19:08:50 UTC