W3C home > Mailing lists > Public > public-rdf-dawg@w3.org > January to March 2006

Re: SPARQL semantics: open issues for basic query patterns

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Thu, 05 Jan 2006 13:57:21 +0000
Message-ID: <43BD25C1.8030109@hp.com>
To: Pat Hayes <phayes@ihmc.us>
CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>, Dan Connolly <connolly@w3.org>, Enrico Franconi <franconi@inf.unibz.it>, Sergio Tessaris <tessaris@inf.unibz.it>



Pat Hayes wrote:
> Guys, a thought occurs to me regarding the 
> 'basic' definitions and how to simplify them.
> 
> Now we have got rid of told bnodes, the only 
> reason for introducing any complexity into the 
> 'naive definition' of answer binding B, viz
> 
> G (simply) entails B(Q)
> 
> is to get rid of the inevitable redundancy that 
> this produces when it is read as an 'iff' 
> definition of answer binding, arising from the 
> fact that if B is an answer then any C which is 
> like B but inserts a different bnode instead of 
> some binding in B, is also an answer, by this 
> naive criterion. So, if ?x/foo is an answer then 
> we get infinitely many answers of the form 
> ?x/_:1, ?x/_:2, etc.. So, we put in this 
> condition that only bnodes from G may occur, 
> which itself kind of smells bad because it 
> requires answer-binding bnodes to be meaningful 
> outside their scope: but then we still get a 
> large finite amount of redundancy, as noted by 
> Sergio 
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0001.html, 
> so we have to tweak the definition with the (G 
> entails (G union...)) trick, and then we get all 
> embroiled in exactly how to specify the union, 
> which is subtle and delicate, and so on. Sigh.
> 
> All of this definitional twisting arises because 
> we feel a need to somehow get rid of silly 
> redundancies in the answers. So, here's an 
> alternative proposal. We don't get rid of them by 
> tweaking the definition: we leave the definition 
> naive, think of it as a necessary condition on 
> answer bindings, and we add a remark about answer 
> sets, that servers are not obliged to deliver 
> 'redundant' answers. (Not that they are required 
> to be non-redundant, but that they are free to 
> omit redundancy.) And we say explicitly that if A 
> and B are two answers such that A(Q) is an 
> instance of B(Q) (in the traditional RDF sense of 
> instance, ie mapping bnodes to terms) then B is 
> redundant. So if there are answers ?x/foo and 
> ?x/_:75, then the second is redundant: and if 
> there are answers ?x/_:22 and ?x/_:47, then one 
> of them is redundant provided that the other is 
> given as an answer.

What's more, the test case format defines a test to pass by forming the result 
set graph and using logical equivalence.

http://www.w3.org/2001/sw/DataAccess/tests/README.html

"""
A test passes if the graph from the action is logically equivalent to the 
graph named in the result. "Logical equivalence " can be tested by eliminating 
redundant bNodes in both graphs and testing if the graphs are isomorphic: same 
shape, same labels.
"""

	Andy

> 
> This completely cuts through the definitional 
> maze, since we can keep the direct, naive, 
> unqualified definition of 'answer binding', and 
> smile happily at the observation that there are 
> infinitely many answers: yes, there are, but all 
> but finitely many of them are redundant. It also 
> means that we can say with a straight face that 
> bnodes in answers are completely scoped to the 
> answer, and have no connection to bnodeIDs in the 
> graph. It also means that redundancy of answers 
> can be locally checked just by looking at the 
> answer bindings themselves, not by checking the 
> entire graph topology (as with the "G entails (G 
> union...)" trick.) It also, BTW, allows future 
> extensions of SPARQL which use non-simple 
> entailment to also use stronger notions of 
> 'redundant', e.g. to decide that, say, RDFS 
> tautologies are also redundant, and to omit those 
> bindings from answer sets. Or not: but the point 
> is that the flexibility is there, without needing 
> to mess around with issues like whether or not 
> this is 'real' RDFS entailment. We have a way to 
> separate the bounds on answers into lower bounds 
>  which are always some kind of entailment  and 
> upper bounds, which are some kind of 
> redundancy-avoidance criterion. Within those 
> bounds, servers must deliver all the answers.
> 
> Now, this does ALLOW a dumb server to be highly 
> redundant in the answers it gives, like in 
> Sergio's
> 
> _:a :p _:b
> _:b :url <http://example.org>
> 
> example in 
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0001.html. 
> But OK, so it does: redundant answers are not 
> actually wrong, after all, only redundant: and we 
> have already considered the idea of requiring 
> non-redundancy, and rejected it as too high a 
> standard. I suggest that it might be good to 
> allow a very simple-minded conforming SPARQL 
> server to be messily redundant, if such a thing 
> could be implemented very easily. Normal 
> commercial pressures will ensure that smarter 
> SPARQL engines will get written, but we don't 
> need to mandate that dumb ones should not be 
> conformant.
> 
> If y'all like this idea, I can do a run through 
> http://www.w3.org/2001/sw/DataAccess/rq23/ to 
> tweak the wordings to fit it. It should, if 
> anything, get simpler.
> 
> Pat
> 
> 
Received on Thursday, 5 January 2006 13:57:37 GMT

This archive was generated by hypermail 2.3.1 : Tuesday, 26 March 2013 16:15:25 GMT