- From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
- Date: Tue, 29 Sep 2009 20:18:38 +0100
- To: SPARQL Working Group <public-rdf-dawg@w3.org>
Andy, others,

I would like to resolve the issue about handling inconsistencies, so I am coming back to it.

From your explanations it is still not completely clear to me how we can specify the behavior for inconsistent graphs properly, and I would like to understand the issue better. If I don't understand it well, it is hard to see all the consequences of the changes. I am OK with a text similar to what the OWL spec says for OWL RL, but that text is much more explicit than just saying "A system MAY raise an error if the queried graph is inconsistent". Such a statement does not satisfy the conditions that the SPARQL 1.0 spec places on entailment regimes. If we have a MAY or a SHOULD (which I much prefer), then we have to explain how we guarantee finite answer sets, because the above statement leaves it open for systems to just produce infinitely many answers.

I would like to consider a simple example. You have a graph containing

  :a :b ">"^^rdf:XMLLiteral .
  :b rdfs:range rdf:XMLLiteral .

which is the shortest way to state an inconsistency. Assume the query is

  ASK { :a rdf:type :c . }

The entailment holds by definition of RDFS entailment, since an inconsistent graph entails everything. If I go to the OWL spec [1] and look at OWL RL entailment checkers (I can equivalently just check whether the graph entails the graph containing just the triple :a rdf:type :c, and the data is legal OWL RL), then I find that an OWL RL entailment checker MUST NOT return false. It SHOULD return true, but unknown is not excluded and it is not necessarily required to terminate. Giving false as an answer is against the spec and would mean the system is unsound, which is not nice. I can imagine a similar definition as for OWL RL adapted to RDFS and RDFS entailment, but I think you want to allow false as an answer, right?
Now I try to understand how the system you have in mind would work, and please tell me if I am wrong. You are given a graph (some URI of it) that you have not loaded and have not seen before, so you don't know what data it contains and you don't know that it contains an inconsistency such as the one above. Let's also assume, for simplicity, that your query contains just a single triple pattern, e.g., ?x rdf:type ?y.

Now you start loading the triples for that graph, and while you load the data you see if you can find bindings. So if you parse :a rdf:type :b, you take x->:a, y->:b as a solution. Because you don't want to buffer all solutions and let the user wait for them, you return a solution as soon as you find it. While you are loading the data (or once you have loaded it), you also apply some rules (after all, we use the RDFS entailment regime). E.g., when you parse :b rdfs:subClassOf :c, you know that x->:a, y->:c is a solution and you return it. Am I right so far?

Assuming I am right so far, I can imagine two cases: you find the inconsistency or you don't.

Let's assume first that you find the inconsistency. What will happen? You have already sent some solutions to the client. Again, there are different ways to go:

- You can issue a warning saying that there was an inconsistency, but keep returning answers as if there was no inconsistency. That will obviously terminate (result in finitely many answers), and the user knows that there was an inconsistency that might need fixing (mostly, inconsistencies are unintended).
- You could also not tell the user at all, but that is not nice, I think.
- You can raise an error at that point, say that the graph contains an inconsistency and that everything you said before is entailed, and stop giving more answers because everything is trivially entailed.

Am I right so far? Any strong preferences?
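To make the streaming scenario described above concrete, here is a minimal Python sketch of how such a system might return bindings for ?x rdf:type ?y while triples are still arriving. It is only an illustration under stated assumptions: the function name `stream_type_solutions`, the representation of triples as tuples of prefixed-name strings, and the restriction to the subclass rule alone are all hypothetical and not taken from any actual implementation. A complete RDFS reasoner would apply further rules (and axiomatic triples) and would therefore return more answers; no inconsistency check is attempted here.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def stream_type_solutions(triples):
    """Yield (x, y) bindings for the pattern ?x rdf:type ?y as triples arrive.

    Only the subclass rule is applied (hypothetical simplification):
    x rdf:type c  and  c rdfs:subClassOf d  gives  x rdf:type d.
    """
    types = defaultdict(set)         # instance -> classes derived so far
    instances = defaultdict(set)     # class -> instances derived so far
    superclasses = defaultdict(set)  # class -> superclasses asserted so far

    def derive(x, y):
        # Emit each new binding once, then follow the subclass edges seen so far.
        pending = [(x, y)]
        while pending:
            inst, cls = pending.pop()
            if cls in types[inst]:
                continue  # already returned to the client
            types[inst].add(cls)
            instances[cls].add(inst)
            yield inst, cls
            pending.extend((inst, sup) for sup in superclasses[cls])

    for s, p, o in triples:
        if p == RDF_TYPE:
            yield from derive(s, o)
        elif p == SUBCLASS:
            superclasses[s].add(o)
            # Known instances of s are now also instances of o.
            for inst in list(instances[s]):
                yield from derive(inst, o)


if __name__ == "__main__":
    data = [
        (":a", "rdf:type", ":b"),
        (":b", "rdfs:subClassOf", ":c"),
    ]
    for x, y in stream_type_solutions(data):
        print(f"x->{x}, y->{y}")
```

Because solutions are yielded before the whole graph has been read, anything already sent cannot be retracted if an inconsistency is discovered later, which is why the choice between warning, staying silent, and raising an error matters.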
Now let's assume you encounter something that would be an inconsistency but you don't recognise it. So you go through the graph, you apply some RDFS rule and derive _:1 rdf:type rdf:XMLLiteral, and _:1 is assigned to some malformed XML literal, say ">"^^rdf:XMLLiteral. That is actually the only pattern for RDFS inconsistencies, as I understand it. In this case, you don't recognise the inconsistency because you don't check whether ">" is a valid lexical form. Is that something that you think can happen and should be allowed under RDFS entailment?

In that case, you would obviously not give infinitely many answers, but you are incomplete. You possibly return more answers than you would get from simple entailment, but you also didn't apply all RDFS rules, or rather you ignored that under the RDFS entailment rules you have to check the lexical forms of XML literals. If that can happen, then you would most likely answer the ASK query above with "no", right?

Now what I am not sure about is: can it happen that you stop giving answers, but you have not even found a triple such as _:1 rdf:type rdf:XMLLiteral with _:1 assigned to a malformed XML literal? How can that happen? Do you not apply all (RDFS) rules because you know which ones do not matter for the query? Do you not apply all (RDFS) rules because you in general choose to support only a subset of them?

Cheers,
Birte

[1] http://www.w3.org/2007/OWL/wiki/Conformance#Entailment_Checker

--
Dr. Birte Glimm, Room 306
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283529
Received on Tuesday, 29 September 2009 19:19:14 UTC