[OK?] (entailment, soundness, completeness) from Sergio Tessaris on 2006-03-21 (public-rdf-dawg-comments@w3.org from March 2006)

From: Sergio Tessaris <tessaris@inf.unibz.it>
Date: Tue, 21 Mar 2006 11:11:57 +0100
To: Ian Horrocks <horrocks@cs.man.ac.uk>
Cc: Dan Connolly <connolly@w3.org>, public-rdf-dawg-comments@w3.org
Message-Id: <9273FDA2-A4E1-495B-AE07-185A3C85340C@inf.unibz.it>
---
Please respond indicating whether you are or are not satisfied with
this response. If you are, you can help our issue tracking system by
prefixing the subject of your response with [CLOSED] (where this
subject has [OK?]).
---

In the new version of SPARQL document <http://www.w3.org/TR/rdf- 
sparql-query> query answers are defined in terms of entailment. More  
precisely, in <http://www.w3.org/TR/rdf-sparql-query/ 
#BasicGraphPatternMatching>, by means of the definition of  
"entailment regime" different entailments based on RDF(S) and more  
can be adopted. This should enable SPARQL to scale-up w.r.t. more  
expressive representational languages.

Moreover, the concept of "Scoping Set" allows a SPARQL server to  
restrict the kind of RDF terms to be bind to distinguished variables.  
In particular, bnodes can be prevented to appear in answer sets.

I hope that these changes to the SPARQL document answer to the  
comments expressed in this message.

Cheers,
--sergio

On 11 Sep 2005, at 23:34, Ian Horrocks wrote:

>
> Oops - pressed send button before I had finished writing - sorry.  
> I'll try again...
>
> On 9 Sep 2005, at 22:59, Dan Connolly wrote:
>
>> On Fri, 2005-09-09 at 15:39 -0500, Dan Connolly wrote:
>>> Copying this to public-rdf-dawg comments for tracking purposes.
>>> Follow-up discussion should happen there...
>>>
>>> email message attachment, "Forwarded message - Comments on SPARQL"
>>> On Fri, 2005-09-09 at 15:39 -0500, Dan Connolly wrote:
>>>> Two comments/questions:
>>>>
>>>> Firstly, I strongly support the suggestion to define query  
>>>> answers in
>>>> terms of entailment rather than subgraph. Using an entailment based
>>>> definition has numerous advantages, and no disadvantages that I can
>>>> see.
>>>>
>>>> Entailment based definition:
>>>>
>>>> - is widely used and very well understood
>>>> - is concise, clear and unambiguous
>>>> - builds on existing semantic definitions
>>>> - can deal with a wide variety of languages, including RDF,  
>>>> RDFS, OWL,
>>>> SWRL and FOL, simply by referring to the existing entailment  
>>>> semantics
>>>> for the relevant language
>>
>> I'm not clear on what you mean by that. Do you mean that there's one
>> definition of entailment that SPARQL can use that simultaneously
>> accommodates RDF, RDF, OWL, SWRL, and FOL?
>
> No, not simultaneously. But if query answers are defined in terms  
> of bindings that cause the query to be entailed in all models, then  
> one can defer to the model theory for the semantic conditions that  
> apply to models. (For languages with different underlying model  
> structures, it may also be necessary to adapt the definition of a  
> binding, but this should be *much* less work that developing an  
> independent (of the model theory) definition of query answering.)
>
>>
>> I understand that each of those languages has a clear and unambiguous
>> definition of entailment, but I don't understand how to apply all of
>> them to SPARQL.
>
> Querying a given language simply requires deferring to the relevant  
> model theory (and possibly adapting the definition of a binding for  
> languages that don't share the same model structure as RDF). More  
> powerful languages allow more complex conditions to be imposed;  
> more models can thus be ruled out, and so more answers may be  
> inferred.
>
>>
>> Are you suggesting that SPARQL should have some parameterized version
>> of entailment? I can imagine how that might work, though I'm not
>> aware of much implementation experience with designs like that.
>
> I only intended to say that, by using an entailment based  
> definition of query answers, it would be *much* easier to extend  
> SPARQL to deal with other languages, perhaps requiring little more  
> than referring to the relevant MT.
>
>
>>
>> Bijan provided a quick review of various relevant implementations...
>>
>> OWL editors and their underlying toolkits (triples or other)
>> http://swig.xmlhack.com/2005/09/08/2005-09-08.html#1126195560.647872
>>
>> I have only begun to look into those.
>>
>> If you have any particular designs to recommend, please let us know.
>>
>>
>>>> Subgraph based definition:
>>>>
>>>> - is not widely used, not well understood, and not *known* to  
>>>> work at
>>>> all (at least not for more expressive languages)
>>>> - is verbose, obfuscated (it seems capable of confusing even expert
>>>> logicians), and may even be ambiguous
>>>> - ignores existing semantic definitions
>>>> - cannot easily deal with more expressive languages, and may  
>>>> even be
>>>> incapable of doing so
>>>>
>>>> Can someone please explain to me why the subgraph based  
>>>> definition has
>>>> been preferred?
>>
>> The WG as a whole hasn't expressed a preference directly, but in
>> drafting the definitions and considering simple test cases, the
>> details of subgraph seemed to work out and the details of entailment
>> seemed not to. For example, here's part of one message from 09 Jun  
>> 2005
>>
>>
>> [[[
>> The difference is observable from an approved*** test
>>   http://www.w3.org/2001/sw/DataAccess/tests/#dawg-triple-pattern-001
>>
>> input:
>>  :x :p :v1 .
>>  :x :p :v2 .
>>
>> query:
>>   SELECT *
>> WHERE { :x ?p ?q . }
>>
>> By the simple-entailment definition, there are solutions that bind
>> ?p to _:foo, but there are no such results in the test results.
>
> I saw in another email that this was a typo and you meant ?q to be  
> bound to _:foo.
>
> I believe that a better solution to these kinds of problem is to  
> add some kind of minimality condition as suggested by Enrico (and  
> as used in OQL), i.e., the answer set should not contain a tuple  
> that is less specific than (entailed by) another tuple in the  
> answer set.
>
> Actually, the idea of returning blank nodes in query answers seems  
> a little strange to me - better to use non-distinguished variables  
> (variables that are not necessarily bound to named elements of a  
> model) and perform some kind of "outer union" if combining results  
> with different numbers of distinguished variables. I believe that  
> this approach would also be more easily extensible (i.e., to  more  
> expressive languages).
>
> Ian
>
>
>> I suppose it's possible that the spec could prune the results
>> down to the ones in the test suite some other way, but I can't
>> think of any other straightforward way just now.
>> ]]]
>>  -- Re: Restructure definition of Basic Graph Pattern and pattern  
>> match (sec 2.4)
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2005AprJun/0359
>
>
>
>
>
>>
>>
>>
>>>> Secondly, IMHO, the minimum requirement for a query language  
>>>> standard
>>>> is that, independently of the features and capabilities of any  
>>>> given
>>>> implementation, it should define what would constitute a sound and
>>>> complete answer for a given language, dataset and query. It is  
>>>> not at
>>>> all clear that, in its current form, SPARQL satisfies this  
>>>> requirement.
>>>>
>>>> Am I missing something?
>>
>> I don't believe you are missing anything; the WG has not looked
>> very deeply at such a requirement. Yours is one of several
>> recent comment that asks that we do so.
>>
>> Due to the volume of comments received, it may be some weeks before
>> we have a reply in substance. Please stand by.
>>
>> -- 
>> Dan Connolly, W3C http://www.w3.org/People/Connolly/
>> D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E
>>
>
Received on Tuesday, 21 March 2006 10:12:05 UTC