Re: Proposed change to the OWL-2 Direct Semantics entailment regime from Bijan Parsia on 2010-12-21 (public-rdf-dawg@w3.org from October to December 2010)

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Tue, 21 Dec 2010 15:04:08 +0000
To: lenzerini@dis.uniroma1.it
Cc: Enrico Franconi <franconi@inf.unibz.it>, SPARQL Working Group <public-rdf-dawg@w3.org>
Message-Id: <A80E5226-1C60-4E4E-886A-214106E3C7DD@cs.man.ac.uk>

On 21 Dec 2010, at 08:11, Maurizio Lenzerini wrote:

> Hi Bijan,
> 
> thank you for your message.

Thanks for the reply.

> On 12/20/10 6:30 PM, Bijan Parsia wrote:
>> Hi Maurizio,
>> 
>> Thanks for the message!
>> 
>> On 20 Dec 2010, at 09:48, Enrico Franconi wrote:
>> 
>>> I forward this message I received from Maurizio Lenzerini.
>>> 
>>> Begin forwarded message:
>> [snip]
>>>> In all the applications mentioned above, there is a strong need to answer queries with non-distinguished variables. Just to name one interesting scenario where missing non-distinguished variables would be a real problem, consider checking the quality/completeness of data.
>>>> 
>>>> The query:
>>>> 
>>>> { x,z | R1(x,y), R2(y,z) }
>>>> 
>>>> tells me which x and z are connected through y, without necessarily knowing who is the y. On the other hand, the query
>>>> 
>>>> { x,y,z | R1(x,y), R2(y,z) }
>>>> 
>>>> tells me for which x,z I KNOW the y.
>> 
>> 
>> As an example, this isn't really very informative. It's a fake toy example which merely illustrates the difference between the two.
> 
> It is not a fake toy example.

Sorry, that came out a bit wrong. In the form you presented, it merely demonstrated an abstract capability rather than an in situ use case. We have tons of abstract examples.

> It is one of the examples showing why the query language should allow pure existential variables in the query.

It can, in principle, show a difference in expressivity (though we need to see the data). What it doesn't really help show is use in the field, or necessity.

> Say that a customer C is monitored by A (an authority) if it belongs to a group that is monitored by A. I want to know which customers are monitored by A. The right query is:
> 
> QUERY 1: { x,z | Belongs(x,y), GroupMonitoredBy(y,z) }
> 
> Assume that the result of the query is (C,A): this means that I know that customer C is monitored by authority A, *even if I do NOT know which is the group monitored by A to which C belongs*.

I understand. But for this to have an answer, the KB must entail that answer. For this to be a true case of non-distinguished variables, it must do so without a binding for y (otherwise, it's just projection). For this to be compelling as a use case, it has to be reasonably prevalent, useful, usable, implementable, without good workarounds, etc. It's those latter points we're trying to determine.

Is it possible for a standard OWL QL data set to have an answer to this query without a binding for y? Is it common?

> Now, suppose that the result of the following query:
> 
> QUERY 2: { x,y,z | Belongs(x,y), GroupMonitoredBy(y,z) }
> 
> does not include the pair (C,A). Now I KNOW that my data is incomplete! And I know it *only because I can compare the result of query 1 with the result of query 2*!

Ok, thanks. I understand. The example is still rather incomplete without the TBox at least (so you're measuring whether the data are complete wrt a schema). This is somewhat specialized, yes? I.e., it's not straightforward query answering but an analytical task?
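
To make the comparison concrete, here is a minimal sketch of the completeness check as I read it. It sidesteps the TBox entirely, which is an assumption on my part: instead of deriving the unknown group from a schema, the toy data simply contains an anonymous individual (written "_:g") standing for "some group". The individuals D and G1, the helper names, and the rule that distinguished variables may only bind named individuals are all illustrative assumptions, not something taken from your message or from any spec.

```python
# Toy data. "_:g" is a hypothetical anonymous individual: we know that
# some group exists, but not which named group it is.
BELONGS = {("C", "_:g"), ("D", "G1")}
GROUP_MONITORED_BY = {("_:g", "A"), ("G1", "A")}


def is_named(term):
    """Assumption: anonymous individuals are marked with a "_:" prefix."""
    return not term.startswith("_:")


def query1():
    """{ x,z | Belongs(x,y), GroupMonitoredBy(y,z) }

    y is non-distinguished, so it may be matched by an anonymous individual.
    """
    return {
        (x, z)
        for (x, y1) in BELONGS
        for (y2, z) in GROUP_MONITORED_BY
        if y1 == y2 and is_named(x) and is_named(z)
    }


def query2():
    """{ x,y,z | Belongs(x,y), GroupMonitoredBy(y,z) }

    y is distinguished, so in this sketch it must bind a named individual.
    """
    return {
        (x, y1, z)
        for (x, y1) in BELONGS
        for (y2, z) in GROUP_MONITORED_BY
        if y1 == y2 and is_named(x) and is_named(y1) and is_named(z)
    }


def incomplete_pairs():
    """Pairs known to be connected (query 1) whose group is unknown (query 2)."""
    known = {(x, z) for (x, _y, z) in query2()}
    return query1() - known


print(sorted(query1()))            # [('C', 'A'), ('D', 'A')]
print(sorted(query2()))            # [('D', 'G1', 'A')]
print(sorted(incomplete_pairs()))  # [('C', 'A')]
```

On this data, (C,A) shows up only in query 1, which is the incompleteness signal you describe. Whether a realistic OWL QL TBox plus ABox gives rise to such pairs is exactly what I'm asking about.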

> Indeed, without the answer to QUERY 1, why should I conclude that my data are incomplete? It is because I know that the pair (C,A) is in the result of QUERY 1 (meaning that there is SOME group G representing the "bridge" between C and A), and not in the result of QUERY 2 (meaning that I do not know such a group G) that I have the proof that data are incomplete!

I understand, thanks!

Cheers,
Bijan.

Received on Tuesday, 21 December 2010 15:05:05 UTC