RE: ISSUE-3: REPORTED: Lack of anonymous individuals from Boris Motik on 2007-11-08 (public-owl-wg@w3.org from November 2007)

From: Boris Motik <boris.motik@comlab.ox.ac.uk>
Date: Thu, 8 Nov 2007 12:13:51 -0000
To: <gstoil@image.ece.ntua.gr>, <public-owl-wg@w3.org>, "'Carsten Lutz'" <clu@tcs.inf.tu-dresden.de>
Message-ID: <000701c82200$d343cf00$2711a8c0@wolf>
Hello,

Here is an explanation how anonymous individuals in ABoxes correspond to conjunctive queries. I will use a "pidgin" LaTeX
first-order logic notation for this. I'll use _:x for anonymous individuals, and I'll use != for inequality (DifferentFrom)
assertions and & for conjunction.


Imagine you have an ABox A containing the following assertions:

(1)  hor(_:1,_:2)
(2)  ver(_:2,_:3)
(3)  ver(_:1,_:4)
(4)  hor(_:4,_:5)
(5)  _:3 != _:5

Under the traditional semantics, anonymous individuals are actually existentially quantified variables. Hence, the ABox A is
actually equivalent to the following first-order formula \varphi:

(6)  \varphi = \exists x,y,z,v,w : [hor(x,y) & ver(y,z) & ver(x,v) & hor(v,w) & w != z]

Clearly, \varphi can be read as a conjunctive query. Now the "ontology entailment" inference means, given two ontologies O1 and O2,
decide whether O2 semantically follows from O. Now clearly, if we have some ontology O1 and want to check whether the above ABox A
follows from O1, this is the same as answering the conjunctive query \varphi over O1.

Please note that checking ontology entailment is a more general problem than conjunctive query answering: if O2 contains more than
just an ABox, we'll need more than just conjunctive query entailment. 


Let me now also explain why answering conjunctive queries such as \varphi is a problem.

By the definition of semantic entailment, \varphi follows from O1 if A is true in all models of O1. But this is the case if O1 and
the negation of A are unsatisfiable. The negation of A (or \varphi) is the following first-order formula:

(7)  \forall x,y,z,v,w : [hor(x,y) & ver(y,z) & ver(x,v) & hor(v,w) ] -> w = z

This formula is dangerous: it basically says that whenever a model of O1 contains objects arranged as in the following figure, then
the objects 3 and 5 must be the same.

 1 -- hor --> 2
 |            |
 |            |
 |           ver
ver           |
 |            |
 |            V
 |            3
 V             
 4 -- hor --> 5

In other words, the formula (7) allows you to create a "square" in a model of O1 and (7). Furthermore, O1 can then be used to create
an infinite number of squares, which you can then use to encode a range of undecidable problems. In other words, (7) allows you to
break the tree-model property of DLs, which is mainly responsible for achieving decidability.




Now if you interpret anonymous individuals as Skolem constants, then things change. The ABox A is then just a shortcut for the
following ABox A':

(8)  hor(a,b)
(9)  ver(b,c)
(10) ver(a,d)
(11) hor(d,e)
(12) c != e

The individuals a--e are now just constants and not variables. They are "invented" constants (i.e., they don't occur in the original
ABox), but they are still constants.

Checking whether some ontology O1 entails A' is again equivalent to conjunctive query answering; however, in this case, the query
does not contain variables! This is a *much* easier problem that we know how to decide.




Let me give you some idea what kind of practical consequences this would have in practice. Let O1 be an ontology whose ABox contains
unnamed individuals. Furthermore, let O2 be an ontology for which we want to check whether O1 entails O2.

As long as O2 does not contain anonymous individuals, you can't tell what semantics are we using for anonymous individuals in O1
(the "true" one of the approximative one): if O2 is entailed by O1 under one semantics, it will be entailed under the other
semantics as well, and vice versa. The difference between the two semantics becomes important when O2 (the ontology whose entailment
you are checking) contains anonymous individuals.


Let me give you a concrete example. Assume that O1 contains the following ABox assertion:

(13) hasParent(Bob,Mary)

As long as O2 contains named individuals (Bob, Mary, and so on), you will get exactly the same answers. Now let O2 be an ontology
containing the following ABox assertion:

(14) hasParent(Bob,_:1)

Here the difference becomes important. Under the standard "true" semantics, O2 follows from O1. This is because, in first-order
logic, hasParent(Bob,Mary) entails \exists x : hasParent(Bob,x).

Under the approximative semantics, O2 *does not* follow from O1. This is because (14) is actually equivalent to the following ABox
O2':

(14) hasParent(Bob,some-invented-constant)

Now in first-order logic, it is not the case that hasParent(Bob,Mary) entails hasParent(Bob,some-invented-constant), so O1 does not
entail O2.




The last example might suggest that the approximative semantics might be undesirable. I would just like to point out, however, that
the anonymous individuals in the ABox make ABoxes a kind of a query language. It might be cleaner not to mix the two roles of an
ABox. An ABox might be thought of as just the data; for querying, we might use languages such as SPARQL, which would then be
designed appropriately such that we get everything we want (such as the correct answer to (14)) without the drawbacks. For example,
if we modify SPARQL to disallow inequalities in the query, we can be decidable for SHIQ.

Regards,

	Boris 

> -----Original Message-----
> From: gstoil@image.ece.ntua.gr [mailto:gstoil@image.ece.ntua.gr]
> Sent: 08 November 2007 11:13
> To: Boris Motik; public-owl-wg@w3.org; Carsten Lutz
> Subject: Re: ISSUE-3: REPORTED: Lack of anonymous individuals
> 
> Boris Motik <boris.motik@comlab.ox.ac.uk> said:
> 
> >
> > Hello,
> >
> > The OWL 1.1 Member Submission does not contain anonymous individuals for the
> reasons I explain below. These reasons are related to
> > ISSUE-46: Unnamed Individual Restrictions
> (http://www.w3.org/2007/OWL/tracker/issues/46). It might make sense to discuss
> both issues
> > together.
> >
> > In short, we did not include the anonymous individuals into the Member
> Submission because they significantly affect computational
> > aspects of the logic (explained under item 1 below). Furthermore, anonymous
> individuals are usually used in practice with a weaker
> > semantics (explained under item 2 below). Therefore, we did not introduce
> anonymous individuals in the Member Submission and wanted
> > to discuss this in the working group.
> >
> >
> >
> > 1. Why can nontree-like "true" anonymous individuals be dangerous?
> >
> > Nontree-like "true" anonymous individuals in the ABox cause undecidability
> of ontology entailment, which is the basic inference
> > problem for OWL. An ABox containing anonymous individuals can actually be
> understood as a conjunctive query. It is well known that
> 
> Hi Boris,
> 
> Is it a conjunctive query or a union of conjunctive queries?
> 
> BTW, can you explain more how you can view anonymous individuals as CQs?
> 
> Best,
> G. Stoilos
> 
> > answering conjunctive queries over SHOIQ TBoxes is undecidable if you allow
> the query to contain inequalities. Details can be found
> > in the following paper, but I can give further explanation if needed.
> >
> >     Diego Calvanese, Giuseppe De Giacomo, and Maurizio Lenzerini:
> >     Conjunctive query containment and answering under description logics
> constraints,
> >     ACM Trans. on Computational Logic, 2007. To appear.
> >
> > Finally, DifferentIndividuals ABox assertions are actually inequalities. To
> summarize, if you allow nontree-like anonymous
> > individuals in DifferentIndividuals ABox assertions, you easily get
> undecidability of the basic reasoning problem.
> >
> >
> > Even if you were to forbid arbitrary anonymous individuals in the
> DifferentIndividuals assertions, ontology entailment would still
> > require answering conjunctive queries over DL TBoxes. Currently, we only
> know that this problem is decidable for SHIQ; however, it
> > is not clear whether this is the case for SHOIQ as well.
> >
> >
> > The bottom line is that nontree-like "true" anonymous individuals are
> computationally hard, so it might be prudent to avoid them.
> >
> >
> >
> > 2. We could interpret anonymous individuals as Skolem constants
> >
> > In practice, the semantics of anonymous individuals as true existentially
> quantified variables is rarely needed. Usually, anonymous
> > individuals are used just as a convenience, saving the ontology modeler from
> the trouble of inventing a name for the individual. I
> > am not aware of any practical system (OWL or RDF) that implements the
> semantics of anonymous individuals as true existentially
> > quantified variables.
> >
> > Therefore, rather than introducing "true" anonymous individuals, we might
> simply interpret them as Skolem constants. In this case,
> > we do not need the restriction to tree-like connections, and we could indeed
> process a larger fragment of RDF data.
> >
> > Here is a concrete proposal how to reflect this in the specification documents:
> >
> > - The structural specification would be changed to imbue the Individual
> class with an "anonymous" flag. This might be useful for the
> > presentation of an ontology.
> >
> > - The semantics document would make no special provisions for the anonymous
> individuals. Thus, individuals with the "anonymous" flag
> > set would be interpreted as all other individuals.
> >
> > - The mappings to the XML and RDF syntaxes would be extended to update the
> "anonymous" flag correctly during parsing.
> >
> > - The semantics of ontology entailment would not change.
> >
> >
> >
> > Regards,
> >
> > 	Boris
> >
> >
> 
> 
> 
> --
> 
>
Received on Thursday, 8 November 2007 12:14:54 UTC