Re: Demo SPARQL notes from Bijan Parsia on 2007-04-18 (public-semweb-lifesci@w3.org from April 2007)

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Wed, 18 Apr 2007 12:07:49 +0100
To: samwald@gmx.at
Cc: public-semweb-lifesci@w3.org
Message-Id: <CDFE641F-74FD-4C5F-A2C1-BAABC5F7C42D@cs.man.ac.uk>
On Apr 18, 2007, at 2:53 AM, samwald@gmx.at wrote:

>> I think *if the ontology classifies reasonably at all*, then this
>> sort of query approach can achieve reasonable performance for this
>> rough application profile with a reasonable amount of engineering
>> effort in many cases.
>
> Oh, but this is quite an important
> We can expect that most of the ontologies that are based on 'real  
> data' are inconsistent, if not even highly inconsistent -- not  
> because of errors on the side of the ontology designers, but  
> because the represented information is contradictory.

There are two distinct issues: Getting sensible answers out of an  
inconsistent ontology, and being able to process the ontology.

I've done work (extensions of the debugging work; plus there's loads  
of literature on handling inconsistency) on getting sensible answers  
from inconsistent ontologies. But inconsistent ontologies vary
enormously in the *difficulty* of reasoning over them (just as  
consistent ontologies do).

> For example, we have found some inconsistency in one of our  
> SenseLab OWL versions that was caused by the fact that the results  
> of two experiments that were entered into the knowledge base were  
> contradictory. Of course, this is a good example for the utility of  
> an OWL reasoner, because it pointed us to a (potentially  
> interesting or important) contradiction in the literature.
>
> However, such contradictions could lead a reasoning-based approach  
> to querying fail, or at least they can make them less performant,  
> as you said.

I didn't say that :)

However, yes, paraconsistent reasoning (i.e., sensible reasoning) is  
often harder or rather *not done*. One aspect of the hardness can be
non-computational: you have to decide *what entailments count and
how*. Other aspects are computational: you may want all possible
"sensible" entailments, even contradictory ones, and also want all
the explanations of each entailment.

(Consider that when you repair an inconsistent ontology you generally  
have a choice of axioms to delete. Each alternative choice might have  
distinct entailments (in Swoop's repair tool, the impact analyzer
will tell you what's lost with each distinct repair plan).
Traditionally, a paraconsistent consequence relation can be
skeptical, i.e., you take the intersection of the entailments of every
repair, or credulous, i.e., you take *one* of (or the union of) the
sets of consequences of the repairs.)
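
A minimal sketch of the skeptical/credulous distinction, under loud assumptions: the "ontology" here is a toy set of Horn-style rules over string atoms (with "~x" as the negation of "x"), a "repair" is a maximal consistent subset found by brute force, and the names (closure, repairs, skeptical, credulous) are made up for illustration. This is not how Swoop or Pellet does it.

```python
from itertools import combinations

def closure(axioms):
    """Forward-chain (premises, conclusion) rules to a fixpoint."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for premises, conclusion in axioms:
            if conclusion not in derived and set(premises) <= derived:
                derived.add(conclusion)
                changed = True
    return derived

def consistent(axioms):
    """Toy notion of consistency: no atom derived together with its negation."""
    facts = closure(axioms)
    return not any(("~" + f) in facts for f in facts)

def repairs(axioms):
    """Maximal consistent subsets, by brute force (toy sizes only)."""
    axioms = list(axioms)
    found = []
    for size in range(len(axioms), -1, -1):  # largest subsets first
        for subset in combinations(axioms, size):
            s = frozenset(subset)
            if consistent(s) and not any(s < r for r in found):
                found.append(s)
    return found

def skeptical(axioms):
    """Entailments that hold in every repair."""
    return set.intersection(*(closure(r) for r in repairs(axioms)))

def credulous(axioms):
    """Entailments that hold in at least one repair (the union variant)."""
    return set.union(*(closure(r) for r in repairs(axioms)))

# Hypothetical knowledge base: two experiments report contradictory results.
KB = [
    ((), "neuron"),
    (("neuron",), "cell"),
    ((), "expresses_X"),    # experiment 1
    ((), "~expresses_X"),   # experiment 2
]

print(sorted(skeptical(KB)))  # ['cell', 'neuron']
print(sorted(credulous(KB)))  # ['cell', 'expresses_X', 'neuron', '~expresses_X']
```

The skeptical answer keeps only what survives every way of resolving the contradiction; the credulous union keeps both experimental claims, which is where per-entailment explanations start to matter.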

For example, in many description logics, there's a pretty strong  
separation between TBox and ABox such that it's nearly impossible (or  
actually impossible) to make the ontology *inconsistent* with TBox  
statements alone. You can, however, make classes *unsatisfiable* and  
still derive plenty of sensible subsumption. If, as is often the  
case, there is a side informal constraint/expectation that every  
class has at least one instance, then this is a trick to allow a  
limited form of paraconsistent reasoning, one that most of us use all
the time.
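
To make the unsatisfiable-versus-inconsistent distinction concrete, here is a rough sketch in a deliberately tiny fragment (atomic subclass axioms plus pairwise disjointness, nothing like full OWL). The class names and helper functions are hypothetical. In this fragment the TBox alone can make a class unsatisfiable, but only an ABox assertion can make the whole thing inconsistent.

```python
def superclasses(sub_axioms, cls):
    """All classes reachable from cls via (sub, sup) axioms, including cls."""
    seen, stack = {cls}, [cls]
    while stack:
        cur = stack.pop()
        for sub, sup in sub_axioms:
            if sub == cur and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen

def unsatisfiable(sub_axioms, disjoint, cls):
    """A class is unsatisfiable here if it falls under two disjoint classes."""
    sups = superclasses(sub_axioms, cls)
    return any(a in sups and b in sups for a, b in disjoint)

def inconsistent(sub_axioms, disjoint, abox):
    """Inconsistent only if some individual is typed into two disjoint classes."""
    types = {}
    for individual, cls in abox:
        types.setdefault(individual, set()).update(superclasses(sub_axioms, cls))
    return any(a in t and b in t for t in types.values() for a, b in disjoint)

# Hypothetical TBox: C is (wrongly) placed under two disjoint classes.
SUB = [("C", "A"), ("C", "B"), ("A", "Top1"), ("D", "C")]
DISJ = [("A", "B")]

print(unsatisfiable(SUB, DISJ, "C"))         # True
print(inconsistent(SUB, DISJ, []))           # False: TBox alone stays consistent
print("Top1" in superclasses(SUB, "D"))      # True: subsumptions still derivable
print(inconsistent(SUB, DISJ, [("c1", "C")]))  # True once C gets an instance
```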

Our debugging work shows that for reasonable ontologies (i.e.,  
ontologies we can process at all) we can generally do quite a good  
job of finding explanations (and even do this in a reasoner  
independent way). There are pitfalls, but I think this is not an  
insane statement.
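
One way to read "reasoner independent" is that the reasoner is used purely as a yes/no oracle. The sketch below shrinks an axiom set to one minimal subset that still yields a given entailment by dropping axioms one at a time. It is a generic contraction loop, not necessarily the algorithm from our debugging work, and the toy subsumption oracle and all names are assumptions for illustration.

```python
def one_justification(axioms, entails):
    """Return a minimal subset of axioms for which entails(subset) holds.

    `entails` is any black-box, monotonic check (e.g. a call to a reasoner).
    Returns None if the full set does not yield the entailment.
    """
    kept = list(axioms)
    if not entails(kept):
        return None
    for ax in list(kept):
        trial = [a for a in kept if a != ax]
        if entails(trial):   # entailment survives without ax, so drop it
            kept = trial
    return kept

def subsumed(axioms, sub, sup):
    """Toy oracle: is sup reachable from sub via (sub, sup) axioms?"""
    reach, frontier = {sub}, [sub]
    while frontier:
        cur = frontier.pop()
        for a, b in axioms:
            if a == cur and b not in reach:
                reach.add(b)
                frontier.append(b)
    return sup in reach

# Hypothetical axioms; only the first two matter for A being under C.
AXIOMS = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "E")]

just = one_justification(AXIOMS, lambda axs: subsumed(axs, "A", "C"))
print(just)  # [('A', 'B'), ('B', 'C')]
```

This costs one oracle call per axiom and finds a single explanation; finding all of them needs more machinery on top.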

Thus, it is not hopeless *per se* to meet your application demands as  
thus far described to me. (If your ontology makes use of inverses,  
for example, the story may or may not be different.)

For example, C&P have been working on explanation support in Pellet  
(and reasoner independently in the OWL API) for NCI with the current  
thesaurus as the scaling target. Now this is very large: >40,000
classes and >500,000 triples (probably much more than that). However,
it's pretty clear that explanation generation and incremental or
near-real-time reasoning are feasible. But then, the thesaurus is
currently very
simple, so basic reasoning is pretty easy (though we had to do work  
to make this so in Pellet):
	<http://clarkparsia.com/weblog/2007/04/10/pellet-classification-improvements/>

If you use inverses, then the game gets more difficult.

My takeaway point is that it's easy to make things fail with scale
and complexity out of the box, but that sometimes fairly  
straightforward engineering across the board can make it work out  
fine. New research (e.g., modularity, reasoning techniques, etc.) is  
coming down the pike all the time which also keeps me optimistic ;)

Cheers,
Bijan.
Received on Wednesday, 18 April 2007 11:08:10 UTC