Re: Performance issues with OWL Reasoners => subclass vs instance-of from William Bug on 2006-09-15 (public-semweb-lifesci@w3.org from September 2006)

From: William Bug <William.Bug@DrexelMed.edu>
Date: Fri, 15 Sep 2006 09:03:29 -0400
To: Phillip Lord <phillip.lord@newcastle.ac.uk>
Cc: "Kashyap, Vipul" <VKASHYAP1@PARTNERS.ORG>, "chris mungall" <cjm@fruitfly.org>, <semantic-web@w3.org>, <public-semweb-lifesci@w3.org>
Message-Id: <3B6D573E-F30C-4204-BDD1-8AD136C0FAA6@DrexelMed.edu>
Hi All,

Just as a clarification for the less informed - myself included -  
we're discussing the subtle and extremely difficult aspects of  
creating knowledge maps/annotation repositories/KBs/KR repositories  
(what have you) ultimately capable of supporting reasoning (simple  
classification through more complex reasoning) for both UNIVERSALS  
and INSTANCES.


Some DEFINITIONS:

CLASSes represent UNIVERSALs or TYPEs.  The TBox is the set of  
CLASSes and the ASSERTIONs associated with CLASSes.

INSTANCEs represent EXISTENTIALs or INDIVIDUALs instantiating a CLASS  
in the real world.  The ABox is the set of INSTANCEs and the  
ASSERTIONs associated with those INSTANCEs.

Properly specified CLASSes are defined in the context of the  
INSTANCEs whose PROPERTIES and RELATIONs they formally represent.

Properly specified INSTANCEs are defined via their reference to an  
appropriate set of CLASSes.

Reasoners (RacerPro, Pellet, FaCT++) generally have optimizations
specific to either reasoning on the TBox or reasoning on the ABox,
but it's difficult (i.e., there are no existing examples that experts
such as Phil and others can cite) to optimize for all three: reasoning
on the TBox, on the ABox AND - most importantly - on TBox + ABox
(across these sets).
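
For anyone who finds the TBox/ABox split easier to see in code, here is a
toy sketch of the three kinds of question mentioned above: TBox only, ABox
only, and TBox + ABox.  It is plain Python with dicts and sets, not OWL and
not a DL reasoner.  The class names, the individual names, and the helper
functions (superclasses, asserted_types, instances_of) are all made up for
illustration, and real reasoners handle far richer assertions than simple
subclass links.

```python
# Toy model only: plain dicts and sets, not OWL and not a DL reasoner.
# Class and individual names are illustrative, not taken from a real ontology.

# TBox: classes and the subclass assertions among them
# (class -> its direct superclasses).
TBOX = {
    "Protein": set(),
    "IntermediateFilamentProtein": {"Protein"},
    "GFAP": {"IntermediateFilamentProtein"},
}

# ABox: individuals and the class-membership assertions about them.
ABOX = {
    "gfap_in_sample_1": {"GFAP"},
    "protein_in_sample_2": {"Protein"},
}


def superclasses(cls):
    """TBox only: every class that subsumes cls, including cls itself."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(TBOX[current])
    return seen


def asserted_types(individual):
    """ABox only: the classes directly asserted for an individual."""
    return set(ABOX[individual])


def instances_of(cls):
    """TBox + ABox: individuals whose asserted types are subsumed by cls."""
    return {
        ind
        for ind, types in ABOX.items()
        if any(cls in superclasses(t) for t in types)
    }


print(sorted(superclasses("GFAP")))
print(sorted(asserted_types("gfap_in_sample_1")))
print(sorted(instances_of("Protein")))
```

The last query is the one that crosses the border: it needs the subclass
closure from the TBox for every individual in the ABox, which hints at why
the combined case is the expensive one once the ABox gets large.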


All of us trying to apply ontology-based formalisms to create
machine-parsable representations of real world biomedical continuants
and occurrents have banged our heads bloody against this
UNIVERSAL-EXISTENTIAL border.  Even determining which of the many
biomedical informatics resources to employ when you seek to reference
relevant UNIVERSALs can be a very difficult task.  We're in the midst
of an extended debate within the BIRN Ontology Task Force on how best
to do this for proteins relevant to cross-species representation of
neurodegenerative disease, such as Glial Fibrillary Acidic Protein
(GFAP).

I strongly encourage the experts to please clarify, embellish, or
correct the above definitions as they see fit for the edification of
all of us disciples.  :-)

Cheers,
Bill


On Sep 15, 2006, at 8:30 AM, Phillip Lord wrote:

>
>>>>>> "KV" == Kashyap, Vipul <VKASHYAP1@PARTNERS.ORG> writes:
>
>   KV> Obviously, if mapping into instances gives better performance
>   KV> for a given set of inferences, that might be the basis of
>   KV> choosing the instance-of relationship.  Towards this end I have
>   KV> the following questions for Phil:
>
>   KV> 1. What is the set of ABox inferences implemented in the GO
>   KV>    example?
>
> In that example, there aren't any. At that stage, the instance store
> was not doing ABox reasoning at all, just TBox, made to look like
> ABox.
>
> The system is richer now, and you can express some relationship
> between individuals in the ABox (as well as any expressivity you like
> in the TBox). But, I don't have details, I am afraid.
>
>
>   KV> 2. What would be the corresponding set of TBox inferences
>   KV>    implemented if the
>   KV> design choice proposed by Chris was adopted, i.e., p53 is a
>   KV> subclass of Gene (assuming a general "Gene" class)
>
> I am presuming by "set of inferences" you mean, what can you express?
> The TBox supports OWL-DL in full. Actually, as the InstanceStore punts
> much of the work to the reasoner, this is constrained by the reasoner,
> not the InstanceStore per se. So it does whatever your reasoner does.
>
>
>
>   KV> 3. What are the performance and scalability implications of (1)
>   KV>    vs (2)
>
> ABox reasoning is harder than TBox. As is the way with DL, exactly
> what the implications are, depends on exactly what you express and I
> am not really an expert.
>
>
>   KV> 4. What are the expressiveness implications of (1) vs (2), i.e.,
>   KV>    can we express
>   KV> some statements using subclass-of based modeling which are not
>   KV> possible using instance-of modeling; or vice versa....
>
>   KV> Look forward to a good use case illustrating the above and
>   KV> discussing its possible consequences.
>
>
> The limitation is that if your entities are in the ABox in this
> case, there are a very limited number of things that you can say about
> their relationships to other entities in the ABox, although you have
> the full expressivity of OWL to relate them to the TBox. The flip side
> is that if you put everything into the TBox, then you get nothing from
> the relational backend of the InstanceStore. In the GO example, for
> instance, you could put all the associations into a reasoner, modelled
> as OWL classes, but the reasoner will probably not scale to 6 million
> instances.
>
> Separating entities into ABox and TBox depending on how many of them
> there are is, of course, unsatisfying from an ontological perspective,
> but as you are asking about scalability of computational reasoning I
> don't think you have any choice but to be pragmatic.
>
> Phil
>

Bill Bug
Senior Research Analyst/Ontological Engineer

Laboratory for Bioimaging  & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA    19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)


Please Note: I now have a new email - William.Bug@DrexelMed.edu







Received on Friday, 15 September 2006 13:04:08 UTC