- From: William Bug <William.Bug@DrexelMed.edu>
- Date: Fri, 15 Sep 2006 09:03:29 -0400
- To: Phillip Lord <phillip.lord@newcastle.ac.uk>
- Cc: "Kashyap, Vipul" <VKASHYAP1@PARTNERS.ORG>, "chris mungall" <cjm@fruitfly.org>, <semantic-web@w3.org>, <public-semweb-lifesci@w3.org>
- Message-Id: <3B6D573E-F30C-4204-BDD1-8AD136C0FAA6@DrexelMed.edu>
Hi All,

Just as a clarification for the less informed - myself included - we're discussing the subtle and extremely difficult aspects of creating knowledge maps/annotation repositories/KBs/KR repositories (what have you) ultimately capable of supporting reasoning (from simple classification through to more complex reasoning) for both UNIVERSALS and INSTANCES.

Some DEFINITIONS:

CLASSes represent UNIVERSALs or TYPEs. The TBox is the set of CLASSes and the ASSERTIONs associated with CLASSes.

INSTANCEs represent EXISTENTIALs or INDIVIDUALs instantiating a CLASS in the real world. The ABox is the set of INSTANCEs and the ASSERTIONs associated with those INSTANCEs.

Properly specified CLASSes are defined in the context of the INSTANCEs whose PROPERTIES and RELATIONs they formally represent.

Properly specified INSTANCEs are defined via their reference to an appropriate set of CLASSes.

Reasoners (RacerPro, Pellet, FaCT++) generally have optimizations specific to either reasoning on the TBox or reasoning on the ABox, but it's difficult (i.e., there are no existing examples that experts such as Phil and others can cite) to optimize at once for reasoning on the TBox, on the ABox AND - most importantly - on TBox + ABox (across these sets).

All of us trying to apply ontology-based formalisms to create machine-parsable representations of real world biomedical continuants and occurrents have banged our heads bloody against this UNIVERSAL-EXISTENTIAL border. Even determining which of the many biomedical informatics resources to employ when you seek to reference relevant UNIVERSALs can be a very difficult task. We're in the midst of an extended debate within the BIRN Ontology Task Force on how best to do this for proteins, such as Glial Fibrillary Acidic Protein (GFAP), relevant to cross-species representation of neurodegenerative disease.

I strongly encourage the experts to please clarify, embellish, or correct the above definitions as they see fit for the edification of all of us disciples.
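To make the TBox/ABox split above a little more concrete, here is a minimal Python sketch. It is only a toy under loud assumptions: the `KnowledgeBase` class, its method names, the class names and the individual `gfap_molecule_001` are all hypothetical and invented for illustration, the only TBox assertion it models is subclass-of, and the only ABox assertion is instance-of. It says nothing about how RacerPro, Pellet, FaCT++ or the instance store actually work.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    # TBox: class-level assertions. Here only "child is a subclass of parent".
    tbox: dict[str, set[str]] = field(default_factory=dict)
    # ABox: instance-level assertions. Here only "individual is an instance of class".
    abox: dict[str, set[str]] = field(default_factory=dict)

    def add_subclass(self, child: str, parent: str) -> None:
        self.tbox.setdefault(child, set()).add(parent)
        self.tbox.setdefault(parent, set())

    def add_instance(self, individual: str, cls: str) -> None:
        self.abox.setdefault(individual, set()).add(cls)

    def superclasses(self, cls: str) -> set[str]:
        """TBox-only reasoning: every class that subsumes cls (transitive closure)."""
        seen: set[str] = set()
        stack = [cls]
        while stack:
            for parent in self.tbox.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def types_of(self, individual: str) -> set[str]:
        """TBox + ABox reasoning: asserted types plus all classes subsuming them."""
        inferred: set[str] = set()
        for cls in self.abox.get(individual, ()):
            inferred.add(cls)
            inferred |= self.superclasses(cls)
        return inferred


kb = KnowledgeBase()
# Hypothetical TBox assertions (universals).
kb.add_subclass("GFAP", "IntermediateFilamentProtein")
kb.add_subclass("IntermediateFilamentProtein", "Protein")
# Hypothetical ABox assertion (one individual molecule).
kb.add_instance("gfap_molecule_001", "GFAP")

print(sorted(kb.superclasses("GFAP")))
print(sorted(kb.types_of("gfap_molecule_001")))
```

In this toy, `superclasses` touches only the TBox, while `types_of` has to cross from the ABox into the TBox, which is the "across these sets" case in miniature.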
:-)

Cheers,
Bill

On Sep 15, 2006, at 8:30 AM, Phillip Lord wrote:

> >>>>>> "KV" == Kashyap, Vipul <VKASHYAP1@PARTNERS.ORG> writes:
>
>   KV> Obviously, if mapping into instances gives better performance
>   KV> for a given set of inferences, that might be the basis of
>   KV> choosing the instance-of relationship. Towards this end I have
>   KV> the following questions for Phil:
>
>   KV> 1. What are the set of Abox inferences implemented in the GO
>   KV> example?
>
> In that example, there aren't any. At that stage, the instance store
> was not doing ABox reasoning at all, just TBox, made to look like
> ABox.
>
> The system is richer now, and you can express some relationship
> between individuals in the ABox (as well as any expressivity you like
> in the TBox). But, I don't have details, I am afraid.
>
>
>   KV> 2. What would be the corresponding set of TBox inferences
>   KV> implemented if the
>   KV> design choice proposed by Chris was adopted, i.e., p53 is a
>   KV> subclass of Gene (assuming a general "Gene" class)
>
> I am presuming by "set of inferences" you mean, what can you express?
> The TBox supports OWL-DL in full. Actually, as the InstanceStore punts
> much of the work to the reasoner, without limits this is constrained
> by the reasoner not the instancestore per se. So it does whatever your
> reasoner does.
>
>
>
>   KV> 3. What are the performance and scalability implications of (1)
>   KV> vs (2)
>
> ABox reasoning is harder than TBox. As is the way with DL, exactly
> what the implications are, depends on exactly what you express and I
> am not really an expert.
>
>
>   KV> 4. What are the expressiveness implications of (1) vs (2), i.e.,
>   KV> can we express
>   KV> some statements using subclass-of based modeling which are not
>   KV> possible using instance-of modeling; or vice versa....
>
>   KV> Look forward to a good use case illustrating the above and
>   KV> discussing its possible consequences.
>
>
> The limitation is that if your entities are in the ABox in this
> case, there are a very limited number of things that you can say about
> their relationships to other entities in the ABox, although you have
> the full expressivity of OWL to relate them to the TBox. Flip side, is
> that if you put everything into the TBox, then you get nothing from
> the relational backend of the instancestore. In the GO example, for
> instance, you could put all the associations into a reasoner, modelled
> as OWL classes, but the reasoner will probably not scale to 6 million
> instances.
>
> Separating entities into ABox and TBox depending on how many of them
> there are is, of course, unsatisfying from an ontological perspective,
> but as you are asking about scalability of computational reasoning I
> don't think you have any choice but to be pragmatic.
>
> Phil
>

Bill Bug
Senior Research Analyst/Ontological Engineer

Laboratory for Bioimaging & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA 19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)

Please Note: I now have a new email - William.Bug@DrexelMed.edu

This email and any accompanying attachments are confidential. This information is intended solely for the use of the individual to whom it is addressed. Any review, disclosure, copying, distribution, or use of this email communication by others is strictly prohibited. If you are not the intended recipient please notify us immediately by returning this message to the sender and delete all copies. Thank you for your cooperation.
Received on Friday, 15 September 2006 13:04:10 UTC