- From: Robert Stevens <robert.stevens@manchester.ac.uk>
- Date: Thu, 14 Sep 2006 21:30:14 +0100
- To: public-semweb-lifesci@w3.org
All,

I forwarded the email to Ian Horrocks, he of reasoner fame, and his answer is below.

>From: Ian Horrocks <horrocks@cs.man.ac.uk>
>Cc: Dmitry Tsarkov <tsarkov@cs.man.ac.uk>
>Subject: Re: Performance issues with OWL Reasoners (Was RE: Playing
>with sets in OWL...)
>Date: Thu, 14 Sep 2006 17:53:13 +0100
>To: Robert Stevens <robert.stevens@manchester.ac.uk>
>
>Robert,
>
>In answer to 1), it isn't true that most OWL reasoners map OWL instance
>reasoning operations to appropriate SQL queries on the underlying data
>store; in fact to the best of my knowledge, the Instance Store is the
>only reasoner that is even close to this approach, and even here SQL
>queries are used to identify candidate answers which may then need to
>be "filtered" through a full DL reasoner. The technique described by
>Borgida and Brachman is completely different: they show that for a
>terminology defined using a sufficiently simple DL (*much* simpler than
>the logics underlying OWL), it is possible to derive a DB schema such
>that SQL queries can be used to perform ABox retrieval. A more up to
>date version of this idea has been presented by Calvanese et al, who
>have explored the theoretical limits of this approach and devised a
>language called DL-Lite that is "as expressive as possible" while still
>allowing for query answering via SQL (see
>http://www.inf.unibz.it/~calvanese/papers-html/AAAI-2005.html).
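As a rough illustration of the "SQL finds candidates, a reasoner filters them" pattern mentioned for the Instance Store, here is a minimal Python sketch. It is not the Instance Store's schema or API: the `assertions` table, the `root_class` column, the toy `SUBCLASS_OF` hierarchy and the `is_subsumed_by` function (a stand-in for a call to a full DL reasoner) are all assumptions made up for this example.

```python
import sqlite3

# Toy subclass hierarchy (child -> parent). Purely illustrative; a real
# system would obtain subsumption answers from a DL reasoner.
SUBCLASS_OF = {
    "Kinase": "Enzyme",
    "Enzyme": "Protein",
    "Protein": "Molecule",
    "Gene": "Molecule",
    "Apoptosis": "Process",
}


def ancestors(cls):
    """Return cls followed by all of its superclasses in the toy hierarchy."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def is_subsumed_by(cls, query_class):
    """Hypothetical stand-in for the reasoner's subsumption test."""
    return query_class in ancestors(cls)


def build_demo_db():
    """Create an in-memory table of asserted class memberships (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assertions "
        "(individual TEXT, asserted_class TEXT, root_class TEXT)"
    )
    data = [
        ("sample_kinase", "Kinase"),
        ("sample_enzyme", "Enzyme"),
        ("sample_protein", "Protein"),
        ("sample_gene", "Gene"),
        ("sample_apoptosis", "Apoptosis"),
    ]
    # root_class is a coarse, precomputed index column used only to
    # narrow the search in SQL.
    conn.executemany(
        "INSERT INTO assertions VALUES (?, ?, ?)",
        [(ind, cls, ancestors(cls)[-1]) for ind, cls in data],
    )
    return conn


def retrieve_instances(conn, query_class):
    """Two-step retrieval: SQL over-approximates, the stand-in reasoner filters."""
    root = ancestors(query_class)[-1]
    # Step 1: cheap SQL query returns a superset of the real answers.
    candidates = conn.execute(
        "SELECT individual, asserted_class FROM assertions "
        "WHERE root_class = ? ORDER BY individual",
        (root,),
    ).fetchall()
    # Step 2: keep only candidates the (stand-in) reasoner confirms.
    return [ind for ind, cls in candidates if is_subsumed_by(cls, query_class)]


if __name__ == "__main__":
    conn = build_demo_db()
    print(retrieve_instances(conn, "Enzyme"))
```

In this toy setup the SQL step for "Enzyme" also returns the gene and plain protein individuals, since they share the same root class, and the filter step discards them. With a real ontology the filter would be far more expensive than a dictionary walk, which is why narrowing the candidate set in the database first is attractive.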
>
>Another interesting approach that has only recently been presented by
>Motik et al is to translate a DL terminology into a set of disjunctive
>datalog rules, and to use an efficient datalog engine to deal with
>large numbers of ground facts. This idea has been implemented in the
>Kaon2 system, early results with which have been quite encouraging (see
>http://kaon2.semanticweb.org/). It can deal with expressive languages
>(such as OWL), but it seems to work best in data-centric applications,
>i.e., where the terminology is not too large and complex.
>
>In answer to 2), this has, of course, been the focus of a great deal of
>research. Modern systems are able to cope with very large
>terminologies, e.g., with more than 100,000 classes. There are
>currently two distinct approaches: in the first the logic is restricted
>(although it is still quite expressive) so that reasoning is of worst
>case polynomial complexity, and in the second the logic is much more
>expressive (typically at least equivalent to OWL) but the
>implementation is highly optimised so that it works well in typical
>cases. Currently the only example of the first approach is the CEL
>system; there are several well known examples of the second approach
>including FaCT++, Racer and Pellet (see
>http://www.cs.man.ac.uk/~sattler/reasoners.html).
>
>Hope this helps. Feel free to pass it on.
>
>Ian
>
>On 14 Sep 2006, at 17:02, Robert Stevens wrote:
>
>>Both
>>
>>see below; have you an answer?
>>
>>robert.
>>
>>>Date: Thu, 14 Sep 2006 11:01:33 -0400
>>>From: "Kashyap, Vipul" <VKASHYAP1@PARTNERS.ORG>
>>>To: "Alan Ruttenberg" <alanruttenberg@gmail.com>,
>>> "William Bug" <William.Bug@DrexelMed.edu>
>>>Cc: "Miller, Michael D (Rosetta)" <Michael_Miller@Rosettabio.com>,
>>> "Marco Brandizi" <brandizi@ebi.ac.uk>,
>>> <semantic-web@w3.org>,
>>> <public-semweb-lifesci@w3.org>
>>>Subject: Performance issues with OWL Reasoners (Was RE: Playing with
>>>sets in OWL...)
>>>X-Archived-At:
>>>http://www.w3.org/mid/2BF18EC866AF0448816CDB62ADF6538104C1644D@PHSXMB11.partners.org
>>>
>>>OWL reasoners support two types of reasoning:
>>>
>>>1. ABox reasoning (reasoning about instance data). Scalability here
>>>is being achieved by leveraging relational database technology (which
>>>is acknowledged to be scalable) and mapping OWL instance reasoning
>>>operations to appropriate SQL queries on the underlying data store.
>>>I believe most OWL reasoners follow this strategy.
>>>
>>>There's an interesting paper by Alex Borgida and Ron Brachman in
>>>SIGMOD 1993 which presents this approach, titled "Loading data into
>>>description reasoners".
>>>
>>>2. TBox reasoning scalability is a challenge, especially at the scale
>>>of 100s of thousands of classes found in medical ontologies. Would
>>>love to hear from DL experts on this issue.
>>>
>>>---Vipul
>>>
>>>=======================================
>>>Vipul Kashyap, Ph.D.
>>>Senior Medical Informatician
>>>Clinical Informatics R&D, Partners HealthCare System
>>>Phone: (781)416-9254
>>>Cell: (617)943-7120
>>>http://www.partners.org/cird/AboutUs.asp?cBox=Staff&stAb=vik
>>>
>>>To keep up you need the right answers; to get ahead you need the
>>>right questions
>>>---John Browning and Spencer Reiss, Wired 6.04.95
>>>
>>> > -----Original Message-----
>>> > From: public-semweb-lifesci-request@w3.org
>>> > [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Alan Ruttenberg
>>> > Sent: Monday, September 11, 2006 7:35 PM
>>> > To: William Bug
>>> > Cc: Miller, Michael D (Rosetta); Marco Brandizi; semantic-web@w3.org;
>>> > public-semweb-lifesci@w3.org
>>> > Subject: Re: Playing with sets in OWL...
>>> >
>>> > On Sep 8, 2006, at 11:39 PM, William Bug wrote:
>>> >
>>> > > 3) Re: anonymous classes/individuals of the type Alan describes:
>>> > > These are essentially "blank nodes" in the RDF sense - "unnamed"
>>> > > nodes based on a collection of necessary restrictions, if I
>>> > > understand things correctly. Please pardon the naive question, but
>>> > > aren't there some caveats in terms of processing very large RDF
>>> > > and/or OWL graphs containing "blank" or "anonymous" nodes?
>>> > > For many OWL ontologies, this might not be a concern, but if one
>>> > > were to be tempted to express a large variety of such sets based
>>> > > on different groupings of the sequence probes on a collection of
>>> > > arrays - groupings relevant to specific types of analysis - I
>>> > > could see how these anonymous entities - especially the anonymous
>>> > > sets of individuals - could really proliferate.
>>> >
>>> > Predicting the performance of even small OWL ontologies is a bit of
>>> > a crap shoot at the moment, it appears, though there is ongoing
>>> > research to try to address this. In cases I've worked on I've had
>>> > really small ontologies blow up, and larger cases run extremely
>>> > quickly after some solicitation of advice from the DL experts and a
>>> > little experimentation.
>>> >
>>> > I think the best thing in these cases is to try to represent what
>>> > is desired, see what happens, and ask for help when it doesn't
>>> > scale as desired. Such cases will, at the minimum, be grist for
>>> > future research, and I get the sense that they are highly valued by
>>> > OWL researchers.
>>> >
>>> > Although I used an anonymous individual in one of the examples,
>>> > there is really no need to, and in fact my recommendation would be
>>> > to avoid their use by generating a name in those cases, taken from
>>> > a namespace that is advertised to be unresolvable and used for this
>>> > purpose. This is not for reasons of efficiency as much as for
>>> > understandability - the anonymous nodes are properly considered
>>> > existential variables and should probably be used when you know
>>> > that's what you want.
>>> >
>>> > -Alan

Dr. Robert Stevens
Senior Lecturer
School of Computer Science
University of Manchester
Oxford Road
Manchester M13 9PL
+44(0)161 2756251
Email: robert.stevens@manchester.ac.uk
Received on Thursday, 14 September 2006 20:30:19 UTC