Fwd: Re: Performance issues with OWL Reasoners (Was RE: Playing with sets in OWL...)

All

I forwarded the email to Ian Horrocks, he of reasoner fame; his
answer is below.

>Cc: Dmitry Tsarkov <tsarkov@cs.man.ac.uk>
>From: Ian Horrocks <horrocks@cs.man.ac.uk>
>Subject: Re: Performance issues with OWL Reasoners (Was RE: Playing 
>with sets in OWL...)
>Date: Thu, 14 Sep 2006 17:53:13 +0100
>To: Robert Stevens <robert.stevens@manchester.ac.uk>
>
>Robert,
>
>In answer to 1), it isn't true that most OWL reasoners map OWL instance
>reasoning operations to appropriate SQL queries on the underlying data
>store; in fact to the best of my knowledge, the Instance Store is the
>only reasoner that is even close to this approach, and even here SQL
>queries are used to identify candidate answers which may then need to
>be "filtered" through a full DL reasoner. The technique described by
>Borgida and Brachman is completely different: they show that for a
>terminology defined using a sufficiently simple DL (*much* simpler than
>the logics underlying OWL), it is possible to derive a DB schema such
>that SQL queries can be used to perform ABox retrieval. A more
>up-to-date version of this idea has been presented by Calvanese et al., who
>have explored the theoretical limits of this approach and devised a
>language called DL-Lite that is "as expressive as possible" while still
>allowing for query answering via SQL (see
>http://www.inf.unibz.it/~calvanese/papers-html/AAAI-2005.html).
>
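A minimal sketch of the SQL rewriting idea described above, not taken from
the reply or the cited papers. It assumes a terminology containing only
atomic subclass axioms (a tiny fragment of what DL-Lite allows), so that
"retrieve all instances of C" can be rewritten into a single SQL query over
the asserted facts. The class names, the class_assertion table and every
function name are hypothetical.

```python
import sqlite3

# Hypothetical TBox: atomic subclass axioms only, as (child, parent) pairs.
SUBCLASS_OF = [
    ("Enzyme", "Protein"),
    ("Kinase", "Enzyme"),
    ("Protein", "Macromolecule"),
]


def subclasses_of(cls, axioms):
    """Return cls together with every class entailed to be a subclass of it."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for child, parent in axioms:
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def rewrite_instance_query(cls, axioms):
    """Rewrite 'instances of cls' into one SQL query over asserted facts."""
    classes = sorted(subclasses_of(cls, axioms))
    placeholders = ", ".join("?" for _ in classes)
    sql = (
        "SELECT DISTINCT individual FROM class_assertion "
        f"WHERE class_name IN ({placeholders}) ORDER BY individual"
    )
    return sql, classes


def retrieve(conn, cls, axioms):
    """Answer the instance query purely by running the rewritten SQL."""
    sql, params = rewrite_instance_query(cls, axioms)
    return [row[0] for row in conn.execute(sql, params)]


def build_example_db():
    """Create an in-memory ABox holding only explicitly asserted types."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE class_assertion (individual TEXT, class_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO class_assertion VALUES (?, ?)",
        [
            ("p53", "Protein"),
            ("cdk2", "Kinase"),
            ("lysozyme", "Enzyme"),
            ("dna1", "Macromolecule"),
        ],
    )
    return conn
```

With this data, retrieve(build_example_db(), "Protein", SUBCLASS_OF) returns
cdk2, lysozyme and p53, although only p53 is asserted to be a Protein. All
the terminological work happens while the query is rewritten, so the
database never needs to reason. With a more expressive logic this no longer
suffices, which is why candidate answers then need filtering through a full
DL reasoner.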
>Another interesting approach that has only recently been presented by
>Motik et al is to translate a DL terminology into a set of disjunctive
>datalog rules, and to use an efficient datalog engine to deal with
>large numbers of ground facts. This idea has been implemented in the
>KAON2 system, early results with which have been quite encouraging (see
>http://kaon2.semanticweb.org/). It can deal with expressive languages
>(such as OWL), but it seems to work best in data-centric applications,
>i.e., where the terminology is not too large and complex.
>
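A minimal sketch of the rules-over-ground-facts idea described above. It
uses plain (non-disjunctive) datalog with naive bottom-up evaluation, so it
is much simpler than the disjunctive datalog translation mentioned in the
reply and is not how KAON2 works internally. The rules, facts and function
names are all hypothetical.

```python
# Atoms are tuples: (predicate, arg1, ...). Variables start with "?".

# Hypothetical rules standing in for a translated terminology: (head, body).
RULES = [
    (("Enzyme", "?x"), [("Kinase", "?x")]),            # Kinase is a subclass of Enzyme
    (("Protein", "?x"), [("Enzyme", "?x")]),           # Enzyme is a subclass of Protein
    (("Enzyme", "?x"), [("catalyses", "?x", "?y")]),   # domain of catalyses is Enzyme
]

# Hypothetical ground facts (the instance data).
FACTS = {
    ("Kinase", "cdk2"),
    ("catalyses", "lysozyme", "peptidoglycan"),
}


def match(atom, fact, binding):
    """Extend binding so that atom equals fact, or return None on a clash."""
    if len(atom) != len(fact) or atom[0] != fact[0]:
        return None
    extended = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def body_bindings(body, facts):
    """Return every variable binding that satisfies all atoms in body."""
    bindings = [{}]
    for atom in body:
        bindings = [
            extended
            for binding in bindings
            for fact in facts
            if (extended := match(atom, fact, binding)) is not None
        ]
    return bindings


def saturate(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for binding in body_bindings(body, derived):
                new_fact = (head[0],) + tuple(
                    binding.get(term, term) for term in head[1:]
                )
                if new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


def instances(predicate, facts):
    """List the individuals asserted or derived to belong to a class."""
    return sorted(f[1] for f in facts if f[0] == predicate and len(f) == 2)
```

Here instances("Protein", saturate(FACTS, RULES)) yields cdk2 and lysozyme.
The rule set stays small and fixed while the facts can grow, which matches
the data-centric setting described in the reply.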
>In answer to 2), this has, of course, been the focus of a great deal of
>research. Modern systems are able to cope with very large
>terminologies, e.g., with more than 100,000 classes. There are
>currently two distinct approaches: in the first the logic is restricted
>(although it is still quite expressive) so that reasoning is of worst
>case polynomial complexity, and in the second the logic is much more
>expressive (typically at least equivalent to OWL) but the
>implementation is highly optimised so that it works well in typical
>cases. Currently the only example of the first approach is the CEL
>system; there are several well known examples of the second approach
>including FaCT++, Racer and Pellet (see
>http://www.cs.man.ac.uk/~sattler/reasoners.html).
>
>Hope this helps. Feel free to pass it on.
>
>Ian
>
>
>
>On 14 Sep 2006, at 17:02, Robert Stevens wrote:
>
>>Both
>>
>>see below; have you an answer?
>>
>>robert.
>>
>>>Date: Thu, 14 Sep 2006 11:01:33 -0400
>>>From: "Kashyap, Vipul" <VKASHYAP1@PARTNERS.ORG>
>>>To: "Alan Ruttenberg" <alanruttenberg@gmail.com>,
>>>         "William Bug" <William.Bug@DrexelMed.edu>
>>>Cc: "Miller, Michael D \(Rosetta\)" <Michael_Miller@Rosettabio.com>,
>>>         "Marco Brandizi" <brandizi@ebi.ac.uk>,
>>>         <semantic-web@w3.org>,
>>>         <public-semweb-lifesci@w3.org>
>>>Subject: Performance issues with OWL Reasoners (Was RE: Playing with
>>>sets in OWL...)
>>>X-Archived-At: http://www.w3.org/mid/2BF18EC866AF0448816CDB62ADF6538104C1644D@PHSXMB11.partners.org
>>>
>>>
>>>
>>>OWL reasoners support two types of reasoning:
>>>
>>>1. ABox reasoning (reasoning about instance data). Scalability here is
>>>being achieved by leveraging relational database technology (which is
>>>acknowledged to be scalable) and mapping OWL instance reasoning
>>>operations to appropriate SQL queries on the underlying data store. I
>>>believe most OWL reasoners follow this strategy.
>>>
>>>There's an interesting paper by Alex Borgida and Ron Brachman in SIGMOD
>>>1993 that presents this approach, titled "Loading Data into Description
>>>Reasoners".
>>>2. TBox reasoning. Scalability is a challenge, especially at the scale
>>>of hundreds of thousands of classes found in medical ontologies. Would
>>>love to hear from DL experts on this issue.
>>>
>>>---Vipul
>>>
>>>=======================================
>>>Vipul Kashyap, Ph.D.
>>>Senior Medical Informatician
>>>Clinical Informatics R&D, Partners HealthCare System
>>>Phone: (781)416-9254
>>>Cell: (617)943-7120
>>>http://www.partners.org/cird/AboutUs.asp?cBox=Staff&stAb=vik
>>>
>>>To keep up you need the right answers; to get ahead you need the
>>>right questions
>>>---John Browning and Spencer Reiss, Wired 6.04.95
>>> > -----Original Message-----
>>> > From: public-semweb-lifesci-request@w3.org
>>> > [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Alan Ruttenberg
>>> > Sent: Monday, September 11, 2006 7:35 PM
>>> > To: William Bug
>>> > Cc: Miller, Michael D (Rosetta); Marco Brandizi; semantic-web@w3.org;
>>> > public-semweb-lifesci@w3.org
>>> > Subject: Re: Playing with sets in OWL...
>>> >
>>> >
>>> > On Sep 8, 2006, at 11:39 PM, William Bug wrote:
>>> >
>>> > >     3) Re: anonymous classes/individuals of the type Alan describes:
>>> > > These are essentially "blank nodes" in the RDF sense - "unnamed"
>>> > > nodes based on a collection of necessary restrictions, if I
>>> > > understand things correctly.  Please pardon the naive question, but
>>> > > aren't there some caveats in terms of processing very large RDF
>>> > > and/or OWL graphs containing "blank" or "anonymous" nodes?  For many
>>> > > OWL ontologies, this might not be a concern, but if one were to be
>>> > > tempted to express a large variety of such sets based on different
>>> > > groupings of the sequence probes on a collection of arrays -
>>> > > groupings relevant to specific types of analysis - I could see how
>>> > > these anonymous entities - especially the anonymous sets of
>>> > > individuals - could really proliferate.
>>> >
>>> > Predicting the performance of even small OWL ontologies is a bit of a
>>> > crap shoot at the moment, it appears, though there is ongoing
>>> > research to try to address this. In cases I've worked on I've had
>>> > really small ontologies blow up, and larger cases run extremely
>>> > quickly after some solicitation of advice from the DL experts and a
>>> > little experimentation.
>>> >
>>> > I think the best thing in these cases is to try to represent what is
>>> > desired, see what happens, and ask for help when it doesn't scale as
>>> > desired. Such cases will, at the minimum, be grist for future
>>> > research, and I get the sense that they are highly valued by OWL
>>> > researchers.
>>> >
>>> > Although I used an anonymous individual in one of the examples, there
>>> > is really no need to, and in fact my recommendation would be to avoid
>>> > their use by generating a name in those cases, taken from a namespace
>>> > that is advertised to be unresolvable and used for this purpose. This
>>> > is not for reasons of efficiency as much as for understandability - the
>>> > anonymous nodes are properly considered existential variables and
>>> > should probably be used when you know that's what you want.
>>> >
>>> > -Alan
>>> >
>>> >
>>> >
>>> >
>

Dr. Robert Stevens
Senior Lecturer
School of Computer Science
University of Manchester
Oxford Road
Manchester
M13 9PL
+44(0)161 2756251
Email:robert.stevens@manchester.ac.uk

Received on Thursday, 14 September 2006 20:30:19 UTC