- From: Alan Rector <rector@cs.man.ac.uk>
- Date: Wed, 5 May 2010 08:35:59 +0100
- To: Phillip Lord <phillip.lord@newcastle.ac.uk>
- Cc: Chris Mungall <cjm@berkeleybop.org>, Owl Dev <public-owl-dev@w3.org>, sonic@tcs.inf.tu-dresden.de
Chris

I think I may be missing an email. I can't quite put the whole thread together.

If you are looking at similarity, is the problem something like:

For two classes A and B:

a) If A subsumes B, define the set of additional restrictions R whose conjunction with A produces A' equivalent to B, i.e. A & R == B. In general R will not be unique, but there may be at least informal notions of "simplest".

b) If neither A nor B subsumes the other, find a "suitable base" LCS', such that you can compute Ra and Rb with LCS' & Ra == A and LCS' & Rb == B. In this case the issue is what counts as a "suitable base", LCS'. Again, some notion of minimalisation or simplest might be used.

A variant of the above is to allow R' to be a query rather than a class expression in whichever query language you choose.

In this case it would seem natural to me to try to "relax" A and B in some ways to try to find the LCS'. How big that search space might be depends on the expressions A and B.

Regards

Alan

On 4 May 2010, at 15:23, Phillip Lord wrote:

> Chris Mungall <cjm@berkeleybop.org> writes:
>>> It seems reasonable to me to assume that at the time you want to
>>> calculate a semantic similarity, then you have all the three terms that
>>> you want -- the two that you wish to compare, and the (unknown,
>>> explicitly expressed in the ontology) term that is the LCS.
>>
>> With some knowledge bases that is a reasonable assumption; in other cases
>> there may be a limited amount of pre-composition or the pre-composition
>> may be fairly ad-hoc, and allowing class expressions in the LCS
>> results will give you something more specific and informative.
>
> Yes, but you don't need the results to be a class expression in this
> case. You just need the queries to support class expressions, which is a
> different kettle of fish. This means that you can avoid the nastiness of
> "I want LCS to support class expressions except for the ones that I
> don't want like A or B".
>
>>> I can see a very strong use case why you might want to allow the query
>>> terms to not pre-exist, but why the LCS? What semantic similarity
>>> measures were you thinking of anyway? The information content based
>>> ones will, I think, require that the LCS pre-exist anyway.
>>
>> I don't think that need be the case. Calculating the IC requires finding the
>> cardinality of the extent of the LCS, and this can be done trivially using
>> any OWL reasoner. Of course, there is a closed world assumption here but this
>> is built into any IC calculation (the well known literature bias).
>
> Trivial but slow, as far as I can see. If your corpus is large, then
> have to query against all members of the corpus (ie the instances). In
> this case, it's worse. The "least" in LCS is defined by the corpus. So,
> if there are a number of different LCSs which are sibs, then you have to
> test the information content of them all.
>
> Phil

-----------------------
Alan Rector
Professor of Medical Informatics
School of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL +44 (0) 161 275 6149/6188
FAX +44 (0) 161 275 6204
www.cs.man.ac.uk/~rector
www.co-ode.org
http://clahrc-gm.nihr.ac.uk/
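As a rough illustration of cases (a) and (b) in the message above, here is a minimal Python sketch. It rests on a strong simplifying assumption that is not in the thread: each class is a plain conjunction of atomic restrictions, modelled as a set of strings, so that "A subsumes B" reduces to a subset test. The names `ClassExpr`, `subsumes`, `difference` and `decompose`, and the example restrictions, are all hypothetical. A real OWL class expression would need a reasoner, and the "relaxation" search mentioned above is not modelled here.

```python
from typing import FrozenSet, Tuple

# Toy model (an assumption, not OWL semantics): a class is a conjunction of
# atomic restrictions, so more restrictions means a more specific class.
ClassExpr = FrozenSet[str]


def subsumes(a: ClassExpr, b: ClassExpr) -> bool:
    """A subsumes B when every restriction of A is also a restriction of B."""
    return a <= b


def difference(a: ClassExpr, b: ClassExpr) -> ClassExpr:
    """Case (a): the smallest R in this toy model with A & R == B."""
    if not subsumes(a, b):
        raise ValueError("A must subsume B")
    return b - a


def decompose(a: ClassExpr, b: ClassExpr) -> Tuple[ClassExpr, ClassExpr, ClassExpr]:
    """Case (b): a base LCS' plus Ra and Rb with LCS' & Ra == A and LCS' & Rb == B."""
    base = a & b  # shared restrictions act as the "suitable base"
    return base, a - base, b - base


# Hypothetical example classes.
A = frozenset({"Cell", "part_of some Brain"})
B = frozenset({"Cell", "has_quality some Large"})
base, ra, rb = decompose(A, B)
print(sorted(base), sorted(ra), sorted(rb))
```

Even in this toy setting R is not unique: any set between `B - A` and `B` satisfies A & R == B, and `difference` simply returns the smallest one, which is one possible reading of "simplest".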
Received on Wednesday, 5 May 2010 08:01:39 UTC