Re: Implementations of LCS for OWL from Alan Rector on 2010-05-05 (public-owl-dev@w3.org from April to June 2010)

From: Alan Rector <rector@cs.man.ac.uk>
Date: Wed, 5 May 2010 08:35:59 +0100
To: Phillip Lord <phillip.lord@newcastle.ac.uk>
Cc: Chris Mungall <cjm@berkeleybop.org>, Owl Dev <public-owl-dev@w3.org>, sonic@tcs.inf.tu-dresden.de
Message-Id: <6E8C9432-49EF-4640-98FE-066FFA80C2D9@cs.man.ac.uk>

Chris

I think I may be missing an email.  I can't quite put the whole thread  
together.

If you are looking at similarity, is the problem something like:

For two classes A and B:
	a) If A subsumes B, define the set of additional restrictions R whose
	    conjunction with A produces A' equivalent to B,
	    i.e. A & R == B.
	    In general R will not be unique, but there may be at least
	    informal notions of "simplest".

	b) If neither A nor B subsumes the other, find a "suitable base"
	    LCS', such that you can compute Ra and Rb with
	    LCS' & Ra == A and LCS' & Rb == B.

In this case the issue is what counts as a "suitable base", LCS'.  Again,
some notion of minimality or "simplest" might be used.
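
As a rough illustration of (a) and (b), here is a minimal Python sketch.
It rests on a strong simplifying assumption that is mine, not part of the
thread: each class is just a conjunction of atomic restrictions, modelled
as a frozenset of labels, so that subsumption reduces to a subset test.
Real OWL class expressions need a reasoner, and the function names and
example labels below are hypothetical.

```python
# Toy model (assumption): a class is a frozenset of atomic restriction
# labels, read as their conjunction. Fewer conjuncts = more general class.

def subsumes(a, b):
    """True if a subsumes b, i.e. every conjunct of a also appears in b."""
    return a <= b

def conjoin(a, r):
    """Conjunction of two classes: the union of their conjunct sets."""
    return a | r

def difference(a, b):
    """Case (a): return R such that conjoin(a, R) == b, given a subsumes b."""
    if not subsumes(a, b):
        raise ValueError("a does not subsume b")
    return b - a

def common_base(a, b):
    """Case (b): return (base, ra, rb) with conjoin(base, ra) == a
    and conjoin(base, rb) == b. The base is the shared conjuncts."""
    base = a & b
    return base, a - base, b - base

# Hypothetical example classes
disease = frozenset({"Disease"})
heart_disease = frozenset({"Disease", "hasLocation some Heart"})
infection = frozenset({"Disease", "hasCause some Infection"})

r = difference(disease, heart_disease)
base, ra, rb = common_base(heart_disease, infection)
```

In this toy model the shared conjuncts give the most specific base, and R,
Ra and Rb are unique.  With real class expressions neither holds, which is
where a notion of "simplest" would have to come in.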

A variant of the above is to allow R to be a query rather than a
class expression, in whichever query language you choose.

In this case it would seem natural to me to try to "relax" A and B in  
some ways to try to find the LCS'.
How big that search space might be depends on the expressions A and B.

Regards

Alan


On 4 May 2010, at 15:23, Phillip Lord wrote:

> Chris Mungall <cjm@berkeleybop.org> writes:
>>> It seems reasonable to me to assume that at the time you want to
>>> calculate a semantic similarity, then you have all the three terms
>>> that you want -- the two that you wish to compare, and the (unknown,
>>> explicitly expressed in the ontology) term that is the LCS.
>>
>> With some knowledge bases that is a reasonable assumption; in other
>> cases there may be a limited amount of pre-composition or the
>> pre-composition may be fairly ad-hoc, and allowing class expressions
>> in the LCS results will give you something more specific and
>> informative.
>
>
> Yes, but you don't need the results to be a class expression in this
> case. You just need the queries to support class expressions, which
> is a different kettle of fish. This means that you can avoid the
> nastiness of "I want LCS to support class expressions except for the
> ones that I don't want like A or B".
>
>
>>
>>> I can see a very strong use case why you might want to allow the
>>> query terms to not pre-exist, but why the LCS? What semantic
>>> similarity measures were you thinking of anyway? The information
>>> content based ones will, I think, require that the LCS pre-exist
>>> anyway.
>>
>> I don't think that need be the case. Calculating the IC requires
>> finding the cardinality of the extent of the LCS, and this can be
>> done trivially using any OWL reasoner. Of course, there is a closed
>> world assumption here but this is built into any IC calculation (the
>> well known literature bias).
>
> Trivial but slow, as far as I can see. If your corpus is large, then
> you have to query against all members of the corpus (i.e. the
> instances). In this case, it's worse. The "least" in LCS is defined
> by the corpus. So, if there are a number of different LCSs which are
> sibs, then you have to test the information content of them all.
>
> Phil
>
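
The information-content step discussed in the quoted exchange can be
sketched as follows.  This is only an illustration under assumptions of
mine: the corpus is a list of instances, each given as the set of class
names it is already known (or inferred) to belong to, and IC is taken as
-log2 of the fraction of the corpus in a class's extent.  The function
names and data are hypothetical; in practice a reasoner would supply the
extents.

```python
import math

def information_content(extent_size, corpus_size):
    """IC = -log2(|extent| / |corpus|); rarer classes score higher."""
    if extent_size <= 0 or extent_size > corpus_size:
        raise ValueError("extent size must be in 1..corpus_size")
    return -math.log2(extent_size / corpus_size)

def extent_size(class_name, corpus):
    """Count corpus instances that belong to class_name (closed world)."""
    return sum(1 for classes in corpus if class_name in classes)

def most_informative(candidates, corpus):
    """Score every candidate subsumer against the whole corpus and
    return (name, ic) for the one with the highest IC."""
    scored = []
    for name in candidates:
        size = extent_size(name, corpus)
        if size > 0:  # classes with an empty extent have undefined IC
            scored.append((name, information_content(size, len(corpus))))
    if not scored:
        raise ValueError("no candidate has instances in the corpus")
    return max(scored, key=lambda pair: pair[1])

# Hypothetical corpus: X and Y are sibling candidate subsumers
corpus = [{"A", "X"}, {"B", "X"}, {"A", "Y"}, {"C"}]
best = most_informative(["X", "Y"], corpus)
```

Each candidate costs one pass over the corpus, so the work grows with both
the corpus size and the number of sibling candidates, which is the cost
Phil describes above.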

-----------------------
Alan Rector
Professor of Medical Informatics
School of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL +44 (0) 161 275 6149/6188
FAX +44 (0) 161 275 6204
www.cs.man.ac.uk/~rector
www.co-ode.org
http://clahrc-gm.nihr.ac.uk/

Received on Wednesday, 5 May 2010 08:01:39 UTC