Re: Implementations of LCS for OWL from Phillip Lord on 2010-05-04 (public-owl-dev@w3.org from April to June 2010)

From: Phillip Lord <phillip.lord@newcastle.ac.uk>
Date: Tue, 04 May 2010 18:04:16 +0100
To: Chris Mungall <cjm@berkeleybop.org>
Cc: Alan Rector <rector@cs.man.ac.uk>, Owl Dev <public-owl-dev@w3.org>, sonic@tcs.inf.tu-dresden.de
Message-ID: <87zl0flhtb.fsf@newcastle.ac.uk>

Chris Mungall <cjm@berkeleybop.org> writes:
>>> With some knowledge bases that is a reasonable assumption; in other cases
>>> there may be a limited amount of pre-composition or the pre-
>>> composition may be fairly ad-hoc, and allowing class expressions in the LCS
>>> results will give you something more specific and informative.
>>
>>
>> Yes, but you don't need the results to be a class expression in this
>> case. You just need the queries to support class expressions, which is a
>> different kettle of fish.
>
> In some cases you do want CEs in the LCS for additional precision. Consider:


I agree with your examples entirely, and I can understand what you are
trying to achieve. But it stops you from precomputing information
content (of course, you could still precompute for named classes and
work the CEs as you go). I just don't think the gains are worth the
costs. 


> Of course, if intermediate classes such as "2 wheeled vehicle", "wheeled
> vehicle", "motorized vehicle" etc are declared in advance  then returning
> named classes gives the equivalent level of precision.  One strategy would be
> to enumerate all combinations of CEs, feed that  to the reasoner, and then use
> standard LCS techniques. 

Yes, that might work. I'm sure that someone else can comment more
intelligently than I, but I suspect that you can't enumerate them all
because there are too many; for an ontology with one class A and one
relation r, you have

A
A some r A 
A some r (A some r A)

and so on. 



>> Trivial but slow, as far as I can see. If your corpus is large, then
>> have to query against all members of the corpus (ie the instances). In
>> this case, it's worse. The "least" in LCS is defined by the corpus. So,
>> if there are a number of different LCSs which are sibs, then you have to
>> test the information content of them all.
>
> If the lcs function can return a CE that uses intersection then there is a
> maximum of 1 LCS. If the LCS did contain n>1 CEs e1, ..., en then  you could
> just create a CE <e1 and e2 and ... en> which is more  specific. Proof from
> the following paper:
>
>   title={{Computing least common subsumers in description logics}},
>   author={Cohen, W.W. and Borgida, A. and Hirsh, H.},
>   booktitle={Proceedings of the National Conference on Artificial
> Intelligence},
>   pages={754--754},
>   year={1992},
>
> But I think you're right in that in practical terms this could be quite slow
> because current reasoner APIs don't AFAIK allow queries of  the form "how many
> instances satisfy the CE <X>", instead it's  necessary to query for the CE,
> and then count the size of the  resulting set of objects which could be slow
> for large KBs.


And, in most cases, the things that you really care about querying with,
or that you want to see in the results are likely to be there. 

Still, I think that we are in furious agreement about most things; for
the rest, you have to try it.

Phil

Received on Tuesday, 4 May 2010 17:04:59 UTC