- From: Bernard Vatant <bernard.vatant@mondeca.com>
- Date: Fri, 19 Mar 2004 23:02:38 +0100
- To: "SWBPD" <public-swbp-wg@w3.org>
This is a practical question that we have often met at Mondeca. The message below comes from a partner in a European project who is developing linguistic tools to generate queries on a semantic knowledge base.

To sum up the issue, the question is how to express that the subject (dc:subject) of a document is a concept used as a class in an ontology, e.g. "Phd_Theses". My view is that if you don't want to be in OWL-Full, the only way is to keep the concept used as a class distinct from the concept used as a document subject (defined as an instance in a thesaurus). The argument against that is that the search engine could leverage the ontology subsumptions to expand queries, e.g. from "find documents about publications" to "find documents about PhD Theses" ... more arguments below in Patrizia Paggio's message.

Best practice for that, folks?

Bernard Vatant
Senior Consultant
Knowledge Engineering
Mondeca - www.mondeca.com
bernard.vatant@mondeca.com

-----Original Message-----
From: Patrizia Paggio [mailto:patrizia@cst.dk]
Sent: Friday, 19 March 2004 11:28
To: Bernard Vatant
Cc: Lina Henriksen; CST
Subject: Re: Federated questions

Dear Bernard,

since you ask directly for my opinion, here it comes :-) . I think I'm sceptical about the so-called thesaurus solution, probably because I don't totally understand why it is smart (alas, in spite of all these email exchanges!). Let me try to explain the way I see things without getting into the details of OWL-Full.

To take the Webpage on PhD theses, I think we wish to be able to express the fact that the Webpage is also about dissertations, and about publications in general, as predicted by the isa structure: Publication <= Dissertation <= PhD Thesis. This means, in my opinion, that if the user asks for a Webpage on Publications, the page on PhD Theses should be among the hits.
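The retrieval behaviour described above can be illustrated with a minimal sketch. It is not the project's implementation: the two dictionaries, the document names and the function names are all hypothetical, and the hierarchy is simply the isa chain from the example (Publication <= Dissertation <= PhD Thesis). It only shows that a query on a broader concept returns documents indexed under a narrower one.

```python
# Hypothetical isa hierarchy: narrower concept -> the concept that subsumes it.
SUBCLASS_OF = {
    "PhD Thesis": "Dissertation",
    "Dissertation": "Publication",
}

# Hypothetical index: document -> the concept it is about (its dc:subject).
SUBJECT_OF = {
    "phd-theses-page": "PhD Thesis",
    "staff-page": "Staff",
}


def ancestors(concept):
    """Return the concept followed by every concept that subsumes it."""
    chain = [concept]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def documents_about(query_concept):
    """Return documents whose subject is the query concept or is subsumed by it."""
    return sorted(
        doc
        for doc, subject in SUBJECT_OF.items()
        if query_concept in ancestors(subject)
    )


print(documents_about("Publication"))
```

Whether the keys of the hierarchy are OWL classes or thesaurus concepts makes no difference to this lookup; the sketch leaves that modelling question open.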
In general, I think it is fair to say that if a document is about a certain university-relevant concept in our ontology, it is at the same time about the concepts that subsume the concept under consideration.

Now, if this is true, it seems to me that if we cannot (or do not want to) allow the Subject class to subsume classes in the ontology in a direct fashion, then we need to replicate the whole ontology (that is, excluding instances) and call it a thesaurus. If this is smart (and possible), I suppose that's what we should do.

As far as the linguistic implementation is concerned, it doesn't make any sense to me to have two versions of the ontology, one of which is used to express subclasses of the Subject concept. As a matter of fact, we couldn't even do it because of name clashes. So we would ignore the thesaurus if the thesaurus is the same as (or a fragment of) the ontology.

By the way, what is a good definition of a thesaurus?

________________________________________________________
Patrizia Paggio
Senior Researcher               phone: +45 3532 9072
Center for Sprogteknologi       fax: +45 3532 9089
Njalsgade 80                    email: patrizia@cst.dk
2300-DK CPH S                   www.cst.dk/patrizia

LREC04 Workshop on Multimodal Corpora
http://lubitsch.lili.uni-bielefeld.de/MMCORPORA
LREC04 OntoLex 2004
http://www.loa-cnr.it/ontolex2004.html
________________________________________________________
Received on Friday, 19 March 2004 17:09:26 UTC