RE: [OPEN] and/or [PORT] : a practical question

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Tue, 23 Mar 2004 18:53:36 +0100
To: "Christopher Welty" <welty@us.ibm.com>
Cc: "SWBPD" <public-swbp-wg@w3.org>
Message-ID: <GOEIKOOAMJONEFCANOKCKEOBDJAA.bernard.vatant@mondeca.com>


> I may be misunderstanding your question,
> but I believe it is quite simple:
> if you want to treat classes as instances you are in OWL Full.
> There is simply no way to do that in DL or Lite ...

I know that :))
So let me put the question otherwise, in terms of best practice.

- Is it worth the trade-off to switch one's ontology (otherwise DL) to OWL-Full, just to
allow its classes to be used as objects in 'dc:subject' predicates?

- Or is it better to stay in OWL-DL with a workaround, e.g. creating
thes:PhD_Thesis as an instance of 'concept' in a Thesaurus namespace distinct from the
namespace of the class, like this:

thes:PhD_Thesis            rdf:type     thes:Concept

ex:Lina_Thesis             rdf:type     ex:PhD_Thesis
ex:Lina_Thesis             dc:subject   thes:Computational_Linguistics

ex:Critic_of_Pure_Thesis   rdf:type     ex:PhD_Thesis
ex:Critic_of_Pure_Thesis   dc:subject   thes:PhD_Thesis

But one would like to link somehow the class and the concept.
To stay in DL, that could be done through annotation.
Does the following make sense?

ex:PhD_Thesis 	rdfs:isDefinedBy    thes:PhD_Thesis

The rationale for that last statement is that terminology comes before ontology. Using the
concept to define a class is just one of its possible uses. Another use of the concept is
for indexing through dc:subject.
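
A minimal sketch of the pattern above, using plain Python tuples in place of an RDF store. It only checks whether any name is used both as a class (object of rdf:type) and as an individual (subject of rdf:type or object of dc:subject). That is a rough proxy for the class/instance separation OWL-DL asks for, not a real species validator. The typing of thes:Computational_Linguistics as a thes:Concept and all function names are assumptions added for illustration.

```python
# Triples from the message, as (subject, predicate, object) tuples.
WORKAROUND = [
    ("thes:PhD_Thesis", "rdf:type", "thes:Concept"),
    # Assumption: the other subject term is typed as a concept too.
    ("thes:Computational_Linguistics", "rdf:type", "thes:Concept"),
    ("ex:Lina_Thesis", "rdf:type", "ex:PhD_Thesis"),
    ("ex:Lina_Thesis", "dc:subject", "thes:Computational_Linguistics"),
    ("ex:Critic_of_Pure_Thesis", "rdf:type", "ex:PhD_Thesis"),
    ("ex:Critic_of_Pure_Thesis", "dc:subject", "thes:PhD_Thesis"),
    # Annotation linking the class to the concept it is named after.
    ("ex:PhD_Thesis", "rdfs:isDefinedBy", "thes:PhD_Thesis"),
]

# The alternative: the class itself is the object of dc:subject.
CLASS_AS_SUBJECT = [
    ("ex:Critic_of_Pure_Thesis", "rdf:type", "ex:PhD_Thesis"),
    ("ex:Critic_of_Pure_Thesis", "dc:subject", "ex:PhD_Thesis"),
]


def used_as_class(triples):
    """Names that appear as the object of rdf:type."""
    return {o for s, p, o in triples if p == "rdf:type"}


def used_as_individual(triples):
    """Names that are typed, or that appear as a dc:subject value."""
    typed = {s for s, p, o in triples if p == "rdf:type"}
    topics = {o for s, p, o in triples if p == "dc:subject"}
    return typed | topics


def names_in_both_roles(triples):
    """Names used both as a class and as an individual."""
    return used_as_class(triples) & used_as_individual(triples)


workaround_clashes = names_in_both_roles(WORKAROUND)
class_as_subject_clashes = names_in_both_roles(CLASS_AS_SUBJECT)
```

With the separate thes: namespace no name plays both roles, whereas using ex:PhD_Thesis directly as a dc:subject value makes it both a class and an individual. The rdfs:isDefinedBy triple is deliberately ignored by the check, since it is meant as an annotation.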

I proposed this solution in the quoted project, but my linguist partner considers it too
'messy'. I acknowledge it's a bit subtle, but it lets us stick to OWL-DL while keeping
Ontology and Thesaurus management separate. Things can be asserted about the concept of
PhD_Thesis that are completely independent of properties or instances of the class.

The difficulty for linguistic tools is that they cannot easily distinguish, in an
NL request, the cases where PhD_Thesis is used as a class (as in 'Where can I find a list
of PhD_Theses in Linguistics?') from those where it's used as a concept (as in 'Who has
written a critical study of PhD_Thesis?')

Is the problem more clearly set this way?
Isn't it a BP issue?


Dr. Christopher A. Welty, Knowledge Structures Group
IBM Watson Research Center, 19 Skyline Dr., Hawthorne, NY  10532     USA
Voice: +1 914.784.7055,  IBM T/L: 863.7055, Fax: +1 914.784.7455
Email: welty@watson.ibm.com, Web: http://www.research.ibm.com/people/w/welty/

"Bernard Vatant" <bernard.vatant@mondeca.com>
Sent by: public-swbp-wg-request@w3.org
03/19/2004 05:02 PM
To: "SWBPD" <public-swbp-wg@w3.org>
Subject: [OPEN] and/or [PORT] : a practical question

This is a practical question that we have often met at Mondeca. The message below comes
from a partner in a European project, developing linguistic tools to generate queries on
a semantic knowledge base.

To sum up the issue, the question is how to express that the subject (dc:subject) of a
document is a concept used as a class in an ontology, e.g. "PhD_Theses". My view is that if
you don't want to be in OWL-Full, the only way is to keep the concept used as a class
distinct from the concept used as document subject (defined as an instance in a thesaurus).
The argument against that is that the search engine could leverage the ontology
subsumptions to expand queries e.g. from "find documents about publications" to "find
documents about PhD Theses" ... more arguments below in Patrizia Paggio's message.

Best practice for that, folks ?

Bernard Vatant
Senior Consultant
Knowledge Engineering
Mondeca - www.mondeca.com

-----Original Message-----
From: Patrizia Paggio [mailto:patrizia@cst.dk]
Sent: Friday, 19 March 2004 11:28
To: Bernard Vatant
Cc: Lina Henriksen; CST
Subject: Re: Federated questions

Dear Bernard
since you ask directly for my opinion, here it comes :-) .

I think I'm sceptical about the so-called thesaurus solution probably because I don't
totally understand why it is smart (alas, in spite of all these email exchanges!).
Let me try and explain the way I see things without getting into the details of OWL-Full.
To take the Webpage on PhD theses, I think we wish to be able to express the fact that the
Webpage is also about dissertations, and about publications in general, as predicted by
the isa structure: Publication <= Dissertation <= PhD Thesis. This means in my opinion
that if the user asks for a Webpage on Publications, the page on PhD Theses should be
among the hits. In general, I think it is fair to say that if a document is about a
certain university-relevant concept in our ontology, it is also at the same time about the
concepts that subsume the concept under consideration.
Now, if this is true, it seems to me that if we cannot (or do not want to) allow the
Subject class to subsume classes in the ontology in a direct fashion, well then we need to
replicate the whole ontology (that is excluding instances) and call it a thesaurus. If
this is smart (and possible) - I suppose that's what we should do.
As far as the linguistic implementation is concerned, it doesn't make any sense to me to
have two versions of the ontology, one of which is used to express subclasses of the
Subject concept. As a matter of fact, we couldn't even do it because of name clashes. So we
would ignore the thesaurus if the thesaurus is the same as (or fragments of) the ontology.
By the way, what is a good definition of a thesaurus?
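
The query-expansion argument above can be sketched as follows, assuming each class is linked to a thesaurus concept (as in the rdfs:isDefinedBy proposal elsewhere in this thread). The sketch looks up the broader concepts of a document's subject by walking the class hierarchy, so the isa structure is stated once rather than copied by hand. All names below (ex:Dissertation, ex:Publication, the thes: concepts, ex:Webpage_on_PhD_Theses, and the functions) are hypothetical, and a single-parent, cycle-free hierarchy is assumed.

```python
# isa structure from the message: Publication <= Dissertation <= PhD Thesis.
SUBCLASS_OF = {
    "ex:PhD_Thesis": "ex:Dissertation",
    "ex:Dissertation": "ex:Publication",
}

# Hypothetical class-to-concept links (one concept per class).
DEFINED_BY = {
    "ex:PhD_Thesis": "thes:PhD_Thesis",
    "ex:Dissertation": "thes:Dissertation",
    "ex:Publication": "thes:Publication",
}

# Hypothetical document index: document -> its dc:subject concept.
SUBJECT_OF = {"ex:Webpage_on_PhD_Theses": "thes:PhD_Thesis"}


def broader_concepts(concept):
    """Concepts of all superclasses of the class linked to `concept`."""
    class_of = {c: k for k, c in DEFINED_BY.items()}
    found = []
    cls = class_of.get(concept)
    # Assumes SUBCLASS_OF has no cycles; otherwise this loop would not end.
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        linked = DEFINED_BY.get(cls)
        if linked is not None:
            found.append(linked)
    return found


def documents_about(query_concept):
    """Documents whose subject is the query concept or one it subsumes."""
    hits = [
        doc
        for doc, subject in SUBJECT_OF.items()
        if subject == query_concept or query_concept in broader_concepts(subject)
    ]
    return sorted(hits)


publication_hits = documents_about("thes:Publication")
```

Under these assumptions a request for pages on Publications also returns the page on PhD Theses, while the thesaurus carries only the concepts and their links to classes.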


Patrizia Paggio

Senior Researcher            phone: +45 3532 9072
Center for Sprogteknologi    fax:   +45 3532 9089
Njalsgade 80                 email: patrizia@cst.dk
2300-DK CPH S                www.cst.dk/patrizia

LREC04 Workshop on Multimodal Corpora

LREC04 OntoLex 2004
Received on Tuesday, 23 March 2004 12:53:47 EST