WordNet Task Force - work outline

Dear all,

that's the brief I promised at the telecon 
concerning the WordNet task force (TF-WN? ideas 
for the acronym are well accepted!).

1) Members

Until now, the following WG members have given their availability for this TF:

Guus (on behalf of a colleague :))
Dan Brickley
Libby Miller
Chris Welty
Alberto Reggiori

Please email me if you want to contribute and 
how. As stated in the telecon, this TF ideally 
refers to external contacts and resources.


2) Contacts

A group of people/institutions should be 
contacted to get feedback, evaluation, 
suggestion, even requirements concerning porting 
WordNet to the Semantic Web. I have made a list 
of them and started contacting them. I have made 
three lists according to the kind of interaction 
we might establish with them:

a) Ports of WordNet to SW languages:

- Pure XML port - Bernard Bou, 
http://wnws.sourcefourge.net, a Java servlet 
delivering xml query output
- Pure RDF representation - Stefan Decker, mailto://stefan@isi.edu
- Pure OWL translation - Kilian Stoffel (knOWLer) 
http://taurus.unine.ch/GroupHome/knowler/
- WordNet Metamodel for a Wordnet Web Services - 
Michael Daconta, 
http://www.daconta.net/project_folder/WordnetMetamodel.html

b) Wordnet projects relevant to TF-WN:

- Original Princeton WordNet - Christiane 
Fellbaum, http://cogsci.princeton.edu.
- EuroWordnet - ELRA (European association that 
commercializes multilingual wordnets), 
http://www.elda.fr/catalogue/en/text/M0015.html.
- Global WordNet Association, Piek Vossen, 
http://www.globalwordnet.org/ (organization 
collecting info on wordnets from everywhere, and 
related projects).
- HyperDic Online - Eric Kafé, 
http://www.hyperdic.net/ (a lexicographic 
initiative that uses WordNet 1.7 as a thesaurus. 
A MiniDic for basic English is freely available).
- Miniwordnet - Massimiliano Ciaramita, 
http://www.cog.brown.edu/~massi/res.html (small 
subset of WordNet nouns constituting a common 
sense kernel).

c) WordNet-derived ontologies (this is a possible 
connection to the OPEN Task Force lead by Chris 
and Deborah):

- CoreLex - Paul Buitelaar, http://www.dfki.de/~paulb/corelex.html.
- SUMO-WN - Teknowledge, 
http://ontology.teknowledge.com/ (an alignment of 
WordNet to the SUMO upper ontology).
- Web-KB-2 Philippe Martin, 
http://meganesia.int.gu.edu.au/~phmartin/WebKB/doc/wn/ 
(a re-organization of Wordnet 1.7 nouns in order 
to conform to various logical and formal 
ontological principles, including DOLCE). An OWL 
version exists.
- OntoWordNet - Laboratory for Applied Ontology, 
http://www.loa-cnr.it, has realized: a datatype 
remapping, an alignment, and a top-level 
restructuring of WordNet 1.6 according to 
ontological engineering methods, using the DOLCE 
foundational ontology and its extensions (cf. the 
deliverable D18 of the WonderWeb Project, 
http://wonderweb.semanticweb.org), and is working 
on improving and enriching the resulting ontology 
(cf. the recent related publications in the LOA's 
web site) An OWL version exists (contact me).

d) SW projects using WordNet:

- Fishery Ontology Service - Johannes Keizer, 
Aldo Gangemi, http://www.fao.org/agris/aos. FOS 
has heavily employed OntoWordNet (see above) for 
aligning and merging fishery thesauri used in 
dedicated portals worldwide. The methodology 
adopted resulted successful also to port thesauri 
to the Semantic Web, and in fact this a 
collaboration point with the Task Force on 
transforming thesauri to ontologies coordinated 
by Dan Brickley.


3) Guidelines

This is a preliminary set of guideline proposals 
that I have found important to direct the 
research on transforming WordNet into an ontology 
usable for the Semantic Web. Of course, I don't 
claim that Wordnet won't be usable until *all* 
these guidelines are satisfied by ongoing 
research. On the other hand, as a responsible for 
best practices in this area, I feel obliged to 
stress the many issues at hand when using 
wordnets as SW ontologies.

1) Datatype remapping of WordNet, in order to 
arrive at a preliminary namespace  (what are the 
classes, the individuals, the properties, and how 
many levels are currently collapsed in wordnets?).
2) Clarification of lexical relations, in order 
to refine the WN namespace, especially to 
distinguish between concept-level properties and 
annotation-level properties (cf. Bernard Vatant's 
recent problem on linguistic annotations in 
OWL-DL).
3) How to refine wordnets' hierarchies? what are 
the criteria for establishing the correct detail 
required by a generic vs. specialistic 
applications of wordnets?
4) Alignment to foundational ontologies: pros and cons, limitations, etc.
5) Relation to frame-based wordnets (FrameNet, 
VerbNet): i.e. how to enrich the verbal part of 
WordNet with frame-based structures derived from 
linguistical research?
6) Axiomatization or expansion of glosses: how to 
derive associations and/or axioms from free text 
that can be related to wordnets' elements through 
machine learning and classification techniques?
7) Wordnets' modularization: how to organize the 
global wordnet namespace into domains of interest 
(directory-like subject trees) or in formal 
theories (ontology modules). What semantics 
should be used?
8) Enrichment to domains, possibly in an 
open-source programme: wordnets usually feature a 
poor support for specific domains, being general 
purpose structures. How to expand them through 
open-source initiatives that can also leverage on 
the previous guidelines?
9) How to identify a "basic" level for achieving 
a strong, common sense kernel that can be held as 
both quite general and at an everyday level?


4) Outline for the preliminary report on TF-WN after three months

Members' and contacts' advice will be used to 
prepare a preliminary report on best practices to 
migrate wordnets to ontologies, on common 
problems found, and a list of available 
resources, possibly with a comparison, will be 
prepared.
Possibly, the WG could consider organizing a 
workshop and/or a portal to foster the practices 
and resources retrieved during the task force 
activity.

Cheers
Aldo

-- 



*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*
Aldo Gangemi
Research Scientist
Laboratory for Applied Ontology, ISTC-CNR
Institute of Cognitive Sciences and Technologies
(Laboratorio di Ontologia Applicata,
Istituto di Scienze e Tecnologie della Cognizione,
Consiglio Nazionale delle Ricerche)
Viale Marx 15, 00137
Roma Italy
+3906.86090249
+3906.824737 (fax)
mailto://a.gangemi@istc.cnr.it
mailto://gangemi@acm.org
http://www.loa-cnr.it

Received on Tuesday, 23 March 2004 15:38:24 UTC