- From: Aldo Gangemi <a.gangemi@istc.cnr.it>
- Date: Tue, 23 Mar 2004 21:22:53 +0100
- To: public-swbp-wg@w3.org
Dear all, that's the brief I promised at the telecon concerning the WordNet task force (TF-WN? ideas for the acronym are well accepted!). 1) Members Until now, the following WG members have given their availability for this TF: Guus (on behalf of a colleague :)) Dan Brickley Libby Miller Chris Welty Alberto Reggiori Please email me if you want to contribute and how. As stated in the telecon, this TF ideally refers to external contacts and resources. 2) Contacts A group of people/institutions should be contacted to get feedback, evaluation, suggestion, even requirements concerning porting WordNet to the Semantic Web. I have made a list of them and started contacting them. I have made three lists according to the kind of interaction we might establish with them: a) Ports of WordNet to SW languages: - Pure XML port - Bernard Bou, http://wnws.sourcefourge.net, a Java servlet delivering xml query output - Pure RDF representation - Stefan Decker, mailto://stefan@isi.edu - Pure OWL translation - Kilian Stoffel (knOWLer) http://taurus.unine.ch/GroupHome/knowler/ - WordNet Metamodel for a Wordnet Web Services - Michael Daconta, http://www.daconta.net/project_folder/WordnetMetamodel.html b) Wordnet projects relevant to TF-WN: - Original Princeton WordNet - Christiane Fellbaum, http://cogsci.princeton.edu. - EuroWordnet - ELRA (European association that commercializes multilingual wordnets), http://www.elda.fr/catalogue/en/text/M0015.html. - Global WordNet Association, Piek Vossen, http://www.globalwordnet.org/ (organization collecting info on wordnets from everywhere, and related projects). - HyperDic Online - Eric Kafé, http://www.hyperdic.net/ (a lexicographic initiative that uses WordNet 1.7 as a thesaurus. A MiniDic for basic English is freely available). - Miniwordnet - Massimiliano Ciaramita, http://www.cog.brown.edu/~massi/res.html (small subset of WordNet nouns constituting a common sense kernel). c) WordNet-derived ontologies (this is a possible connection to the OPEN Task Force lead by Chris and Deborah): - CoreLex - Paul Buitelaar, http://www.dfki.de/~paulb/corelex.html. - SUMO-WN - Teknowledge, http://ontology.teknowledge.com/ (an alignment of WordNet to the SUMO upper ontology). - Web-KB-2 Philippe Martin, http://meganesia.int.gu.edu.au/~phmartin/WebKB/doc/wn/ (a re-organization of Wordnet 1.7 nouns in order to conform to various logical and formal ontological principles, including DOLCE). An OWL version exists. - OntoWordNet - Laboratory for Applied Ontology, http://www.loa-cnr.it, has realized: a datatype remapping, an alignment, and a top-level restructuring of WordNet 1.6 according to ontological engineering methods, using the DOLCE foundational ontology and its extensions (cf. the deliverable D18 of the WonderWeb Project, http://wonderweb.semanticweb.org), and is working on improving and enriching the resulting ontology (cf. the recent related publications in the LOA's web site) An OWL version exists (contact me). d) SW projects using WordNet: - Fishery Ontology Service - Johannes Keizer, Aldo Gangemi, http://www.fao.org/agris/aos. FOS has heavily employed OntoWordNet (see above) for aligning and merging fishery thesauri used in dedicated portals worldwide. The methodology adopted resulted successful also to port thesauri to the Semantic Web, and in fact this a collaboration point with the Task Force on transforming thesauri to ontologies coordinated by Dan Brickley. 3) Guidelines This is a preliminary set of guideline proposals that I have found important to direct the research on transforming WordNet into an ontology usable for the Semantic Web. Of course, I don't claim that Wordnet won't be usable until *all* these guidelines are satisfied by ongoing research. On the other hand, as a responsible for best practices in this area, I feel obliged to stress the many issues at hand when using wordnets as SW ontologies. 1) Datatype remapping of WordNet, in order to arrive at a preliminary namespace (what are the classes, the individuals, the properties, and how many levels are currently collapsed in wordnets?). 2) Clarification of lexical relations, in order to refine the WN namespace, especially to distinguish between concept-level properties and annotation-level properties (cf. Bernard Vatant's recent problem on linguistic annotations in OWL-DL). 3) How to refine wordnets' hierarchies? what are the criteria for establishing the correct detail required by a generic vs. specialistic applications of wordnets? 4) Alignment to foundational ontologies: pros and cons, limitations, etc. 5) Relation to frame-based wordnets (FrameNet, VerbNet): i.e. how to enrich the verbal part of WordNet with frame-based structures derived from linguistical research? 6) Axiomatization or expansion of glosses: how to derive associations and/or axioms from free text that can be related to wordnets' elements through machine learning and classification techniques? 7) Wordnets' modularization: how to organize the global wordnet namespace into domains of interest (directory-like subject trees) or in formal theories (ontology modules). What semantics should be used? 8) Enrichment to domains, possibly in an open-source programme: wordnets usually feature a poor support for specific domains, being general purpose structures. How to expand them through open-source initiatives that can also leverage on the previous guidelines? 9) How to identify a "basic" level for achieving a strong, common sense kernel that can be held as both quite general and at an everyday level? 4) Outline for the preliminary report on TF-WN after three months Members' and contacts' advice will be used to prepare a preliminary report on best practices to migrate wordnets to ontologies, on common problems found, and a list of available resources, possibly with a comparison, will be prepared. Possibly, the WG could consider organizing a workshop and/or a portal to foster the practices and resources retrieved during the task force activity. Cheers Aldo -- *;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;* Aldo Gangemi Research Scientist Laboratory for Applied Ontology, ISTC-CNR Institute of Cognitive Sciences and Technologies (Laboratorio di Ontologia Applicata, Istituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche) Viale Marx 15, 00137 Roma Italy +3906.86090249 +3906.824737 (fax) mailto://a.gangemi@istc.cnr.it mailto://gangemi@acm.org http://www.loa-cnr.it
Received on Tuesday, 23 March 2004 15:38:24 UTC