[WNET] Fwd: Re: W3C Task Force on Porting Wordnet to the Semantic Web from Aldo Gangemi on 2004-04-30 (public-swbp-wg@w3.org from April 2004)

From: Aldo Gangemi <a.gangemi@istc.cnr.it>
Date: Sat, 1 May 2004 00:05:51 +0100
To: public-swbp-wg@w3.org
Cc: phmartin@meganesia.sci.griffith.edu.au
Message-Id: <p06002002bcb88ab2e847@[212.34.219.175]>
As promised at the last telecon, I am forwarding the message I have 
received from Philippe Martin (one of the WordNet experts the I have 
contacted on behalf of the WordNet Task Force), who comments on the 
issues raised so far.

Aldo

>Date: Mon, 19 Apr 2004 14:34:34 +1000
>From: Philippe Martin <phmartin@meganesia.sci.griffith.edu.au>
>Subject: Re: W3C Task Force on Porting Wordnet to the Semantic Web
>To: Aldo Gangemi <a.gangemi@istc.cnr.it>
>Cc: stefan@isi.edu, Kilian.Stoffel@unine.ch, mike@daconta.net,
>  Piek.Vossen@irion.nl, massi@cog.brown.edu, paulb@dfki.de,
>  phmartin@meganesia.sci.griffith.edu.au, moldovan@utdallas.edu,
>  velardi@dsi.uniroma1.it, jc@hplb.hpl.hp.com, schreiber@cs.vu.nl,
>  danbri@w3.org, Libby.Miller@bristol.ac.uk, brian.mcbride@hp.com,
>  danny666@virgilio.it, welty@us.ibm.com, fellbaum@clarity.princeton.edu,
>  oltramari@loa-cnr.it, phmartin@meganesia.int.gu.edu.au
>
>
>Aldo,
>

<snip>

>I have some comments on some questions/notes from
>http://lists.w3.org/Archives/Public/public-swbp-wg/2004JanMar/thread.html#114
>I tried to mail these comments to the public-swbp-wg@w3.org list on
>Friday but, although the mail server made me fill a form and then claimed to
>have distributed my message, this message does not appear in the archive
>(http://lists.w3.org/Archives/Public/public-swbp-wg/2004Apr/index.html)
>and hence I assume it has not been distributed.
>I include my message below for you to forward if its content is adequate.
>

<snip>

>
>==========================================================================
>
>
>>  2) Clarification of lexical relations, ...
>>  3) How to refine wordnets' hierarchies? ...
>>  an OWL ontology to talk about the main wordnet relationships
>
>Here are some kinds of "links" (relations between categories,
>i.e. between types or indivisuals) that I had to introduce when
>converting the noun-related part of WordNet 1.7 into a genuine
>lexical ontology usable for knowledge-based management purposes
>(details at http://www.webkb.org/doc/wn/):
>- to replace the hyponym links: subtypeOf and instanceOf;
>- to replace the antonym links: simpleExclusionOf and
>   closedExclusionOf (like simpleExclusionOf but in addition,
>   the linked categories form a "closed subtype partition" for
>   another type, e.g. wn#chromatic_color and wn#achromatic_color
>   form a closed subtype partition for wn#color);
>   the complementOf and inverseOf links are not currently useful
>   within WordNet but are common (e.g. they exist in OWL);
>- to replace certain erroneous uses of the hyponym link: the
>   locationOf link (e.g. wn#Dunkerque is said to be hyponym of both
>   wn#city and wn#amphibious_assault; this violates some exclusionOf
>   link(s) in my top-level ontology; I have set wn#Dunkerque as
>   instanceOf wn#city and locationOf wn#amphibious_assault).
>
>Other kinds of links are useful to extend WordNet:
>instrumentOf, inputOf, objectOf, agentOf, resultOf, etc.
>
>
>
>
>>  > > But also look at the file at http://xmlns.com/wordnet/1.6/City:
>>  > > City is a class introduced with all its taxonomic branch (poor
>>  > > practice: if each class is introduced with all its superclasses,
>>  > > the ontology results unnecessary long) then all hyponyms of "City"
>>  > > are introduced, ...
>>  > If all you want to know about is City then the City download is a
>>  > good one, if you want to know about more than that, maybe you need
>>  > the full download (wherever that is). ...
>>  Got the point, it is an efficiency issue. Is there any tool to
>>  convert an OWL (or RDF) ontology into a set of "views files", possibly
>>  based on customizable properties (e.g., "give me only a superclass
>>  specification view", or "give me subclasses and related classes
>>  specification view"?
>
>>  I think this is a very promising area, and could perhaps be
>>  generalized into a system that was given the URI of a word or
>>  document or a Literal (i.e. a node or small graph) and returned
>>  a graph containing related nodes.
>  > (Jena + Lucene might provide a lot of of what's required)
>
>Indeed, storing a "view file" for each WordNet category and each
>exploration/presentation option of its relation to other categories
>is not practical. A knowledge server is needed. To illustrate some
>options, I'll use my own system (WebKB-2) as an example.
>
>http://www.webkb.org/bin/categSearch.cgi?categ=%23city&format=RDF
>
>Value for the "categ" parameter: either a category identifier such as
>"#city" (GET encoding: %23city;  "wn#city" and "#city" are equivalent
>in WebKB-2: WordNet is the default source/creator) or a word such as
>"city" which may have more than one recorded meaning (in which case,
>more than one "category and associated relations" is presented).
>Since generating short and intuitive identifiers for the WordNet
>categories is the first (and most important) task for its re-use for
>ontology-based applications, I am a bit surprised that this issue was
>not brought up in the discussion at
>http://lists.w3.org/Archives/Public/public-swbp-wg/2004JanMar/thread.html#114
>
>Value for the "format" parameter: there should be more than one
>possible format because (i) RDF/XML is inadequate for vizualizing or
>navigating a large semantic network, (ii) RDF/XML+OWL is inadequate
>for many other purposes (natural language representation, interlingua,
>development and representation of reasonably complex ontologies, etc.).
>WebKB-2 currently proposes four other formats for presenting links
>between categories and navigating them. For example, try:
>http://www.webkb.org/bin/categSearch.cgi?categ=city&recursLink=%3C&hyperlinks
>("%3C" is the GET encoding for '<', i.e. the subtypeOf link; thus,
>"recursLink=%3C" asks for the display of the whole subtypeOf hierarchy
>unless other presentation constraints limit that exploration, e.g. an
>exploration depth). See the "category search" interface for more
>options: http://www.webkb.org/interface/categSearch.html
>
>
>
>>  For example, anyone creating an OWL or RDFS class might wish to
>>  annotate it with its intended meaning using *the* URI for a specific
>>  sense of an  English word, as classified by wordnet. The main
>>  requirement from this use case is agreement over what that URI is,
>>  including the beginning bit (the namespace) and the end bit (the
>>  mapping from Wordnet's representation of senses) ...
>
>There will be more than one possible URI for a category even if only one
>knowledge server is exploited:
>- if URLs such as http://www.webkb.org/bin/categSearch.cgi?categ=%23city
>   are used, many options may be added (hence leading to different URLs);
>   if artificial URIs such as http://www.webkb.org/wn#city are used (this
>   URI does not refer to any existing document), the problem for
>   knowledge processing tools is then to find the URL of the knowledge
>   server to get more information;
>- a category may have many identifiers, e.g. in WebKB-2 the following
>   ones are equivalent: #city, wn#city, #city__metropolis__urban_center;
>- a category may be linked to other categories by equivalentTo/equalTo
>   links, e.g. wn#city = pm#city = some_french_ontology#ville; since
>   these categories are said to be equivalent/identical, asking for links
>   connected to one of them should lead to the same result as asking for
>   links connected to any other one of them. This is not a problem
>   within a knowledge server (e.g. WebKB-2) but if more than one
>   knowledge server is used, there must be some regular
>   mirroring/replication/propagation processes between the servers.
>
>
>>  7) Wordnets' modularization: how to organize the global wordnet
>>  namespace into domains of interest (directory-like subject trees) or
>>  in formal theories (ontology modules). What semantics should be used?
>>  8) Enrichment to domains, possibly in an open-source programme:
>>  wordnets usually feature a poor support for specific domains, being
>>  general purpose structures. How to expand them through open-source ...
>
>Indeed, most knowledge-based uses of WordNet will lead to additions (or
>sometimes corrections) and these extensions should be sharable.
>
>Independently developed extensions stored in static Web documents are
>difficult to merge (manually, and even more automatically) in a
>semantically/logically/ontologically correct way and hence
>hard to re-use for genuine knowledge based management purposes.
>
>In my opinion, a more scalable solution (for knowledge based management
>purposes and "Semantic Web" purposes) is the use of knowledge servers
>where each server
>- associates each user's addition (e.g. a category, a link between
>   categories, a statement) with its source/creator (ontology/user),
>   removes or warns about introduced redundancies, and
>   rejects the addition if it introduces an inconsistency (with the
>   protocols used in WebKB-2, this does not prevent a user to express
>   her own beliefs but this lets her know about a conflict with other
>   users' statements and leads her to be more precise/explicit to
>   eliminate the problem; no discussion or agreement between the users
>   is necessary);
>- periodically checks other servers (similar general servers, or servers
>   specialized in the same domains) to complement its knowledge base
>   (thus, it does not matter much which servers people use, and this
>    is how the centralized and distributed approaches combine their
>    advantages; of course, integrating knowledge from other servers
>    may not be obvious but it is *much* easier than trying to
>    re-use/integrate dozens/hundreds/thousands of poorly inter-connected
>    extensions).
>More details on this are accessible from
>http://www.webkb.org/doc/papers/wi02/
>
>I do not think that people should have to explore a hierarchy of
>ontologies (modules) or "domains of interest" to decide which
>modules to search, re-use, extend or merge, especially since
>these modules may be conflicting, redundant and complementary.
>Imagine someone wanting to search or add some knowledge about certain
>kinds of "neurons", or "cats" or "feet". There are (too) many domains
>this knowledge can be categorized into and (too) many tasks it can be
>useful for. People should be able to use a general knowledge server to
>find the different meanings of (categories for) "neurons", "cats" or
>"feet", and explore their specializations (or other related categories)
>until they have found the right category (or the closest).
>Ideally, the knowledge statements (or the "data" for a database related
>analogy) associated to a category should also be organized via
>categories, e.g. a category #Siamese_cat might be linked to a
>category pm#health_issues_of_siamese_cats.
>If the URL of a specialized knowledge server is associated to a
>sufficiently adequate category (e.g. via a link "serverFor"), the user
>should continue its exploration on this specialized server (if there are
>several possible specialized servers, ideally they mirror each other to
>share the same knowledge about this category, i.e. the same collection
>of statements related to that category). When the creators of a
>specialized server register to have their server associated to a
>category in a general server, they commit to try to collect and
>structure as much knowledge as possible about that category (e.g.
>about pm#health_issues_of_siamese_cats, or more ambitiously, about
>#Siamese_cat; the server about #Siamese_cat may of course refer to the
>server about pm#health_issues_of_siamese_cats). (In a commercial
>environment, it is in the interest of the creators of a server to do
>some checking and refer only to the most comprehensive of more
>specialized servers, hence easing users' information retrieval).
>In addition to permit search by navigation, a server may also permit
>querying and hence may exploit more specialized servers for that.
>
>I should also note that since in a knowledge server the source/creator
>of each piece of knowledge is stored, knowledge modules can be
>re-generated. Actually, arbitrary complex queries on the knowledge
>(its content plus its creators and their characteristics) can be used
>to generate knowledge modules. However, it is easier and better to make
>additions to the knowledge base of a knowledge server by creating a
>static module (a Web document), submitting it to the server and refining
>it until its content is stable and accepted by the server (and commited
>into its knowledge base) than trying to do incremental on-line additions
>and removals (a loose analogy of that would be trying to develop a large
>shell script via the shell command line, i.e. without text editor).
>Furthermore, the links to those static modules may serve various
>purposes: documentation of the knowledge, regeneration of the knowledge
>base in case it gets corrupted, etc. Thus, modules (distributed static
>files) are useful but their content should be developped and exploited
>via knowledge servers (i.e. some means of centralization) to permit
>knowledge sharing/re-use.
>
>
>Philippe
>
>______________________________________________________________________
>Dr. Philippe Martin
>Address: Griffith Uni, School of I.T., PMB 50 GCMC, QLD 9726 Australia
>Email: phmartin@gu.edu.au;  Fax: +61 7 5552 8066
>______________________________________________________________________


-- 



*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*
Aldo Gangemi
Research Scientist
Laboratory for Applied Ontology, ISTC-CNR
Institute of Cognitive Sciences and Technologies
(Laboratorio di Ontologia Applicata,
Istituto di Scienze e Tecnologie della Cognizione,
Consiglio Nazionale delle Ricerche)
Viale Marx 15, 00137
Roma Italy
+3906.86090249
+3906.824737 (fax)
mailto://a.gangemi@istc.cnr.it
mailto://gangemi@acm.org
http://www.loa-cnr.it
Received on Friday, 30 April 2004 18:33:54 UTC