- From: Aldo Gangemi <a.gangemi@istc.cnr.it>
- Date: Sat, 1 May 2004 00:05:51 +0100
- To: public-swbp-wg@w3.org
- Cc: phmartin@meganesia.sci.griffith.edu.au
As promised at the last telecon, I am forwarding the message I have received from Philippe Martin (one of the WordNet experts the I have contacted on behalf of the WordNet Task Force), who comments on the issues raised so far. Aldo >Date: Mon, 19 Apr 2004 14:34:34 +1000 >From: Philippe Martin <phmartin@meganesia.sci.griffith.edu.au> >Subject: Re: W3C Task Force on Porting Wordnet to the Semantic Web >To: Aldo Gangemi <a.gangemi@istc.cnr.it> >Cc: stefan@isi.edu, Kilian.Stoffel@unine.ch, mike@daconta.net, > Piek.Vossen@irion.nl, massi@cog.brown.edu, paulb@dfki.de, > phmartin@meganesia.sci.griffith.edu.au, moldovan@utdallas.edu, > velardi@dsi.uniroma1.it, jc@hplb.hpl.hp.com, schreiber@cs.vu.nl, > danbri@w3.org, Libby.Miller@bristol.ac.uk, brian.mcbride@hp.com, > danny666@virgilio.it, welty@us.ibm.com, fellbaum@clarity.princeton.edu, > oltramari@loa-cnr.it, phmartin@meganesia.int.gu.edu.au > > >Aldo, > <snip> >I have some comments on some questions/notes from >http://lists.w3.org/Archives/Public/public-swbp-wg/2004JanMar/thread.html#114 >I tried to mail these comments to the public-swbp-wg@w3.org list on >Friday but, although the mail server made me fill a form and then claimed to >have distributed my message, this message does not appear in the archive >(http://lists.w3.org/Archives/Public/public-swbp-wg/2004Apr/index.html) >and hence I assume it has not been distributed. >I include my message below for you to forward if its content is adequate. > <snip> > >========================================================================== > > >> 2) Clarification of lexical relations, ... >> 3) How to refine wordnets' hierarchies? ... >> an OWL ontology to talk about the main wordnet relationships > >Here are some kinds of "links" (relations between categories, >i.e. between types or indivisuals) that I had to introduce when >converting the noun-related part of WordNet 1.7 into a genuine >lexical ontology usable for knowledge-based management purposes >(details at http://www.webkb.org/doc/wn/): >- to replace the hyponym links: subtypeOf and instanceOf; >- to replace the antonym links: simpleExclusionOf and > closedExclusionOf (like simpleExclusionOf but in addition, > the linked categories form a "closed subtype partition" for > another type, e.g. wn#chromatic_color and wn#achromatic_color > form a closed subtype partition for wn#color); > the complementOf and inverseOf links are not currently useful > within WordNet but are common (e.g. they exist in OWL); >- to replace certain erroneous uses of the hyponym link: the > locationOf link (e.g. wn#Dunkerque is said to be hyponym of both > wn#city and wn#amphibious_assault; this violates some exclusionOf > link(s) in my top-level ontology; I have set wn#Dunkerque as > instanceOf wn#city and locationOf wn#amphibious_assault). > >Other kinds of links are useful to extend WordNet: >instrumentOf, inputOf, objectOf, agentOf, resultOf, etc. > > > > >> > > But also look at the file at http://xmlns.com/wordnet/1.6/City: >> > > City is a class introduced with all its taxonomic branch (poor >> > > practice: if each class is introduced with all its superclasses, >> > > the ontology results unnecessary long) then all hyponyms of "City" >> > > are introduced, ... >> > If all you want to know about is City then the City download is a >> > good one, if you want to know about more than that, maybe you need >> > the full download (wherever that is). ... >> Got the point, it is an efficiency issue. Is there any tool to >> convert an OWL (or RDF) ontology into a set of "views files", possibly >> based on customizable properties (e.g., "give me only a superclass >> specification view", or "give me subclasses and related classes >> specification view"? > >> I think this is a very promising area, and could perhaps be >> generalized into a system that was given the URI of a word or >> document or a Literal (i.e. a node or small graph) and returned >> a graph containing related nodes. > > (Jena + Lucene might provide a lot of of what's required) > >Indeed, storing a "view file" for each WordNet category and each >exploration/presentation option of its relation to other categories >is not practical. A knowledge server is needed. To illustrate some >options, I'll use my own system (WebKB-2) as an example. > >http://www.webkb.org/bin/categSearch.cgi?categ=%23city&format=RDF > >Value for the "categ" parameter: either a category identifier such as >"#city" (GET encoding: %23city; "wn#city" and "#city" are equivalent >in WebKB-2: WordNet is the default source/creator) or a word such as >"city" which may have more than one recorded meaning (in which case, >more than one "category and associated relations" is presented). >Since generating short and intuitive identifiers for the WordNet >categories is the first (and most important) task for its re-use for >ontology-based applications, I am a bit surprised that this issue was >not brought up in the discussion at >http://lists.w3.org/Archives/Public/public-swbp-wg/2004JanMar/thread.html#114 > >Value for the "format" parameter: there should be more than one >possible format because (i) RDF/XML is inadequate for vizualizing or >navigating a large semantic network, (ii) RDF/XML+OWL is inadequate >for many other purposes (natural language representation, interlingua, >development and representation of reasonably complex ontologies, etc.). >WebKB-2 currently proposes four other formats for presenting links >between categories and navigating them. For example, try: >http://www.webkb.org/bin/categSearch.cgi?categ=city&recursLink=%3C&hyperlinks >("%3C" is the GET encoding for '<', i.e. the subtypeOf link; thus, >"recursLink=%3C" asks for the display of the whole subtypeOf hierarchy >unless other presentation constraints limit that exploration, e.g. an >exploration depth). See the "category search" interface for more >options: http://www.webkb.org/interface/categSearch.html > > > >> For example, anyone creating an OWL or RDFS class might wish to >> annotate it with its intended meaning using *the* URI for a specific >> sense of an English word, as classified by wordnet. The main >> requirement from this use case is agreement over what that URI is, >> including the beginning bit (the namespace) and the end bit (the >> mapping from Wordnet's representation of senses) ... > >There will be more than one possible URI for a category even if only one >knowledge server is exploited: >- if URLs such as http://www.webkb.org/bin/categSearch.cgi?categ=%23city > are used, many options may be added (hence leading to different URLs); > if artificial URIs such as http://www.webkb.org/wn#city are used (this > URI does not refer to any existing document), the problem for > knowledge processing tools is then to find the URL of the knowledge > server to get more information; >- a category may have many identifiers, e.g. in WebKB-2 the following > ones are equivalent: #city, wn#city, #city__metropolis__urban_center; >- a category may be linked to other categories by equivalentTo/equalTo > links, e.g. wn#city = pm#city = some_french_ontology#ville; since > these categories are said to be equivalent/identical, asking for links > connected to one of them should lead to the same result as asking for > links connected to any other one of them. This is not a problem > within a knowledge server (e.g. WebKB-2) but if more than one > knowledge server is used, there must be some regular > mirroring/replication/propagation processes between the servers. > > >> 7) Wordnets' modularization: how to organize the global wordnet >> namespace into domains of interest (directory-like subject trees) or >> in formal theories (ontology modules). What semantics should be used? >> 8) Enrichment to domains, possibly in an open-source programme: >> wordnets usually feature a poor support for specific domains, being >> general purpose structures. How to expand them through open-source ... > >Indeed, most knowledge-based uses of WordNet will lead to additions (or >sometimes corrections) and these extensions should be sharable. > >Independently developed extensions stored in static Web documents are >difficult to merge (manually, and even more automatically) in a >semantically/logically/ontologically correct way and hence >hard to re-use for genuine knowledge based management purposes. > >In my opinion, a more scalable solution (for knowledge based management >purposes and "Semantic Web" purposes) is the use of knowledge servers >where each server >- associates each user's addition (e.g. a category, a link between > categories, a statement) with its source/creator (ontology/user), > removes or warns about introduced redundancies, and > rejects the addition if it introduces an inconsistency (with the > protocols used in WebKB-2, this does not prevent a user to express > her own beliefs but this lets her know about a conflict with other > users' statements and leads her to be more precise/explicit to > eliminate the problem; no discussion or agreement between the users > is necessary); >- periodically checks other servers (similar general servers, or servers > specialized in the same domains) to complement its knowledge base > (thus, it does not matter much which servers people use, and this > is how the centralized and distributed approaches combine their > advantages; of course, integrating knowledge from other servers > may not be obvious but it is *much* easier than trying to > re-use/integrate dozens/hundreds/thousands of poorly inter-connected > extensions). >More details on this are accessible from >http://www.webkb.org/doc/papers/wi02/ > >I do not think that people should have to explore a hierarchy of >ontologies (modules) or "domains of interest" to decide which >modules to search, re-use, extend or merge, especially since >these modules may be conflicting, redundant and complementary. >Imagine someone wanting to search or add some knowledge about certain >kinds of "neurons", or "cats" or "feet". There are (too) many domains >this knowledge can be categorized into and (too) many tasks it can be >useful for. People should be able to use a general knowledge server to >find the different meanings of (categories for) "neurons", "cats" or >"feet", and explore their specializations (or other related categories) >until they have found the right category (or the closest). >Ideally, the knowledge statements (or the "data" for a database related >analogy) associated to a category should also be organized via >categories, e.g. a category #Siamese_cat might be linked to a >category pm#health_issues_of_siamese_cats. >If the URL of a specialized knowledge server is associated to a >sufficiently adequate category (e.g. via a link "serverFor"), the user >should continue its exploration on this specialized server (if there are >several possible specialized servers, ideally they mirror each other to >share the same knowledge about this category, i.e. the same collection >of statements related to that category). When the creators of a >specialized server register to have their server associated to a >category in a general server, they commit to try to collect and >structure as much knowledge as possible about that category (e.g. >about pm#health_issues_of_siamese_cats, or more ambitiously, about >#Siamese_cat; the server about #Siamese_cat may of course refer to the >server about pm#health_issues_of_siamese_cats). (In a commercial >environment, it is in the interest of the creators of a server to do >some checking and refer only to the most comprehensive of more >specialized servers, hence easing users' information retrieval). >In addition to permit search by navigation, a server may also permit >querying and hence may exploit more specialized servers for that. > >I should also note that since in a knowledge server the source/creator >of each piece of knowledge is stored, knowledge modules can be >re-generated. Actually, arbitrary complex queries on the knowledge >(its content plus its creators and their characteristics) can be used >to generate knowledge modules. However, it is easier and better to make >additions to the knowledge base of a knowledge server by creating a >static module (a Web document), submitting it to the server and refining >it until its content is stable and accepted by the server (and commited >into its knowledge base) than trying to do incremental on-line additions >and removals (a loose analogy of that would be trying to develop a large >shell script via the shell command line, i.e. without text editor). >Furthermore, the links to those static modules may serve various >purposes: documentation of the knowledge, regeneration of the knowledge >base in case it gets corrupted, etc. Thus, modules (distributed static >files) are useful but their content should be developped and exploited >via knowledge servers (i.e. some means of centralization) to permit >knowledge sharing/re-use. > > >Philippe > >______________________________________________________________________ >Dr. Philippe Martin >Address: Griffith Uni, School of I.T., PMB 50 GCMC, QLD 9726 Australia >Email: phmartin@gu.edu.au; Fax: +61 7 5552 8066 >______________________________________________________________________ -- *;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;*;* Aldo Gangemi Research Scientist Laboratory for Applied Ontology, ISTC-CNR Institute of Cognitive Sciences and Technologies (Laboratorio di Ontologia Applicata, Istituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche) Viale Marx 15, 00137 Roma Italy +3906.86090249 +3906.824737 (fax) mailto://a.gangemi@istc.cnr.it mailto://gangemi@acm.org http://www.loa-cnr.it
Received on Friday, 30 April 2004 18:33:54 UTC