- From: Philippe Martin <phmartin@meganesia.sci.griffith.edu.au>
- Date: Fri, 16 Apr 2004 09:34:46 -0400 (EDT)
- To: public-swbp-wg@w3.org
- Cc: phmartin@meganesia.int.gu.edu.au
Aldo, Jeremy,

> 2) Clarification of lexical relations, ...
> 3) How to refine wordnets' hierarchies? ...
> an OWL ontology to talk about the main wordnet relationships

Here are some kinds of "links" (relations between categories, i.e.
between types or individuals) that I had to introduce when converting
the noun-related part of WordNet 1.7 into a genuine lexical ontology
usable for knowledge-based management purposes
(details at http://www.webkb.org/doc/wn/):
- to replace the hyponym links: subtypeOf and instanceOf;
- to replace the antonym links: simpleExclusionOf and closedExclusionOf
  (like simpleExclusionOf but, in addition, the linked categories form
  a "closed subtype partition" for another type, e.g.
  wn#chromatic_color and wn#achromatic_color form a closed subtype
  partition for wn#color); the complementOf and inverseOf links are
  not currently useful within WordNet but are common (e.g. they exist
  in OWL);
- to replace certain erroneous uses of the hyponym link: the locationOf
  link (e.g. wn#Dunkerque is said to be a hyponym of both wn#city and
  wn#amphibious_assault; this violates some exclusionOf link(s) in my
  top-level ontology; I have set wn#Dunkerque as instanceOf wn#city
  and locationOf wn#amphibious_assault).
Other kinds of links are useful to extend WordNet: instrumentOf,
inputOf, objectOf, agentOf, resultOf, etc.

> > > But also look at the file at http://xmlns.com/wordnet/1.6/City:
> > > City is a class introduced with all its taxonomic branch (poor
> > > practice: if each class is introduced with all its superclasses,
> > > the ontology results unnecessary long) then all hyponyms of "City"
> > > are introduced, ...
> > If all you want to know about is City then the City download is a
> > good one, if you want to know about more than that, maybe you need
> > the full download (wherever that is). ...
> Got the point, it is an efficiency issue.
> Is there any tool to
> convert an OWL (or RDF) ontology into a set of "views files", possibly
> based on customizable properties (e.g., "give me only a superclass
> specification view", or "give me subclasses and related classes
> specification view"?
> I think this is a very promising area, and could perhaps be
> generalized into a system that was given the URI of a word or
> document or a Literal (i.e. a node or small graph) and returned
> a graph containing related nodes.
> (Jena + Lucene might provide a lot of what's required)

Indeed, storing a "view file" for each WordNet category and each
exploration/presentation option of its relation to other categories
is not practical. A knowledge server is needed.
To illustrate some options, I'll use my own system (WebKB-2) as an
example.
  http://www.webkb.org/bin/categSearch.cgi?categ=%23city&format=RDF

Value for the "categ" parameter: either a category identifier such as
"#city" (GET encoding: %23city; "wn#city" and "#city" are equivalent in
WebKB-2: WordNet is the default source/creator) or a word such as
"city" which may have more than one recorded meaning (in which case,
each matching category is presented with its associated relations).

Value for the "format" parameter: there should be more than one
possible format because (i) RDF/XML is inadequate for visualizing or
navigating a large semantic network, (ii) RDF/XML+OWL is inadequate
for many other purposes (natural language representation, interlingua,
development and representation of reasonably complex ontologies, etc.).
WebKB-2 currently offers four other formats for presenting links
between categories and navigating them. For example, try:
  http://www.webkb.org/bin/categSearch.cgi?categ=city&recursLink=%3C&hyperlinks
("%3C" is the GET encoding for '<', i.e. the subtypeOf link; thus,
"recursLink=%3C" asks for the display of the whole subtypeOf hierarchy,
unless other presentation constraints limit that exploration, e.g.
an exploration depth).
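For readers who want to build such URLs programmatically, here is a
minimal sketch of the GET encoding described above. It is not part of
WebKB-2: the helper name categ_search_url and its "flags" argument are
hypothetical, and only the parameters mentioned in this message
(categ, format, recursLink, hyperlinks) are assumed to exist.

```python
from urllib.parse import quote, urlencode

# Base URL of the category-search CGI script, as given in this message.
BASE = "http://www.webkb.org/bin/categSearch.cgi"


def categ_search_url(categ, flags=(), **options):
    """Build a category-search URL (hypothetical helper).

    categ   -- a category identifier ("#city") or a plain word ("city")
    flags   -- value-less parameters such as "hyperlinks"
    options -- name=value parameters such as format="RDF"
    """
    # urlencode percent-encodes reserved characters: '#' -> %23, '<' -> %3C
    query = urlencode({"categ": categ, **options})
    for flag in flags:
        query += "&" + quote(flag)
    return BASE + "?" + query


rdf_url = categ_search_url("#city", format="RDF")
hierarchy_url = categ_search_url("city", flags=["hyperlinks"], recursLink="<")
print(rdf_url)
print(hierarchy_url)
```

The two printed URLs are meant to reproduce the two example URLs above;
whether the server accepts other parameter orders is not something this
sketch checks.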
See the "category search" interface for more options:
  http://www.webkb.org/interface/categSearch.html

> For example, anyone creating an OWL or RDFS class might wish to
> annotate it with its intended meaning using *the* URI for a specific
> sense of an English word, as classified by wordnet. The main
> requirement from this use case is agreement over what that URI is,
> including the beginning bit (the namespace) and the end bit (the
> mapping from Wordnet's representation of senses) ...

There will be more than one possible URI for a category even if only
one knowledge server is exploited:
- if URLs such as
  http://www.webkb.org/bin/categSearch.cgi?categ=%23city
  are used, many options may be added (hence leading to different
  URLs); if artificial URIs such as http://www.webkb.org/wn#city are
  used (this URI does not refer to any existing document), the problem
  for knowledge processing tools is then to find the URL of the
  knowledge server to get more information;
- a category may have many identifiers, e.g. in WebKB-2 the following
  are equivalent: #city, wn#city, #city__metropolis__urban_center;
- a category may be linked to other categories by equivalentTo/equalTo
  links, e.g. wn#city = pm#city = some_french_ontology#ville; since
  these categories are said to be equivalent/identical, asking for
  links connected to one of them should lead to the same result as
  asking for links connected to any other one of them. This is not a
  problem within a knowledge server (e.g. WebKB-2) but if more than
  one knowledge server is used, there must be some regular
  mirroring/replication/propagation processes between the servers.

> 7) Wordnets' modularization: how to organize the global wordnet
> namespace into domains of interest (directory-like subject trees) or
> in formal theories (ontology modules). What semantics should be used?
> 8) Enrichment to domains, possibly in an open-source programme:
> wordnets usually feature a poor support for specific domains, being
> general purpose structures. How to expand them through open-source ...

Indeed, most knowledge-based uses of WordNet will lead to additions
(or sometimes corrections) and these extensions should be sharable.
Independently developed extensions stored in static Web documents are
difficult to merge (manually, and even more so automatically) in a
semantically/logically/ontologically correct way and hence hard to
re-use for genuine knowledge-based management purposes.
In my opinion, a more scalable solution (for knowledge-based management
purposes and "Semantic Web" purposes) is the use of knowledge servers
where each server
- associates each user's addition (e.g. a category, a link between
  categories, a statement) with its source/creator (ontology/user),
  removes or warns about introduced redundancies, and rejects the
  addition if it introduces an inconsistency (with the protocols used
  in WebKB-2, this does not prevent a user from expressing her own
  beliefs but it lets her know about a conflict with other users'
  statements and leads her to be more precise/explicit to eliminate
  the problem; no discussion or agreement between the users is
  necessary);
- periodically checks other servers (similar general servers, or
  servers specialized in the same domains) to complement its knowledge
  base (thus, it does not matter much which servers people use, and
  this is how the centralized and distributed approaches combine their
  advantages; of course, integrating knowledge from other servers may
  not be obvious but it is *much* easier than trying to
  re-use/integrate dozens/hundreds/thousands of poorly inter-connected
  extensions).
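To make the first of the two points above more concrete, here is a toy
sketch of a knowledge base that records the creator of each link,
reports redundant additions and rejects additions that violate an
exclusion link. It is only an illustration under simplifying
assumptions, not the WebKB-2 protocols: the class ToyKnowledgeBase, the
single generic "kind-of" link, and the categories ex#place and ex#event
are all hypothetical.

```python
class ToyKnowledgeBase:
    """Toy store of kind-of links with creators and exclusion checks."""

    def __init__(self):
        self.supertypes = {}     # category -> set of direct supertypes
        self.exclusions = set()  # frozenset({a, b}) for exclusive pairs
        self.creators = {}       # (sub, sup) link -> creator of the link

    def add_exclusion(self, a, b):
        self.exclusions.add(frozenset((a, b)))

    def ancestors(self, category):
        """Return the category plus all its direct/indirect supertypes."""
        seen, stack = set(), [category]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.supertypes.get(current, ()))
        return seen

    def add_link(self, sub, sup, creator):
        """Try to add 'sub kind-of sup'; return the outcome as a string."""
        if sup in self.ancestors(sub):
            return "redundant"
        # Simplification: only the ancestors of `sub` itself are checked,
        # not those of categories already recorded below `sub`.
        for a in self.ancestors(sub):
            for b in self.ancestors(sup):
                if frozenset((a, b)) in self.exclusions:
                    return "rejected"
        self.supertypes.setdefault(sub, set()).add(sup)
        self.creators[(sub, sup)] = creator
        return "accepted"


kb = ToyKnowledgeBase()
kb.add_exclusion("ex#place", "ex#event")  # hypothetical top-level types
results = [
    kb.add_link("wn#city", "ex#place", "pm"),
    kb.add_link("wn#amphibious_assault", "ex#event", "pm"),
    kb.add_link("wn#Dunkerque", "wn#city", "wn"),
    kb.add_link("wn#Dunkerque", "wn#amphibious_assault", "wn"),
    kb.add_link("wn#Dunkerque", "ex#place", "some_user"),
]
print(results)
```

In this sketch, the fourth addition is rejected because it would place
wn#Dunkerque under two exclusive categories, and the fifth is reported
as redundant because the link already follows from earlier ones.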
More details on this approach are accessible from
  http://www.webkb.org/doc/papers/wi02/

I do not think that people should have to explore a hierarchy of
ontologies (modules) or "domains of interest" to decide which modules
to search, re-use, extend or merge, especially since these modules may
be conflicting, redundant and complementary.
Imagine someone wanting to search or add some knowledge about certain
kinds of "neurons", or "cats" or "feet". There are (too) many domains
this knowledge can be categorized into and (too) many tasks it can be
useful for. People should be able to use a general knowledge server
to find the different meanings of (categories for) "neurons", "cats" or
"feet", and explore their specializations (or other related categories)
until they have found the right category (or the closest).
Ideally, the knowledge statements (or the "data", to use a database
analogy) associated with a category should also be organized via
categories, e.g. a category #Siamese_cat might be linked to a category
pm#health_issues_of_siamese_cats. If the URL of a specialized knowledge
server is associated with a sufficiently adequate category (e.g. via a
link "serverFor"), the user should continue their exploration on this
specialized server (if there are several possible specialized servers,
ideally they mirror each other to share the same knowledge about this
category, i.e. the same collection of statements related to that
category). When the creators of a specialized server register to have
their server associated with a category in a general server, they
commit to try to collect and structure as much knowledge as possible
about that category (e.g. about pm#health_issues_of_siamese_cats, or
more ambitiously, about #Siamese_cat; the server about #Siamese_cat may
of course refer to the server about pm#health_issues_of_siamese_cats).
(In a commercial environment, it is in the interest of the creators of
a server to do some checking and refer only to the most comprehensive
of the more specialized servers, hence easing users' information
retrieval).
In addition to permitting search by navigation, a server may also
permit querying and hence may exploit more specialized servers for that.

I should also note that since in a knowledge server the source/creator
of each piece of knowledge is stored, knowledge modules can be
re-generated. Actually, arbitrarily complex queries on the knowledge
(its content plus its creators and their characteristics) can be used
to generate knowledge modules.
However, it is easier and better to make additions to the knowledge
base of a knowledge server by creating a static module (a Web
document), submitting it to the server and refining it until its
content is stable and accepted by the server (and committed into its
knowledge base) than trying to do incremental on-line additions and
removals (an analogy would be trying to develop a large shell script
via the shell command line, i.e. without a text editor). Furthermore,
the links to those static modules may serve various purposes:
documentation of the knowledge, regeneration of the knowledge base in
case it gets corrupted, etc.
Thus, modules (distributed static files) are useful but their content
should be developed and exploited via knowledge servers (i.e. some
means of centralization) to permit knowledge sharing/re-use.

Philippe
______________________________________________________________________
Dr. Philippe Martin
Address: Griffith Uni, School of I.T., PMB 50 GCMC, QLD 9726 Australia
Email: phmartin@gu.edu.au; Fax: +61 7 5552 8066
______________________________________________________________________
Received on Friday, 25 June 2004 10:22:29 UTC