Re: [WNET] Fwd: Re: W3C Task Force on Porting Wordnet to the Semantic Web from Danny Ayers on 2004-05-02 (public-swbp-wg@w3.org from May 2004)

From: Danny Ayers <danny666@virgilio.it>
Date: Sun, 02 May 2004 12:51:14 +0200
To: Aldo Gangemi <a.gangemi@istc.cnr.it>
Cc: public-swbp-wg@w3.org, phmartin@meganesia.sci.griffith.edu.au
Message-ID: <4094D2A2.2050800@virgilio.it>
(I've snipped big chunks of Philippe's very interesting knowledge server 
ideas, trying to narrow to more immediate WordNet/SW mapping issues)

Philippe Martin:

>>>  2) Clarification of lexical relations, ...
>>>  3) How to refine wordnets' hierarchies? ...
>>>  an OWL ontology to talk about the main wordnet relationships
>>
>>
>> Here are some kinds of "links" (relations between categories,
>> i.e. between types or individuals) that I had to introduce when
>> converting the noun-related part of WordNet 1.7 into a genuine
>> lexical ontology usable for knowledge-based management purposes
>> (details at http://www.webkb.org/doc/wn/):
>> - to replace the hyponym links: subtypeOf and instanceOf;
>> - to replace the antonym links: simpleExclusionOf and
>>   closedExclusionOf (like simpleExclusionOf but in addition,
>>   the linked categories form a "closed subtype partition" for
>>   another type, e.g. wn#chromatic_color and wn#achromatic_color
>>   form a closed subtype partition for wn#color);
>>   the complementOf and inverseOf links are not currently useful
>>   within WordNet but are common (e.g. they exist in OWL);
>> - to replace certain erroneous uses of the hyponym link: the
>>   locationOf link (e.g. wn#Dunkerque is said to be hyponym of both
>>   wn#city and wn#amphibious_assault; this violates some exclusionOf
>>   link(s) in my top-level ontology; I have set wn#Dunkerque as
>>   instanceOf wn#city and locationOf wn#amphibious_assault).
>>
Ok, this suggests to me that a mapping is needed between the WN 
terminology and RDF/OWL constructs, and that some manual intervention 
will be needed after any automatic translation: so hyponym would appear 
to map mostly to rdfs:subClassOf but manual correction will be needed 
where the relationship is rdf:type.

It sounds like the antonym side could be tricky, again with something of 
a mixup between instances and classes - map mostly to owl:disjointWith 
then tidy exceptions to owl:differentFrom ??
I'm guessing the closure of "closed subtype partition" may be lost 
because of the open world assumption.
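
To make the guesses above a bit more concrete, here is a minimal sketch of such a translation step. It is not anyone's actual converter: the `wn:` prefix, the `INSTANCES` set (standing in for the manual review) and the `map_link` helper are all assumptions made up for illustration, and the sample links are illustrative data rather than a checked extract from WordNet.

```python
# Sketch: translate WordNet-style links into RDF/OWL triples.
# All names here (wn: prefix, INSTANCES, map_link) are hypothetical.

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DISJOINT = "owl:disjointWith"
DIFFERENT = "owl:differentFrom"

# Terms a human reviewer has flagged as individuals rather than classes.
INSTANCES = {"wn:Dunkerque"}


def map_link(subject, link, obj, instances=INSTANCES):
    """Return one (subject, predicate, object) triple for a WordNet link."""
    if link == "hyponym":
        # Individuals get rdf:type, everything else rdfs:subClassOf.
        predicate = RDF_TYPE if subject in instances else SUBCLASS
    elif link == "antonym":
        flags = (subject in instances, obj in instances)
        if flags == (True, True):
            predicate = DIFFERENT   # relates two individuals
        elif flags == (False, False):
            predicate = DISJOINT    # relates two classes
        else:
            raise ValueError("mixed class/individual antonym needs manual review")
    else:
        raise ValueError(f"no mapping defined for link type {link!r}")
    return (subject, predicate, obj)


# Illustrative input only.
links = [
    ("wn:city", "hyponym", "wn:municipality"),
    ("wn:Dunkerque", "hyponym", "wn:city"),
    ("wn:chromatic_color", "antonym", "wn:achromatic_color"),
]
triples = [map_link(*link) for link in links]
for triple in triples:
    print(" ".join(triple))
```

The sketch only picks a predicate; it does nothing about the "closed subtype partition" question, and erroneous hyponym links like the Dunkerque/amphibious_assault one would still have to be caught by hand.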

The Dunkerque case does suggest non-trivial manual intervention will be 
needed and/or a narrowing of the range of proper nouns!

>> Other kinds of links are useful to extend WordNet:
>> instrumentOf, inputOf, objectOf, agentOf, resultOf, etc.
>>
Hmm, it's interesting that these all seem process-oriented - perhaps 
terms could be borrowed from OWL-S?

>> [snip]
>>
>> Indeed, storing a "view file" for each WordNet category and each
>> exploration/presentation option of its relation to other categories
>> is not practical. A knowledge server is needed. 
>
So the relations/view has to be worked out at runtime?
In RDF/OWL terms I guess we'd be looking at a triplestore holding the WN 
model, with a query engine taking queries/returning views, perhaps with 
on-the-fly generation of inferred relationships (and almost certainly 
with caching).
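
As a rough illustration of that arrangement (not a description of WebKB-2 or of any real triplestore), the sketch below keeps a toy set of triples in memory, infers superclasses when a view is requested, and caches the inference. The data, the `superclasses` and `view` functions and the `wn:` prefix are all assumptions invented for the example.

```python
from functools import lru_cache

# Toy in-memory "triplestore": a frozen set of (subject, predicate, object).
# The triples are illustrative, not a checked extract from WordNet.
TRIPLES = frozenset({
    ("wn:city", "rdfs:subClassOf", "wn:municipality"),
    ("wn:municipality", "rdfs:subClassOf", "wn:region"),
    ("wn:Dunkerque", "rdf:type", "wn:city"),
})


@lru_cache(maxsize=None)
def superclasses(term):
    """Infer all direct and indirect superclasses of `term` at query time.

    Assumes the subclass graph is acyclic; a cycle would recurse forever.
    """
    found = set()
    for s, p, o in TRIPLES:
        if s == term and p == "rdfs:subClassOf":
            found.add(o)
            found |= superclasses(o)
    return frozenset(found)


def view(term):
    """Build a simple view of a term: asserted types plus inferred classes."""
    types = {o for s, p, o in TRIPLES if s == term and p == "rdf:type"}
    inferred = set()
    for t in types:
        inferred |= superclasses(t)
    return {"term": term, "types": sorted(types), "inferred": sorted(inferred)}


print(view("wn:Dunkerque"))
```

Because the store is immutable here, the cache never goes stale; a real server accepting updates would need some invalidation strategy.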

>> [snip]
>>
>>
>>>  For example, anyone creating an OWL or RDFS class might wish to
>>>  annotate it with its intended meaning using *the* URI for a specific
>>>  sense of an  English word, as classified by wordnet. The main
>>>  requirement from this use case is agreement over what that URI is,
>>>  including the beginning bit (the namespace) and the end bit (the
>>>  mapping from Wordnet's representation of senses) ...
>>
>>
>> There will be more than one possible URI for a category even if only one
>> knowledge server is exploited:
>> - if URLs such as http://www.webkb.org/bin/categSearch.cgi?categ=%23city
>>   are used, many options may be added (hence leading to different URLs);
>>   if artificial URIs such as http://www.webkb.org/wn#city are used (this
>>   URI does not refer to any existing document), the problem for
>>   knowledge processing tools is then to find the URL of the knowledge
>>   server to get more information;
>> - a category may have many identifiers, e.g. in WebKB-2 the following
>>   ones are equivalent: #city, wn#city, #city__metropolis__urban_center;
>> - a category may be linked to other categories by equivalentTo/equalTo
>>   links, e.g. wn#city = pm#city = some_french_ontology#ville; since
>>   these categories are said to be equivalent/identical, asking for links
>>   connected to one of them should lead to the same result as asking for
>>   links connected to any other one of them. This is not a problem
>>   within a knowledge server (e.g. WebKB-2) but if more than one
>>   knowledge server is used, there must be some regular
>>   mirroring/replication/propagation processes between the servers.
>
I think the only way this could be scalable is to use a single namespace 
with 'official' identifiers for the words. It probably would be 
advantageous to have an HTTP server available; I think danbri's pointed 
out a possible way forward there:

http://xmlns.com/wordnet/1.6/Dog

How other individual servers use the ontology would be left to 
individual implementations - something MGET-like might turn out to be a 
common setup:

http://example.org/query?wn_term=http://xmlns.com/wordnet/1.6/Dog
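
One practical detail with a setup like this: the term URI is itself a URL, so a client would normally percent-encode it before putting it in the query string. A small sketch, assuming the hypothetical endpoint and `wn_term` parameter shown above (`build_query_url` is a made-up helper name):

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(endpoint, term_uri):
    """Attach a WordNet term URI to a query endpoint as the wn_term parameter."""
    return f"{endpoint}?{urlencode({'wn_term': term_uri})}"


term = "http://xmlns.com/wordnet/1.6/Dog"
url = build_query_url("http://example.org/query", term)
print(url)

# The server side can recover the original URI from the query string.
recovered = parse_qs(urlsplit(url).query)["wn_term"][0]
print(recovered == term)
```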

>>>  7) Wordnets' modularization: how to organize the global wordnet
>>>  namespace into domains of interest (directory-like subject trees) or
>>>  in formal theories (ontology modules). What semantics should be used?
>>

I suspect there needs to be a clean separation between the "core" 
representation (which presumably would be a tangly graph) and views that 
would help provide subject trees. I'm not sure how in-scope the latter 
are for the WN TF, though perhaps a good use case would be to map WN to 
the dmoz.org open directory tree.

>>>  8) Enrichment to domains, possibly in an open-source programme:
>>>  wordnets usually feature a poor support for specific domains, being
>>>  general purpose structures. How to expand them through open-source ...
>>
>>
>> Indeed, most knowledge-based uses of WordNet will lead to additions (or
>> sometimes corrections) and these extensions should be sharable.
>
Hopefully corrections would be fed back into the dictionary, but 
certainly sharable additions offer a lot of promise. Discovery of the 
additions might be tricky, but that's a larger problem.

>>
>> Independently developed extensions stored in static Web documents are
>> difficult to merge (manually, and even more automatically) in a
>> semantically/logically/ontologically correct way and hence
>> hard to re-use for genuine knowledge based management purposes.
>
Hence RDF/OWL...

>>
>> [snip]
>
>>
>> More details on this are accessible from
>> http://www.webkb.org/doc/papers/wi02/
>
I certainly have nothing against the implementation you describe, but 
would suggest that the Semantic Web will be comprised of servers (and 
clients) of many different architectures and for many different 
purposes, in the same way that human-readable knowledge on the web is 
maintained in a myriad different ways. The key requirement is having 
common interchange languages (i.e. RDF and OWL).

I do however think that your server suggestions do prompt some questions 
that probably should be directly addressed, in particular how will a 
Semantic Web representation of a WordNet-like lexicon be useful? How 
will it be used?

As a first pass, two application areas spring to mind - information 
search and knowledge interchange. An example of the former would be an 
extension of a traditional dictionary or encyclopedia, in which the WN 
terms would be used to disambiguate details of the query and provide a 
route to discovery of data. Such an application could simply provide 
direct references to something like the Wikipedia, but I think the more 
interesting SW application would involve the incorporation of statements 
that provide context for the query - a holidaymaker enquiring about 
Italian trains would be seeking different information than a WW2 history 
student.
The knowledge interchange angle is likely to have wider impact - if two 
organizations wish to exchange information, then unambiguously 
defined terms based on the WN lexicon, combined with RDF/OWL logic, 
provide a common language through which to communicate. I think that's 
the underlying intention of most ontology/schema authors that refer to 
WN terms already, but it still isn't entirely clear how 
ontology-ontology mapping will be aided in practice by this usage. (Ok, 
there are probably quite a few papers I haven't yet read...;-)

Cheers,
Danny.

-- 
----
Raw
http://dannyayers.com
Received on Sunday, 2 May 2004 06:52:16 UTC