RE: [WNET, PORT, OEP] Synset's, Classes and multiple languages from Miles, AJ (Alistair) on 2004-05-06 (public-swbp-wg@w3.org from May 2004)

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Thu, 6 May 2004 15:00:02 +0100
To: "'public-swbp-wg@w3.org'" <public-swbp-wg@w3.org>
Message-ID: <350DC7048372D31197F200902773DF4C04944283@exchange11.rl.ac.uk>
Maybe of value here, reports on multilingual thesauri and thesaurus mapping
from SWAD-E:

[1] RDF Encoding of Multilingual Thesauri
<http://www.w3c.rl.ac.uk/SWAD/deliverables/8.3.html>
[2] Inter-Thesaurus Mapping
<http://www.w3c.rl.ac.uk/SWAD/deliverables/8.4.html>

[1] Outlines two approaches for thesaurus concepts - the scruffier but
simpler 'multilingual labelling' approach allows e.g. both 'dog' and 'chien'
to be labels for the same concept node (with the languages given by the
literal language tag), vs. the 'interlingual mapping' approach where you try
to express the extent to which the concepts from one monolingual thesaurus
overlap with the concepts from another in a different language.

Al.

> -----Original Message-----
> From: public-swbp-wg-request@w3.org
> [mailto:public-swbp-wg-request@w3.org]On Behalf Of Dan Brickley
> Sent: 06 May 2004 09:58
> To: McBride, Brian
> Cc: Aldo Gangemi; SWBPD list
> Subject: Re: [WNET, PORT, OEP] Synset's, Classes and multiple 
> languages
> 
> 
> 
> McBride, Brian wrote:
> 
> > Thinking a little more about this as I cycled to work this 
> morning, it
> > occurred to me that one test to apply to a design for a 
> wordnet mapping is
> > mergability with other WordNet's, and other ontologies, 
> which leads to a
> > specific test case (I love test cases).
> > 
> > What happens if we merged say a French and English WordNet?
> 
> This is a really interesting problem space. One can easily get 
> distracted by grand issues such as the relationship between 
> thought and 
> language...
> 
> Eurowordnet had a good crack at the practicalities, see 
> http://www.illc.uva.nl/EuroWordNet/ for docs (sadly not data) 
> explaining 
>   their twist on the Princeton wordnet approach. They had to 
> change it 
> slightly to give them a model for representing multi-lingual wordnets 
> (at least for Euro languages, not sure if they claim universality).
> 
> > If we have the synset "dog" sameClassAs the class of dogs, 
> and the synset
> > "chien" sameClassAs the class of dogs, then we have synset 
> "dog" sameClassAs
> > synset "chien".  My  knowledge of Owl is too weak :(  Does 
> that lead to the
> > conclusion that "chien" and "dog" are members of the same 
> synset?  Would
> > that be a problem for anyone?
> 
> http://www.w3.org/TR/2004/REC-owl-ref-20040210/ reminds me that 
> daml:sameClassAs became owl:equivalentClass ie 
> http://www.w3.org/TR/2004/REC-owl-ref-20040210/#equivalentClass-def
> 
>    The meaning of such a class axiom is that the two class 
> descriptions
>    involved have the same class extension (i.e., both class extensions
>    contain exactly the same set of individuals).
> 
> Remembering that OWL and RDF/S allow for there to be distinct classes 
> that have a common extension, this allows us to have classes 
> "Dog" and 
> "Chien" which are co-extensive, yet nevertheless distinct (including 
> their rdfs:label, rdfs:comment, URI names, etc.).
> 
> (I think in DAML+OIL, the claim might have been stronger, ie. 
> that the 
> two class descriptions were descriptions of the self-same class.)
> 
> FWIW I also see value in creating distinct RDF/OWL classes 
> for each term 
> in a wordnet synset, since there are plentiful subtle differences 
> between the terms that Wordnet lumps together. For eg see 
> http://rdfweb.org/topic/WhyWordnetIsCool for a few examples 
> (I applied 
> Wordnet to some photos I took one weekend). Wordnet has "cowboy hat, 
> ten-gallon hat" as synonyms. Also "can, tin, tin can". 
> Wordnet is pretty 
> scruffy. By breaking out synset parts into their own classes, in 
> addition to having a common superclass they all share, we 
> allow people 
> to use wordnet in a more precise way, without having to spent the 
> time/money cleaning it up into a formal ontology.
> 
> > If they are synonyms, we'd need to encode the language 
> somewhere, so that
> > software can easily restrict the language of the synoyms it selects.
> 
> This is the tricky problem of trying to a map a 
> linguistically oriented 
> system into a world-oriented system. I think we need parallel 
> representations of wordnets: a (simplified) class hierarchy (ie. 
> ignoring many parts of wordnet), and a full wordnet-style web of 
> (linguistic) concepts. The former would benefit for exposing RDF/OWL 
> classes for each synset, and each term in the synset; the 
> latter might 
> be handled with an extension of SKOS, although that is yet to be 
> established.
> 
> > I guess I'm wondering if it would make sense to model the 
> synsets for the
> > same concept in different languages as different resources. 
> 
> I think so, at some level. At another level, we want to express the 
> commonality.
> 
> Motivating use case: Image description... many electronic images are 
> language-neutral and hence internationally useful in a way that 
> textually-oriented electronic documents aren't. Being able to have an 
> image be described by a french speaker, yet discovered by a Japanese 
> speaker (or vice-versa), is a nicely practical goal for the 
> semantic web 
> effort.
> 
> cheers,
> 
> Dan
> 
> ps. 
> http://www.w3.org/2001/sw/Europe/200404/i18n/jptofu-example1.xml for 
> some rough experiments with Japanese dictionary ideas...
> 
> 
>
Received on Thursday, 6 May 2004 10:00:59 UTC