Re: Appendix H: Internationalization

Alexandre,

One criticism of WordNet synsets is that membership is a binary classification: each word must either be a member of a synset or not.  In reality, a word belongs to a synset to some degree, and this may be useful to capture, especially when translating.

One example is "to know" in English and "savoir" vs. "connaître" in French.  In basic French, we learn that "savoir" is to know something, and "connaître" is to know a person.  We were taught that what seems to be a single sense in English is two senses in French.

If the English WordNet had been constructed without knowledge of this distinction, there would be only one sense of "to know", which would then be translatable to two synsets in French.  You would need to understand that this mapping is incomplete.

It gets more complicated when you realize that what we learned in basic French is not completely true.  While we use the word "know" in English for knowing people, the best translation of the French "connaître" is "to be familiar with".  Indeed, French uses the word that way: you can "connaître" a place, a store, etc.  It turns out to be something of a historical artifact that (American) English more commonly uses "to know" in this case.  But "familiar" does not belong to this (English) synset as strongly as "know" does.  It belongs, and would be understood, but based on frequency of usage it would sound a little archaic and formal to use "familiar" instead of "know" for a person.

So the point is: how can you capture the fact that subtleties of language create only partial mappings between languages?
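
As a rough sketch of what graded membership could look like, here is a small Python example.  The names GradedSynset and mapping_strength are hypothetical (they are not part of WordNet or the W3C draft), and the weights are invented purely for illustration, not measured from any corpus.  A word-level mapping is scored by the weaker of the two memberships:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GradedSynset:
    """A synset whose members carry a membership degree in [0, 1]."""
    gloss: str
    members: Dict[str, float] = field(default_factory=dict)

    def degree(self, word: str) -> float:
        # Words that are not listed do not belong at all.
        return self.members.get(word, 0.0)


def mapping_strength(source: GradedSynset, word: str,
                     target: GradedSynset, translation: str) -> float:
    """Score a word-to-word translation by its weakest membership."""
    return min(source.degree(word), target.degree(translation))


# Illustrative weights only (an assumption, not corpus data).
en_acquainted = GradedSynset(
    gloss="be acquainted with a person or place",
    members={"know": 0.9, "be familiar with": 0.6},
)
fr_connaitre = GradedSynset(
    gloss="French counterpart of the sense above",
    members={"connaître": 1.0},
)

print(mapping_strength(en_acquainted, "know", fr_connaitre, "connaître"))
print(mapping_strength(en_acquainted, "be familiar with", fr_connaitre, "connaître"))
```

With these made-up weights, "know" maps to "connaître" more strongly than "be familiar with" does, even though both words sit in the same English synset.  That difference is exactly what a purely synset-level link cannot express.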

This is often easier to explain with something that is understood scientifically as a range of values, like colors.  Take the English word "maroon", which is a color that lies somewhere on the spectrum between red and purple.  Would you lump this into the synset for red, or for purple?  Where do you draw the line in that synset: at a particular point in the spectrum?  What if you found that different languages and cultures draw their boundaries differently (maybe Italians "see" red as a darker color than Germans do), so that the mapping of "maroon" into these languages is only partial?
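
To make the color case concrete, here is a minimal Python sketch that treats each color term as an interval on an arbitrary 0-100 red-to-purple scale and measures how much of "maroon" falls inside each language's "red".  The scale, the interval boundaries, and the function name overlap_fraction are all assumptions made up for illustration; they are not real perceptual data about any language.

```python
def overlap_fraction(term, category):
    """Fraction of the `term` interval that lies inside `category`.

    Both arguments are (low, high) pairs on the same scale.
    """
    low = max(term[0], category[0])
    high = min(term[1], category[1])
    # A negative width means the intervals do not overlap at all.
    return max(0.0, high - low) / (term[1] - term[0])


# Made-up positions on an arbitrary 0-100 red-to-purple scale.
maroon = (40, 60)
red_in_language_a = (0, 50)   # hypothetical: "red" reaches further toward purple
red_in_language_b = (0, 45)   # hypothetical: "red" stops earlier

print(overlap_fraction(maroon, red_in_language_a))  # 0.5
print(overlap_fraction(maroon, red_in_language_b))  # 0.25
```

The same word gets a different degree of membership in "red" depending on where each language draws its boundary, which is the partial mapping I am describing.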

Does that make 'sense' ;) ?

-Chris

On 5/10/2012 4:57 PM, Alexandre Rademaker wrote:
> I am about to finish the translation of our OpenWordNet-PT to RDF
> integrating it with the original Princeton WordNet 3.0.
>
> In appendix H of http://www.w3.org/TR/wordnet-rdf/:
>
> "... Integration of WordNets implies creating mappings between
> entities in the WordNets to indicate lexico-semantic relationships
> between them, e.g. a property that signifies that the meanings of two
> Synsets overlap. The entities that represent language concepts that
> should be able to map are instances of the classes: Synset, WordSense
> and Word..."
>
> I can easily see the utility of a relation between Synsets and
> WordSenses like "hasTranslation". But I can't see any use in relating
> the words... Any idea?
>
> Best,
>
> Alexandre Rademaker
> http://arademaker.github.com/
>
>
>

Received on Monday, 14 May 2012 13:19:53 UTC