Re: Appendix H: Internationalization from Valeria de Paiva on 2012-05-15 (public-swbp-wg@w3.org from May 2012)

From: Valeria de Paiva <valeria.depaiva@gmail.com>
Date: Mon, 14 May 2012 22:03:05 -0700
To: Chris Welty <cawelty@gmail.com>
Cc: Alexandre Rademaker <arademaker@gmail.com>, public-swbp-wg@w3.org, Gerard de Melo <demelo@icsi.berkeley.edu>
Message-ID: <CAESt=XtzLGVjAG+jg2wmJ8VBMdf7H-+ihVh=_3PvKzm3pY=+yg@mail.gmail.com>
Dear Chris,
Thank you for the interesting discussion. (I am another one of the people
working with Alexandre. Actually, I guess I am the originator of the
project, since I wanted to reproduce for Portuguese the work using the
Bridge system that I worked on with Danny Bobrow, Ron Kaplan and the whole
NLTT team at PARC.)

Indeed, while the binary classification of members of synsets can seem
very coarse at times, we feel that a first approximation to a resource
just like the original Princeton WordNet for Portuguese would be very
useful. Bootstrapping it from a translation from English seems the quick
way to go about it, but it also leads into the difficult stuff you mention.
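
To make the contrast with binary membership concrete, here is a minimal
sketch in Python. It is only an illustration: the class name GradedSynset,
the gloss and the weights are all made up for this example and are not part
of Princeton WordNet, OpenWordNet-PT or the W3C RDF draft.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class GradedSynset:
    """Hypothetical synset whose members carry a degree in [0, 1]."""
    gloss: str
    members: Dict[str, float] = field(default_factory=dict)

    def add(self, lemma: str, degree: float) -> None:
        # Reject degrees outside the unit interval.
        if not 0.0 <= degree <= 1.0:
            raise ValueError("degree must lie in [0, 1]")
        self.members[lemma] = degree

    def binary(self, threshold: float = 0.5) -> Set[str]:
        # Collapse to the usual in/out classification at a chosen cut-off.
        return {lemma for lemma, d in self.members.items() if d >= threshold}


# Illustrative weights only; they are not taken from any corpus.
know_person = GradedSynset("be acquainted with a person")
know_person.add("know", 1.0)
know_person.add("be familiar with", 0.4)
```

With a cut-off of 0.5 only "know" survives, which is the coarse binary view;
lowering the cut-off lets "be familiar with" back in. A first version could
store only the binary view and leave the degrees for lexicographers to add.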

My take is that we should try to produce a first version of something a
bit like a Portuguese version of WordNet, but that then, having a coarse
approximation, we need to get Brazilian lexicographers to do their work.
Whether we can get them interested or not, we don't know yet...

Best regards,
Valeria

On Mon, May 14, 2012 at 6:19 AM, Chris Welty <cawelty@gmail.com> wrote:

>
> Alexandre,
>
> One criticism of WordNet synsets is that a binary classification
> must happen: each word must either be a member of a synset or not.  In
> reality, there is a degree to which a word may belong to a
> synset, and this may be useful to capture, especially when translating.
>
> One example is "to know" in English and "savoir" vs. "connaitre" in
> French.  In basic French, we learn that savoir is to know something, and
> connaitre is to know a person.  We were taught that what in English seems
> to be a single sense is two senses in French.
>
> If English WordNet had been constructed without knowledge of this
> distinction, there would be only one sense of "to know", which would then
> be translatable to two synsets in French; you would need to understand
> that this mapping is incomplete.
>
> It gets more complicated when you realize that what we learned in basic
> French is not completely true: while we use the word "know" in English for
> knowing people, the best translation from French for "connaitre" is "to be
> familiar with".  Indeed, French uses the word that way - you can connaitre
> a place, a store, etc. It turns out to be something of a historical
> artifact that (American) English uses "to know" in this case more commonly.
>  But "familiar" does not belong to this (English) synset as strongly as
> "know" - it belongs, and would be understood, but based on the frequency of
> usage it would sound a little archaic and formal to use "familiar" instead
> of "know" for a person.
>
> So, the point is, how can you capture the fact that subtleties of
> language can create partial mappings between languages?
>
> This is often easier to explain when you use something that has a
> scientific understanding as a range of values, like colors.  Take the
> English word "maroon", which is a color that lies somewhere on the spectrum
> between red and purple.  Would you lump this into the synset for red, or
> for purple?   Where do you draw the line in that synset, at a particular
> point in the spectrum?  What if you found that different languages and
> cultures draw their boundaries differently, like maybe Italians "see" red
> as a darker color than Germans do, and the mapping of "maroon" into these
> languages is partial?
>
> Does that make 'sense' ;) ?
>
> -Chris
>
>
> On 5/10/2012 4:57 PM, Alexandre Rademaker wrote:
>
>> I am about to finish the translation of our OpenWordNet-PT to RDF
>> integrating it with the original Princeton WordNet 3.0.
>>
>> In appendix H of http://www.w3.org/TR/wordnet-rdf/:
>>
>> "... Integration of WordNets implies creating mappings between
>> entities in the WordNets to indicate lexico-semantic relationships
>> between them, e.g. a property that signifies that the meanings of two
>> Synsets overlap. The entities that represent language concepts that
>> should be able to map are instances of the classes: Synset, WordSense
>> and Word..."
>>
>> I can easily see the utility of a relation between Synsets and
>> WordSenses like "hasTranslation". But I can't see any use in relating
>> the words... Any idea?
>>
>> Best,
>>
>> Alexandre Rademaker
>> http://arademaker.github.com/
>>
>>
>>
>>


-- 
Valeria de Paiva
http://www.cs.bham.ac.uk/~vdp/
http://valeriadepaiva.org/www/
Received on Tuesday, 15 May 2012 07:59:54 UTC