RE: [WNET] RDFS for WordNet datamodel

Hello Brian, I'm away on holiday, so just a quick reply.

At 14:52 +0100 9-07-2004, McBride, Brian wrote:
>[...]
>>
>>  (1) Antonym
>>
>>  Antonym is a symmetric relation between synset senses. I.e., Wordnet
>>  assumes that a set of synonyms can all have a set of other synonyms
>>  as antonyms, e.g.:
>>
>>  hasAntonym(synset:{conspicuous,obvious},
>>  synset:{inconspicuous,invisible})
>
>Hmm, I don't think Princeton's WordNet models antonyms that way, for the
>following reasons:
>
>1) From the Wordnet documentation:
>
>[[
>  ant(synset_id,w_num,synset_id,w_num).
>
>     The ant operator specifies antonymous words. This is a lexical relation
>that holds for all syntactic categories. For each antonymous pair, both
>relations are listed (ie. each synset_id,w_num pair is both a source and
>target word.)
>]]
>
>This clearly states that the ant relation is between words (which I, perhaps
>confusingly, have been calling word senses) not between synsets.  If it
>were between synsets then the relation would be
>
>   ant(synset_id, synset_id).

All the better. I was looking at an interface for WordNet that does 
not take that into account, and messes it all up.

BTW, antonym is a relation between word senses precisely because a 
quadruple is used for "ant": no "word sense" is indexed directly in 
WordNet; rather, word senses result from the composition of synsets 
and words. I am in favour of introducing word senses explicitly in 
the datamodel.
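A minimal sketch of this composition (synset IDs and data below are illustrative, not real WordNet records):

```python
# A word sense is not stored directly in WordNet; it is the pair
# (synset_id, w_num) picking one word out of one synset.
# Synset IDs and words here are invented for illustration.

synsets = {
    300001: ["conspicuous", "obvious"],
    300002: ["inconspicuous", "invisible"],
}

# ant(synset_id, w_num, synset_id, w_num): a lexical relation between
# word senses; each pair is listed in both directions (w_num is 1-based).
ant = [
    (300001, 1, 300002, 1),   # conspicuous / inconspicuous
    (300002, 1, 300001, 1),   # the explicit inverse
]

def word_sense(synset_id, w_num):
    """Compose a word sense from a synset and a word position."""
    return (synset_id, synsets[synset_id][w_num - 1])

for s1, w1, s2, w2 in ant:
    print(word_sense(s1, w1), "hasAntonym", word_sense(s2, w2))
```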


>2) From the Wordnet book [1] p49:
>
>[[
>The first question caused serious problems for Wordnet, which was initially
>conceived as using labeled pointers between synsets in order to express
>semantic relations between lexical concepts.  But it is not appropriate to
>introduce antonymy by labeled pointers between synsets, for example between
>{heavy, weighty, ponderous} and {light, weightless, airy}.  People who know
>English judge heavy/light to be antonyms but they pause and are puzzled when
>asked whether heavy/weightless or ponderous/airy are antonyms.  The concepts
>are opposed, but the word forms are not familiar antonym pairs.  Antonymy,
>like synonymy, is a semantic relation between word forms.
>]]
>
>If you are not persuaded by the above, maybe we need guidance from
>Christiane.

I am persuaded.

>
>>
>>  indeed, the usual intuition of speakers would accept only:
>>
>>  hasAntonym(conspicuous, inconspicuous)
>>
>>  while
>>
>>  hasAntonym(obvious, invisible)
>>
>>  seems probably less natural.
>
>Just so.
>
>>
>>  This is one reason to keep both synset senses and word senses.
>
>I agree.

Indeed

>  >
>>  (2) seeAlso
>>
>>  seeAlso is a kind of "very similar to" relation between synset
>>  senses,
>
>Here, I disagree for the same reasons.  From the wordnet documentation
>
>[[
>  sa(synset_id,w_num,synset_id,w_num).

Again, all the better.

>     The sa operator specifies that additional information about the first
>word can be obtained by seeing the second word. This operator is only
>defined for verbs and adjectives. There is no reflexive relation (ie. it
>cannot be inferred that the additional information about the second word can
>be obtained from the first word).
>]]
>
>[...]

As a matter of fact, all the examples I have seen are symmetric. 
Since WordNet has no inference engine, symmetry must be explicitly 
declared for all pairs and their inverses. Have you found 
documentation excluding symmetry *programmatically*?

BTW, what the documentation excludes is not reflexivity! Reflexivity 
means that a relation applies to the same individual in both argument 
positions: Rel(x,x), while symmetry means that a relation always 
holds for its inverse as well: Rel(x,y) <-> Rel(y,x).
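The distinction can be checked mechanically over a relation stored as a set of pairs (toy data, since the real "sa" records are quadruples):

```python
# Symmetry vs. reflexivity over a relation stored as a set of pairs.

def is_symmetric(rel):
    """Rel(x, y) <-> Rel(y, x) for every pair in the relation."""
    return all((y, x) in rel for (x, y) in rel)

def is_reflexive(rel, domain):
    """Rel(x, x) for every individual in the domain."""
    return all((x, x) in rel for x in domain)

# "sa" pairs declared explicitly in both directions (invented example):
# symmetric, but not reflexive -- no (x, x) pair appears.
sa = {("breathe", "respire"), ("respire", "breathe")}
print(is_symmetric(sa))                          # True
print(is_reflexive(sa, {"breathe", "respire"}))  # False
```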

>  > >
>>  >I took that to mean that the antonym relation is between
>>  word senses, not
>>  >between words.  If we make it between words, we lose
>>  information represented
>>  >in WordNet.
>>
>>  It is between synset senses, not word senses,
>
>This is false.  See above.

As I said, all the better. I was misguided by that interface.

>  >unless we make a
>>  further assumption (see above). Where have you conceived of antonym
>>  as a relation between words?
>
>In the wordnet documentation and book.
>
>  it is not in the rdfs datamodel that I
>>  have proposed.
>
>That is correct, but it was in the RDFS that I proposed.  You removed the
>notion of WordSense from my proposal and I don't understand why.

False. I haven't removed it at all! Please look at it more closely.
I suggested removing the subkinds of word senses for adjectives, 
verbs, etc., because they can be inferred from the POS. They can be 
kept, BTW, even if they are redundant.

>  >
>>  >Another question is whether we need a resource node in the graph to
>>  >represent a word, or whether we can just use literals.  If I recall
>>  >correctly, my colleague inserted a resource so that he could
>>  model the
>probability that a particular word was used in a particular
>>  sense.  I think
>>  >he did this by creating a tertiary relation (word,
>>  wordsense, p) where p is
>the probability that the word is used in that sense. 
>>  There may be a
>>  >simpler way to do that, e.g. just hanging a single property
>off a wordsense
>>  >resource.
>>
OK, but you still need a probability. It seems that the order of
senses reflects such a probability estimate.
>
>Good point!  I think we have lost that ordering in the current proposal.  Do
>we need to retain it?

We have to check, as I said. If it is used sensibly, I suggest retaining it.
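One hedged way to retain the ordering without a ternary relation is to hang a rank (or frequency count) on each word sense itself; the synset IDs and counts below are invented for illustration:

```python
# Retaining sense ordering/probability without a ternary relation:
# each word sense carries its own rank and frequency count.
# All identifiers and counts below are made up for illustration.
senses_of_word = {
    "bank": [
        {"synset": 108420563, "rank": 1, "freq": 25},  # financial sense
        {"synset": 109213565, "rank": 2, "freq": 10},  # river-side sense
    ],
}

def sense_probability(word, synset):
    """Estimate P(sense | word) from the per-sense frequency counts."""
    senses = senses_of_word[word]
    total = sum(s["freq"] for s in senses)
    for s in senses:
        if s["synset"] == synset:
            return s["freq"] / total
    return 0.0

print(sense_probability("bank", 108420563))  # 25/35
```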

>  > We should also investigate if the most frequently used words *for a
>>  synset* correspond to the order of words given in the database.
>>
>>  >Another issue here is language.  Is the French word "chat"
>>  the same word as
>>  >the English word "chat".  We could still use literals to
>>  represent language,
>>  >but, if we want to use XML literals, then we'd have to wrap
>>  the literal in
>an explicit tag, e.g. "<word xml:lang="en">chat</word>"
>>  rather than just
>>  >"chat".  It might be simpler just to hang a langauge
>>  property off the Word
>>  >resource.
>>  >
>>
>>  a) If words are encoded in the English WordNet namespace, no word
>>  used for a French synset in a French WordNet can be confused with the
>>  first.
>
>I am not sure what you mean by word here.  I think you probably mean what I
>mean by wordsense.

No, I mean just word.

>Encoding information in URLs like that is a bit dodgy. For example, if I
>have a merged graph containing English and French wordnets, how would you
>encode a query to find the synsets for the English word 'dog' with this
>method?  It's cleaner to have this information explicitly represented in the
>graph structure.
>
>>
>>  b) If words are not encoded in a specific namespace, then a language
>>  property must be added.
>>
>>  What do we choose?
>>
>>  If we want to adhere as much as possible to the original WordNet
>>  datamodel, (a) is the winner.
>>  On the other hand, (b) allows reuse of the same string or literal for
>>  different languages, so that we will only have the resource "chat",
>>  independently of any particular language.
>>
>>    In general, I think (a) is more elegant, and its cost is probably
>>  less than the expected benefits.
>
>I tend to agree.
>
>Brian
>
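The two options can be contrasted with a sketch; the namespaces and synset identifiers below are placeholders, not the real WordNet URIs:

```python
# Option (a): encode the language in the word's URI/namespace.
# Option (b): one node per word form, plus an explicit language property.
# All namespaces and identifiers here are illustrative placeholders.
WN_EN = "http://example.org/wordnet-en/"
WN_FR = "http://example.org/wordnet-fr/"

# (a) word URI -> synsets, in a merged English + French graph.
word_synsets = {
    WN_EN + "dog":  ["en-synset-102084071"],
    WN_EN + "chat": ["en-synset-talk"],   # English "chat" (informal talk)
    WN_FR + "chat": ["fr-synset-cat"],    # French "chat" (cat)
}

# Under (a), querying for the English word "dog" means building the URI:
print(word_synsets[WN_EN + "dog"])

# Under (b), the same string "chat" is one resource per (form, lang),
# and the query filters on the language property instead of the URI.
words_b = [
    {"form": "chat", "lang": "en", "synsets": ["en-synset-talk"]},
    {"form": "chat", "lang": "fr", "synsets": ["fr-synset-cat"]},
    {"form": "dog",  "lang": "en", "synsets": ["en-synset-102084071"]},
]
dog_en = [w["synsets"] for w in words_b
          if w["form"] == "dog" and w["lang"] == "en"]
print(dog_en)  # [['en-synset-102084071']]
```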

All right, read you soon
Ciao
Aldo

-- 
Aldo Gangemi
Research Scientist
Laboratory for Applied Ontology
Institute for Cognitive Sciences and Technology
National Research Council (ISTC-CNR)
Via Nomentana 56, 00161, Roma, Italy
Tel: +390644161535
Fax: +3906824737
a.gangemi@istc.cnr.it

Received on Tuesday, 13 July 2004 07:24:44 UTC