
Re: WordNet modelling in Lemon and SKOS

From: John McCrae <jmccrae@cit-ec.uni-bielefeld.de>
Date: Fri, 19 Apr 2013 11:00:20 +0200
Message-ID: <CAC5njqpYzXpc8TS6HqrFsEwBxMmfpwSxwCyCwAjDv4q+5Xk3hA@mail.gmail.com>
To: Francis Bond <fcbond@gmail.com>
Cc: Armando Stellato <stellato@info.uniroma2.it>, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>, public-ontolex <public-ontolex@w3.org>
Hi Francis,

Thanks for your interest.

Firstly, in the original example the language is marked on the entry; it
could also be attached to the sense.


Secondly, from the point of view of OntoLex we use URIs as identifiers,
which are essentially physical objects (referring to a file on some server)
so we cannot *mandate* the use of a particular naming scheme. However, we
can *recommend* the use of a particular scheme and we will certainly take
the Kyoto scheme into account.
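
To make the recommendation concrete, here is a rough sketch of how the
identifier form from your mail (LLL-VV-OOOOOOOO-P) could be checked and how
an older identifier could be converted to it. The function names are made
up for illustration, and the assumption that the older form always reads
syn_<pos>_<offset> is based only on the single example you gave.

```python
import re

# Pattern for identifiers of the form LLL-VV-OOOOOOOO-P:
# three-letter language, two-digit version, eight-digit offset, one-letter POS.
SYNSET_ID = re.compile(
    r"(?P<lang>[a-z]{3})-(?P<version>\d{2})-(?P<offset>\d{8})-(?P<pos>[a-z])"
)


def parse_synset_id(identifier):
    """Split an identifier such as 'eng-30-08225481-n' into its four parts."""
    match = SYNSET_ID.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a LLL-VV-OOOOOOOO-P identifier: {identifier!r}")
    return match.groupdict()


def from_legacy_id(legacy_id, lang="eng", version="30"):
    """Convert 'syn_n_08225481' style identifiers (assumed layout: syn_<pos>_<offset>).

    The language and version are not present in the older form, so they must
    be supplied by the caller; the defaults here are only for the example.
    """
    prefix, pos, offset = legacy_id.split("_")
    if prefix != "syn":
        raise ValueError(f"unexpected prefix in {legacy_id!r}")
    return f"{lang}-{version}-{offset}-{pos}"


converted = from_legacy_id("syn_n_08225481")
parts = parse_synset_id(converted)
```

Since the scheme can only be recommended, a check like this would be a
validation aid for data providers rather than something the model enforces.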

As for the multilinguality of synsets, it is a good question, and I think
you explain it well. As I see it, lexical entries and senses are
language-specific and ontology entities are not; on synsets we will stay
ambivalent.

Finally, I am aware that there is a distinction between hyponyms and
instance hyponyms in WordNet and I already use different properties to
represent these.
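
As a small illustration of why the two relations are kept apart, the sketch
below stores your two examples as plain triples and looks up the classes an
individual belongs to. The property names ex:hyponymOf and ex:instanceOf
are placeholders invented for this mail, not the properties used in the
actual model.

```python
# Placeholder property names (hypothetical, for illustration only).
HYPONYM_OF = "ex:hyponymOf"
INSTANCE_OF = "ex:instanceOf"

# The two examples from the quoted message, as (subject, property, object).
triples = [
    ("fictional character", HYPONYM_OF, "imaginary being"),
    ("Sherlock Holmes", INSTANCE_OF, "fictional character"),
]


def classes_of(entity, triples):
    """Return the classes `entity` is an instance of, following hyponym links upward."""
    frontier = [o for s, p, o in triples if s == entity and p == INSTANCE_OF]
    found = []
    while frontier:
        cls = frontier.pop()
        if cls in found:
            continue
        found.append(cls)
        # Climb from this class to its broader classes.
        frontier.extend(o for s, p, o in triples if s == cls and p == HYPONYM_OF)
    return found


holmes_classes = classes_of("Sherlock Holmes", triples)
character_classes = classes_of("fictional character", triples)
```

With separate properties, "Sherlock Holmes" is found to belong to both
classes, while "fictional character" is itself a class and is an instance
of nothing.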


On Fri, Apr 19, 2013 at 3:44 AM, Francis Bond <fcbond@gmail.com> wrote:

> G'day,
> just a couple of small points about recent wordnet advances.
> The first is that there are now many more wordnets than just the Princeton
> WordNet of English (http://www.casta-net.jp/~kuribayashi/multi/), so
> things like sense and word must, of course, be labeled with the languages
> (I suspect this was just omitted for space).
> The second is that there may be different versions of the same wordnet,
> so we need to label the wordnet.  The convention in wordnet-LMF is to
> use identifiers of the form LLL-VV-OOOOOOOO-P where LLL is the language,
> VV is the version, OOOOOOOO is the offset and P is the part of speech.  So
> instead of syn_n_08225481 people use eng-30-08225481-n.  If we could
> adopt the same convention it would make interoperability a little bit
> easier.
> http://kyoto-project.eu/xmlgroup.iit.cnr.it/kyoto/index6bfa.html?option=com_content&view=article&id=143&Itemid=129
> Debate still rages over whether synsets can be/should be shared between
> languages or not.  I think that they can if we are careful, especially at
> the level of granularity we use in practice, but it is still an open
> question.  If we think that they are not, then a single lexical concept may
> be a supertype of  multiple synsets from different languages: the synset
> with 'dog' in English, the one with '犬' in Japanese, the one with 'anjing'
> in Malaysian and so on.
> The final point is that the current English wordnets (and most recent
> wordnets) distinguish between hyponym and instance:
> <<fictional character>> is a hyponym of <<imaginary being>>
> <<Sherlock Holmes>> is an instance of <<fictional character>>
> I suspect we should try to capture this distinction.
> Orthogonally, I must admit to not being clear about the actual
> implications of choosing the different names/models proposed in the
> discussion, so find it hard to judge which is better  --- if someone could
> try to summarize this it would be really helpful to me, and maybe to others.
> Yours,
> --
> Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
> Division of Linguistics and Multilingual Studies
> Nanyang Technological University
Received on Friday, 19 April 2013 09:00:57 UTC
