- From: John McCrae <jmccrae@cit-ec.uni-bielefeld.de>
- Date: Fri, 19 Apr 2013 11:00:20 +0200
- To: Francis Bond <fcbond@gmail.com>
- Cc: Armando Stellato <stellato@info.uniroma2.it>, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>, public-ontolex <public-ontolex@w3.org>
- Message-ID: <CAC5njqpYzXpc8TS6HqrFsEwBxMmfpwSxwCyCwAjDv4q+5Xk3hA@mail.gmail.com>
Hi Francis,

Thanks for your interest.

Firstly, in the original example the language is marked on the entry; it could also be attached to the sense:

http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements/Linked_Data#Example:_WordNet_as_lemon-SKOS

Secondly, from the point of view of OntoLex we use URIs as identifiers, which are essentially physical objects (referring to a file on some server), so we cannot *mandate* the use of a particular naming scheme. However, we can *recommend* the use of a particular scheme, and we will certainly take the Kyoto scheme into account.

As for multilinguality of synsets, it is a good question, and I think you explain it well: as I see it, lexical entries and senses are language-specific, ontology entities are not, and we will stay ambivalent about synsets.

Finally, I am aware that there is a distinction between hyponyms and instance hyponyms in WordNet, and I already use different properties to represent these.

Regards,
John

On Fri, Apr 19, 2013 at 3:44 AM, Francis Bond <fcbond@gmail.com> wrote:

> G'day,
>
> just a couple of small points about recent wordnet advances.
>
> The first is that there are now many more wordnets than just the Princeton
> WordNet of English (http://www.casta-net.jp/~kuribayashi/multi/), so
> things like sense and word must, of course, be labeled with the languages
> (I suspect this was just omitted for space).
>
> The second is that there may be different versions of the same wordnet,
> so we need to label the wordnet. The convention in wordnet-LMF is to
> use identifiers of the form LLL-VV-OOOOOOOO-P where LLL is the language,
> VV is the version, OOOOOOOO is the offset and P is the part of speech. So:
> instead of syn_n_08225481 people use: eng-30-08225481-n. If we could
> adopt the same convention it would make interoperability a little bit
> easier.
>
> http://kyoto-project.eu/xmlgroup.iit.cnr.it/kyoto/index6bfa.html?option=com_content&view=article&id=143&Itemid=129
>
> Debate still rages over whether synsets can be/should be shared between
> languages or not. I think that they can if we are careful, especially at
> the level of granularity we use in practice, but it is still an open
> question. If we think that they are not, then a single lexical concept may
> be a supertype of multiple synsets from different languages: the synset
> with 'dog' in English, the one with '犬' in Japanese, the one with 'anjing'
> in Malaysian and so on.
>
> The final point is that the current English wordnets (and most recent
> wordnets) distinguish between hyponym and instance:
>   <<fictional character>> is a hyponym of <<imaginary being>>
>   <<Sherlock Holmes>> is an instance of <<fictional character>>
>
> I suspect we should try to capture this distinction.
>
> Orthogonally, I must admit to not being clear about the actual
> implications of choosing the different names/models proposed in the
> discussion, so I find it hard to judge which is better; if someone could
> try to summarize this it would be really helpful to me, and maybe to others.
>
> Yours,
>
> --
> Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
> Division of Linguistics and Multilingual Studies
> Nanyang Technological University
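For readers unfamiliar with the identifier convention Francis describes, here is a minimal sketch of how such identifiers might be parsed and produced. It assumes a three-letter lowercase language code, a two-digit version, an eight-digit offset and a single-letter part of speech, as in the example eng-30-08225481-n. The names `SynsetId`, `parse_synset_id` and `from_legacy` are hypothetical, made up for illustration, and are not part of wordnet-LMF or lemon.

```python
import re
from typing import NamedTuple


class SynsetId(NamedTuple):
    """Components of an identifier such as eng-30-08225481-n (illustrative only)."""
    language: str  # e.g. "eng"
    version: str   # e.g. "30" for WordNet 3.0
    offset: str    # kept as a string so leading zeros survive
    pos: str       # e.g. "n" for noun


# Assumed shape: LLL-VV-OOOOOOOO-P (lengths taken from the example in the thread).
_ID_PATTERN = re.compile(r"([a-z]{3})-(\d{2})-(\d{8})-([a-z])")

# Assumed shape of the older local name mentioned in the thread: syn_n_08225481.
_LEGACY_PATTERN = re.compile(r"syn_([a-z])_(\d{8})")


def parse_synset_id(identifier: str) -> SynsetId:
    """Split an identifier into its parts, or raise ValueError if it does not match."""
    match = _ID_PATTERN.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a LLL-VV-OOOOOOOO-P identifier: {identifier!r}")
    return SynsetId(*match.groups())


def from_legacy(legacy_id: str, language: str, version: str) -> str:
    """Rewrite syn_P_OOOOOOOO into the language- and version-qualified form."""
    match = _LEGACY_PATTERN.fullmatch(legacy_id)
    if match is None:
        raise ValueError(f"not a syn_P_OOOOOOOO identifier: {legacy_id!r}")
    pos, offset = match.groups()
    return f"{language}-{version}-{offset}-{pos}"
```

For example, `from_legacy("syn_n_08225481", "eng", "30")` gives the form used in the quoted message, and `parse_synset_id` recovers the language and version that the older name leaves implicit. Since URIs cannot be forced into a scheme, a check like this could only ever be advisory.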
Received on Friday, 19 April 2013 09:00:57 UTC