- From: John McCrae <jmccrae@cit-ec.uni-bielefeld.de>
- Date: Tue, 16 Apr 2013 12:42:20 +0200
- To: Piek Vossen <piek.vossen@vu.nl>
- Cc: Armando Stellato <stellato@info.uniroma2.it>, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>, public-ontolex <public-ontolex@w3.org>, Jacco van Ossenbruggen <Jacco.van.Ossenbruggen@cwi.nl>
- Message-ID: <CAC5njqpV6a=gMFeqaJ+4QkG52AN57wPGw3-DeWvMrYdd48FBFg@mail.gmail.com>
Hi,

I agree, WordNet uses specific sense indexes, distinct from the synset
identifiers, and I think we must therefore have a named URI for both the
synset and the sense itself.

Regards,
John

On Tue, Apr 16, 2013 at 12:38 PM, Piek Vossen <piek.vossen@vu.nl> wrote:

> Dear all,
>
> I have been silent for a while because I am/was too busy to keep track of
> all this. However, I feel the need to jump in now. Perhaps you already
> discussed this and my comments are not of any use. Sorry for raising it
> and do not bother.
>
> We had many discussions in the GWA community about sense identifiers and
> synset identifiers. The consensus is that we need both. For ontologically
> minded people synset ids for concepts are enough. However, not only the
> order of the senses is important (it often reflects frequency) but there
> are also many relations in various wordnets that hold only between lexical
> units (senses of a word that belong to different synsets): derivational
> relations, metonymy, metaphor, specialization, generalization etc. Another
> point is that in WSD approaches people use sense-groups (possibly based on
> the previous relations). Sense-groups consist of sense identifiers rather
> than synset identifiers.
>
> In addition to the concept to concept relations, we thus need identifiers
> for sense relations. In the W3C RDF version of Wordnet, they made the
> mistake to use only sense-keys to identify concepts. I hope here, you are
> not making the reverse mistake to use only synset ids.
>
> best wishes
>
> Piek
>
> On Apr 16, 2013, at 11:57 AM, Armando Stellato wrote:
>
> That was what I thought too, and actually, this would “give more sense to
> LexicalSense” (sorry for the pun :-) ), as at least the reification of
> senses would allow for an easy modelling of their ordering, by simply
> attaching it as a property.
> Still, I’m not convinced about the necessity of their existence, when
> modelling lexical resources.
> Or better, they are ok, but, in the case of WordNet, they are actually
> (IMHO) the synsets.
> I will add more in the reply to Philipp, as there are further examples
> there to comment.
> Cheers,
> Armando
>
> From: johnmccrae@gmail.com [mailto:johnmccrae@gmail.com] On Behalf Of
> John McCrae
> Sent: Tuesday, 16 April 2013 10:19
> To: Philipp Cimiano
> Cc: public-ontolex
> Subject: Re: order of senses
>
> You are quite right, this is an important and explicit part of the WordNet
> data model and should be preserved.
>
> I believe including a senseNumber data property would cover this. Here is
> the reference to the original WordNet documentation on this:
> http://wordnet.princeton.edu/wordnet/man/wndb.5WN.html#toc4
>
> Example:
>
>   <cat:v> a lemon:LexicalEntry ;
>     lemon:sense <cat::2:29:0::>, <cat::2:35:0::> .
>
>   <cat::2:29:0::> a lemon:LexicalSense ;
>     wordnet:senseNumber "6"^^xsd:integer ;
>     lemon:reference <VerbSynset76400> .
>
> Regards,
> John
>
> On Tue, Apr 16, 2013 at 10:02 AM, Philipp Cimiano
> <cimiano@cit-ec.uni-bielefeld.de> wrote:
>
> Armando, all,
>
> re point 2: on the order of senses...
>
> Yes, according to the modelling proposed right now, this would be lost.
> However, I do not think this is a major issue as we can add this
> information to the sense objects ;-) as they are unique for a particular
> word, i.e.
>
>   forall w_1,w_2,s hasSense(w_1,s) & hasSense(w_2,s) -> w_1=w_2
>
> Is this something we could agree on?
>
> Philipp.
>
> On 15.04.13 19:41, Armando Stellato wrote:
>
> Hi all,
> First of all, thanks John for providing the example: through concrete
> examples it is easier to discuss!
>
> A few comments (the same “disclaimer” from Elena holds for me: hope I
> didn’t miss anything from other discussions, and in case, sorry in advance).
>
> 1) First of all (sorry, a bit off topic), I would ask for a
> clarification, so that I can apply the policy to my examples too: I see the
> “lemon:” prefix being used in many examples, and Lemon is an outcome of
> the Monnet project. Is it also the definitive name (or a temporary name) we
> are giving to the model we are developing in this community group? I’ve been
> using “ontolex:” as a fictitious prefix in my examples, and just figured
> “lemon” was being used by some of you, because those of you working on
> Monnet have started right from examples they already built in the original
> lemon. Sorry for asking what seems to be trivial, but I never got any
> definitive statement on this, so, better to realign late than never :-D
> Btw, what is written at the last row of: http://www.lemon-model.net/ seems
> to confirm my hypothesis.
>
> ok..back to the original topic. Consider that a few of these observations
> can actually be solved by completing the example, and do not necessarily
> clash with it (or, at least, do not clash with what has been already
> written, while I don’t know what was thought for the rest).
>
> 2) With respect to Wordnet (which has explicitly ordered senses per
> word, where I think this order originates, at least for some of the words,
> from frequencies in SemCor) the sense ordering is lost: the synsets are
> bound to the words by means of the sole listing of values, which in plain
> RDF is unordered.
>
> 3) This is the most important observation: the use of lemon:sense.
> Together with lemon:reference, lemon:sense should realize the bridge from
> lexical entries to conceptual entities (of the domain ontology). Should we
> use it to reach the conceptual entities (e.g. synsets) of the lexical
> resource AS WELL? In terms of black-box compatibility, as we are modelling
> even conceptual info of lexical resources (e.g. synsets in wordnet) through
> some RDF language (e.g. SKOS), the thing is legal (the rdfs:range of
> lemon:sense, providing it is wide enough, is respected), still I’m not sure
> we want that. In short, I’m not sure if we want to apply exactly the same
> 3-entities approach we are using for the lexicon-ontology model, to
> modelling solely a lexical resource.
>
> Let’s make an example. We have myont: which is a domain ontology (where we
> have the entry myont:vomit) we are enriching with lexical content, possibly
> from wordnet. Then we have the necessity of representing a direct linking
> between some lexical entries (which may happen to be in wordnet or not) and
> the domain entities of myont.
> We would have thus this example, which I derived from both the WordNet
> example, and the generic OntoLex example for enriching an ontology with
> lexical content:
>
>   <cat:v>
>     a lemon:LexicalEntry ;
>     lemon:sense <cat::2:29:0::>, <cat::2:35:0::> .
>
>   <cat::2:29:0::>
>     a lemon:LexicalSense ;
>     lemon:reference <VerbSynset76400> ;
>     lemon:reference myont:vomit .
>
> Note that I’ve cut from the original example the triples which are
> not useful to the discussion.
>
> Actually, in writing this revised example, I’m not even sure if the two
> lemon:references should be put under the same sense umbrella, or I should
> have used two different senses. This is mainly because I’m not sure about
> the concept of “sense” here and what it represents. I see potential for
> confusion even by looking at the Elena/John emails, as she rightly asks
> about the use of skos:definition instead of lemon:definition. While I’m not
> addressing here the use of a property or the other, the answer by John,
> hinting at the fact that there could be two definitions, one for a sense,
> and one for a synset (and consider that there could be a definition for the
> element in the ontology), makes me wonder how many levels we should have!
>
> Without delving too much in the appropriateness of this indirection for
> what concerns the lexicon-ontology interface, and considering the sole
> context of the representation of Wordnet (thus just the lexicon
> perspective), to me the path from the LexicalEntry to the Synset is too
> long. In wordnet we just say that a word is linked to a synset: period
> (modulo the addition of an ordering). In particular, “sense” is a relation
> which just tells me that synsetX is the i-th sense of word Y (and there’s a
> many-to-many rel between words and synsets).
>
> …and this brings me back to our first discussions about the choice of the
> term sense, when referring to the path from lexical entries to ontology
> elements and about the nature of “elements-in-the-middle”.
> In my view (to avoid terminological problems, I focus here on the path
> between entities, and do not name the linking properties at all, so pls
> consider all the arrows here have properties behind, in particular
> lemon:sense and lemon:reference), when considering a mapping between a
> lexical resource such as Wordnet, and an ontology, I would have seen such a
> path:
>
>   LexicalEntry --> Synset --> OntologyResource
>
> where, without using WordNet, the path would have been:
>
>   LexicalEntry --> [] --> OntologyResource
>
> with [] a blanknode creating this gluing between them.
> The second line is identical to what we have done until now and what has
> been written in the examples in the “Specification of
> Requirements/Lexicon-Ontology-Mapping”. In particular, the blanknode is an
> instance of that element-in-the-middle (see: “Need for an object between
> Lexical Entry and Ontology”) which still has not a name (and maybe it does
> not need to have, see point 4 below). The first line is thus my
> interpretation of how WordNet would have fit into that general template
> (different from John’s example).
>
> So, my idea would be to not replicate the complex lexicon-ontology linking
> inside WordNet itself, and have instead a direct linking between lexical
> entries and Synsets, and have THEN, outside of WordNet, a further link to
> an ontology element. If you look at the two rows above (and how the WordNet
> case fits the general case), this is pretty elegant, and does not introduce
> a further level of indirection which appears not necessary. Plus, with this
> method, the link from synsets to ontology elements is a necessary step to
> instantiate the path above, while in the other case, you should introduce
> it as an additional (and probably redundant) triple. You can see it in fact
> in the turtle code above, which I modelled following both the general
> example in “Specification of Requirements/Lexicon-Ontology-Mapping” and
> John’s example on WordNet: there, VerbSynset is a separate entity from
> myont:vomit. Actually, in that view, WordNet would become a separate
> “ontology” which could then be mapped to a domain ontology, instead of
> taking all the benefit of being seen as a lexical resource that can be
> used, seamlessly within our model, to enrich a domain ontology.
>
> 4) IMHO, we should coin a specific vocabulary for each element of
> the lexicon model, and then inherit (where appropriate) from SKOS/SKOSXL,
> to distinguish such elements which belong only to a lexical resource from
> those of any generic KOS. In the wiki, John wonders if what I called
> “SemanticIndex” is not a skos:Concept, and I reply: “yes it is, in fact my
> proposal is that our vocabulary for describing lexical resources can
> inherit from the SKOS/SKOS-XL one”. If you look at the example, even John
> did this, as the LexicalForm is nothing different from a skosxl:Label
> (where lemon:writtenRep could be replaced by skosxl:literalForm) though it
> may be worth creating a dedicated class.
> I would thus suggest:
>
>   LexicalForm rdfs:subClassOf skosxl:Label
>
> but to use skosxl:literalForm instead of lemon:writtenRep
>
> maybe, in this specific case, we can even not reinvent a name, and totally
> reuse the skosxl:Label, which after all is not so bad and pretty fitting
> our necessities… (as it is already related to something specifically
> thought for language).
>
> On the contrary, for LLD, I would necessarily restrict the class
> skos:Concept to the class of elements which we expect to host things like
> the WordNet Synset class. You can see my sample extension-point above in
> the wiki (“Examples of Modelling in RDF (Alternative approach)”), though by
> no means do I suggest <SemanticIndex> (that was a placeholder, taken from a
> previous work), but in any case I think “Sense” is not appropriate
> (lemon:sense well evokes the sense relation, while I don’t like to see a
> class of “Senses”, that is, to me being a sense is more a role in a given
> relationship, than an intrinsic property of an object).
>
> a. While I think that a more-specific-than-skos:Concept class would
> be welcome for Lexical Linked Data (such as WordNet), and thus put in the
> middle of the: LexicalEntry --> ??? --> OntologyResource template, I’m not
> sure that the lemon:sense (first arrow) should be necessarily restricted to
> it. John’s use of skos:Concept in the middle suggested to me that even a
> generic well-lexicalized KOS could be used for providing LexicalEntries and
> Senses to enrich an ontology. However, I’m still thinking about it…
>
> 5) Another thing which comes to my mind, quite out of the WordNet
> example, but not without consequences for it... What should be, in general,
> the expected modelling behaviour when we have two terms which coincide, but
> the syntactic use of which can follow different paths?
> E.g., suppose we have a term with three senses.
> In the context of these senses, with two of them (say 1 and 2), the term
> has exactly identical variations (declensions for nouns, pronouns and
> adjectives, and conjugations for verbs), and maybe other information in
> common (think about etymology!), while for the third sense, this may show
> differences in the variations (e.g. a noun would have a different plural
> form, or a verb has a different form in only one tense, when used with that
> sense). Should we model them as 3 different lexical units, or should we
> agglomerate the two identical ones into one LexicalEntry, and link it to
> senses 1 and 2?
> This seems to be not related to modeling WordNet in the specific, because
> variations, declensions etc. are out of WordNet. However, this may affect
> a model trying to reuse WordNet enriched with further information… Thus
> it’s important when we consider how a WordNet modelling could be ported
> inside an extended framework with no risk of inconsistency.
>
> I just thought about a solution for this: if we allow for skosxl:Labels to
> be directly attached to Synsets (or whatever it is the superclass for
> them), and then we state the following rule:
>
>   LexicalEntry -> lemon:canonicalForm -> skosxl:Label
>   LexicalEntry -> lemon:sense -> <asynset>
>   ------------------------------------
>   skosxl:Label -> ???:sense (whatever it is called) -> <asynset>
>
> this would allow for the complex structure we expect in general, but also
> allow for a more neutral fit of WordNet. In fact, instead of having the
> third triple as inferred, for WordNet we could just explicitly mention the
> third one, and do not put potentially compromising information (which, in
> any case, is out of WordNet, as noted by John in his reply to Elena).
> The “???:sense (whatever it is called)” could even be lemon:sense itself,
> providing that its range is LexicalEntry+skosxl:Label.
> However, I still have to think more about that…
>
> One more thing, observation in point 2 above made me think once more that
> we should be clearer in our objectives:
> Fact: since we have to model ontology-lexicon interfaces, and there isn’t
> much out there for representing lexical info (limited to RDF, I mean); we
> have thus to provide a model for the linguistic part, before “attaching” it
> to the ontology part. Now, the objective could be:
>
> 1) We want to model lexical knowledge, and we give a model for this.
> WordNet may be (in part) more fine grained than our model…no big trouble,
> WordNet is WordNet, and our model is our model… we’ll be missing those
> details..
>
> a. A slightly different interpretation of the above: we want to
> model lexical knowledge, AND we decide WordNet IS the model (at least for
> the monolingual word-description needs..I leave out FrameNet et similia
> from this context of discussion). No big deal with other alternative
> resources to WordNet..
>
> 2) We want to model existing lexical resources.
> Thus WordNet, as well as other resources (maybe differently organized)
> are all important.
>
> Obviously, there are endless colours in the middle of the above, as we
> could be in case 1 or 2, and still think WordNet is so important that it
> has to be fully covered (also because, in this way, Princeton could decide
> to natively output each new release of WordNet in RDF too according to our
> model).
>
> Cheers,
>
> Armando
>
> P.S: I’ve brought a couple of small fixes to the page:
> http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements/Lexicon-Ontology-Mapping#Summary_on_Requirements_on_the_Lexicon-Ontology-Mapping_.28Synthesis_by_PC.29
> which we already discussed 2 or 3 meetings ago.
>
> From: johnmccrae@gmail.com [mailto:johnmccrae@gmail.com] On Behalf Of
> John McCrae
> Sent: Friday, 12 April 2013 16:10
> To: public-ontolex
> Subject: WordNet modelling in Lemon and SKOS
>
> Hi all,
>
> Here is the proposed modelling of WordNet as lemon and SKOS (using
> skos:Concept for synsets)
>
> http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements/Linked_Data#Example:_WordNet_as_lemon-SKOS
>
> Any comments?
>
> Regards,
> John
>
> --
> Prof. Dr. Philipp Cimiano
> Semantic Computing Group
> Excellence Cluster - Cognitive Interaction Technology (CITEC)
> University of Bielefeld
>
> Phone: +49 521 106 12249
> Fax: +49 521 106 12412
> Mail: cimiano@cit-ec.uni-bielefeld.de
>
> Room H-127
> Morgenbreede 39
> 33615 Bielefeld
Received on Tuesday, 16 April 2013 10:42:50 UTC