- From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
- Date: Tue, 16 Apr 2013 10:02:59 +0200
- To: public-ontolex@w3.org
- Message-ID: <516D05B3.6010205@cit-ec.uni-bielefeld.de>
Armando, all,

re point 2, on the order of senses: yes, according to the modelling proposed right now, this would be lost. However, I do not think this is a major issue, as we can add this information to the sense objects ;-) since they are unique for a particular word, i.e.

forall w_1, w_2, s: hasSense(w_1,s) & hasSense(w_2,s) -> w_1 = w_2

Is this something we could agree on?

Philipp.

On 15.04.13 19:41, Armando Stellato wrote:
>
> Hi all,
>
> First of all, thanks John for providing the example: through concrete examples it is easier to discuss!
>
> A few comments (the same “disclaimer” from Elena holds for me: I hope I didn’t miss anything from other discussions, and if I did, sorry in advance).
>
> 1) First of all (sorry, a bit off topic), I would ask for a clarification, so that I can apply the policy to my examples too: I see the “lemon:” prefix being used in many examples, and Lemon is an outcome of the Monnet project. Is it also the definitive name (or a temporary name) we are giving to the model we are developing in this community group? I’ve been using “ontolex:” as a fictitious prefix in my examples, and I figured “lemon” was being used by some of you because those of you working on Monnet started right from examples already built in the original lemon. Sorry for asking what seems to be trivial, but I never got any definitive statement on this, so, better to realign late than never :-D
> Btw, what is written in the last row of http://www.lemon-model.net/ seems to confirm my hypothesis.
>
> OK, back to the original topic. Consider that a few of these observations can actually be solved by completing the example, and do not necessarily clash with it (or, at least, do not clash with what has already been written, while I don’t know what was thought for the rest).
>
> 2) With respect to WordNet (which has explicitly ordered senses per word, where I think this order originates, at least for some of the words, from frequencies in SemCor) the sense ordering is lost: the synsets are bound to the words by means of the sole listing of values, which in plain RDF is unordered.
>
> 3) This is the most important observation: the use of lemon:sense. Together with lemon:reference, lemon:sense should realize the bridge from lexical entries to conceptual entities (of the domain ontology). Should we use it to reach the conceptual entities (e.g. synsets) of the lexical resource AS WELL? In terms of black-box compatibility, as we are modelling even conceptual info of lexical resources (e.g. synsets in WordNet) through some RDF language (e.g. SKOS), the thing is legal (the rdfs:range of lemon:sense, provided it is wide enough, is respected); still, I’m not sure we want that. In short, I’m not sure if we want to apply exactly the same 3-entities approach we are using for the lexicon-ontology model to modelling solely a lexical resource.
> Let’s make an example. We have myont:, which is a domain ontology (where we have the entry myont:vomit) that we are enriching with lexical content, possibly from WordNet. Then we need to represent a direct link between some lexical entries (which may happen to be in WordNet or not) and the domain entities of myont.
> We would thus have this example, which I derived from both the WordNet example and the generic OntoLex example for enriching an ontology with lexical content:
>
> <cat:v>
>     a lemon:LexicalEntry ;
>     lemon:sense <cat::2:29:0::>, <cat::2:35:0::> .
> <cat::2:29:0::>
>     a lemon:LexicalSense ;
>     lemon:reference <VerbSynset76400> ;
>     lemon:reference myont:vomit .
>
> Note that I’ve cut from the original example the triples that are not useful to the discussion.
>
> Actually, in writing this revised example, I’m not even sure if the two lemon:references should be put under the same sense umbrella, or if I should have used two different senses. This is mainly because I’m not sure about the concept of “sense” here and what it represents. I see potential for confusion even by looking at the Elena/John emails, as she rightly asks about the use of skos:definition instead of lemon:definition. While I’m not addressing here the use of one property or the other, the answer by John, hinting at the fact that there could be two definitions, one for a sense and one for a synset (and consider that there could also be a definition for the element in the ontology), makes me wonder how many levels we should have!
> Without delving too much into the appropriateness of this indirection for what concerns the lexicon-ontology interface, and considering the sole context of the representation of WordNet (thus just the lexicon perspective), to me the path from the LexicalEntry to the Synset is too long. In WordNet we just say that a word is linked to a synset: period (modulo the addition of an ordering). In particular, “sense” is a relation which just tells me that synset X is the i-th sense of word Y (and there’s a many-to-many relation between words and synsets).
>
> …and this brings me back to our first discussions about the choice of the term sense, when referring to the path from lexical entries to ontology elements, and about the nature of “elements-in-the-middle”.
> In my view (to avoid terminological problems, I focus here on the path between entities and do not name the linking properties at all, so please consider that all the arrows here have properties behind them, in particular lemon:sense and lemon:reference), when considering a mapping between a lexical resource such as WordNet and an ontology, I would have seen such a path:
>     LexicalEntry --> Synset --> OntologyResource
> where, without using WordNet, the path would have been:
>     LexicalEntry --> [] --> OntologyResource
> with [] a blank node creating this gluing between them.
> The second line is identical to what we have done until now and what has been written in the examples in the “Specification of Requirements/Lexicon-Ontology-Mapping”. In particular, the blank node is an instance of that element-in-the-middle (see: “Need for an object between Lexical Entry and Ontology”) which still does not have a name (and maybe it does not need one, see point 4 below). The first line is thus my interpretation of how WordNet would have fit into that general template (different from John’s example).
> So, my idea would be to not replicate the complex lexicon-ontology linking inside WordNet itself, and to have instead a direct link between lexical entries and synsets, and THEN, outside of WordNet, a further link to an ontology element. If you look at the two rows above (and how the WordNet case fits the general case), this is pretty elegant, and does not introduce a further level of indirection which appears unnecessary. Plus, with this method, the link from synsets to ontology elements is a necessary step to instantiate the path above, while in the other case you would have to introduce it as an additional (and probably redundant) triple.
> You can see it in fact in the Turtle code above, which I modelled following both the general example in “Specification of Requirements/Lexicon-Ontology-Mapping” and John’s example on WordNet: there, VerbSynset is a separate entity from myont:vomit. Actually, in that view, WordNet would become a separate “ontology” which could then be mapped to a domain ontology, instead of taking all the benefit of being seen as a lexical resource that can be used, seamlessly within our model, to enrich a domain ontology.
>
> 4) IMHO, we should coin a specific vocabulary for each element of the lexicon model, and then inherit (where appropriate) from SKOS/SKOS-XL, to distinguish such elements, which belong only to a lexical resource, from those of any generic KOS. In the wiki, John wonders whether what I called “SemanticIndex” is not a skos:Concept, and I reply: “yes it is, in fact my proposal is that our vocabulary for describing lexical resources can inherit from the SKOS/SKOS-XL one”. If you look at the example, even John did this, as the LexicalForm is nothing different from a skosxl:Label (where lemon:writtenRep could be replaced by skosxl:literalForm), though it may be worth creating a dedicated class.
> I would thus suggest:
>     LexicalForm rdfs:subClassOf skosxl:Label
> but to use skosxl:literalForm instead of lemon:writtenRep
>
> maybe, in this specific case, we can even avoid reinventing a name and totally reuse skosxl:Label, which after all is not so bad and fits our needs pretty well… (as it is already related to something specifically thought for language).
>
> By contrast, for LLD, I would necessarily restrict the class skos:Concept to the class of elements which we expect to host things like the WordNet Synset class.
> You can see my sample extension point above in the wiki (“Examples of Modelling in RDF (Alternative approach)”), though by no means do I suggest <SemanticIndex> (that was a placeholder, taken from a previous work); in any case I think “Sense” is not appropriate (lemon:sense evokes the sense relation well, while I don’t like to see a class of “Senses”; that is, to me being a sense is more a role in a given relationship than an intrinsic property of an object).
>
> a. While I think that a more-specific-than-skos:Concept class would be welcome for Lexical Linked Data (such as WordNet), and thus put in the middle of the LexicalEntry --> ??? --> OntologyResource template, I’m not sure that lemon:sense (the first arrow) should necessarily be restricted to it. John’s use of skos:Concept in the middle suggested to me that even a generic well-lexicalized KOS could be used for providing LexicalEntries and Senses to enrich an ontology. However, I’m still thinking about it…
>
> 5) Another thing which comes to my mind, quite outside the WordNet example, but not without consequences for it... What should be, in general, the expected modelling behaviour when we have two terms which coincide, but whose syntactic use can follow different paths? E.g., suppose we have a term with three senses. In the context of these senses, with two of them (say 1 and 2), the term has exactly identical variations (declensions for nouns, pronouns and adjectives, and conjugations for verbs), and maybe other information in common (think about etymology!), while for the third sense it may show differences in the variations (e.g. a noun would have a different plural form, or a verb a different form in only one tense, when used with that sense). Should we model them as 3 different lexical units, or should we merge the two identical ones into one LexicalEntry and link it to senses 1 and 2?
> This seems not to be related to modelling WordNet specifically, because variations, declensions etc. are outside WordNet. However, this may affect a model trying to reuse WordNet enriched with further information… Thus it’s important when we consider how a WordNet modelling could be ported inside an extended framework with no risk of inconsistency.
>
> I just thought about a solution for this: if we allow skosxl:Labels to be directly attached to Synsets (or whatever the superclass for them is), and then we state the following rule:
>     LexicalEntry -> lemon:canonicalForm -> skosxl:Label
>     LexicalEntry -> lemon:sense -> <asynset>
>     ------------------------------------
>     skosxl:Label -> ???:sense (whatever it is called) -> <asynset>
>
> this would allow for the complex structure we expect in general, but also allow for a more neutral fit of WordNet. In fact, instead of having the third triple as inferred, for WordNet we could just state the third one explicitly, and not add potentially compromising information (which, in any case, is outside WordNet, as noted by John in his reply to Elena).
> The “???:sense (whatever it is called)” could even be lemon:sense itself, provided that its domain is LexicalEntry+skosxl:Label. However, I still have to think more about that…
>
> One more thing: the observation in point 2 above made me think once more that we should be clearer about our objectives:
>
> Fact: since we have to model ontology-lexicon interfaces, and there isn’t much out there for representing lexical info (limited to RDF, I mean), we have to provide a model for the linguistic part before “attaching” it to the ontology part. Now, the objective could be:
>
> 1) We want to model lexical knowledge, and we give a model for this. WordNet may be (in part) more fine-grained than our model… no big trouble, WordNet is WordNet, and our model is our model… we’ll be missing those details.
>
> a. A slightly different interpretation of the above: we want to model lexical knowledge, AND we decide WordNet IS the model (at least for the monolingual word-description needs; I leave out FrameNet and the like from this context of discussion). No big deal with other alternative resources to WordNet.
>
> 2) We want to model existing lexical resources. Thus WordNet, as well as other resources (maybe differently organized), are all important.
>
> Obviously, there are endless shades between the above, as we could be in case 1 or 2 and still think WordNet is so important that it has to be fully covered (also because, in this way, Princeton could decide to natively output each new release of WordNet in RDF too, according to our model).
>
> Cheers,
>
> Armando
>
> P.S.: I’ve made a couple of small fixes to the page:
> http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements/Lexicon-Ontology-Mapping#Summary_on_Requirements_on_the_Lexicon-Ontology-Mapping_.28Synthesis_by_PC.29
> which we already discussed 2 or 3 meetings ago.
>
> *From:* johnmccrae@gmail.com [mailto:johnmccrae@gmail.com] *On Behalf Of* John McCrae
> *Sent:* Friday, 12 April 2013 16:10
> *To:* public-ontolex
> *Subject:* WordNet modelling in Lemon and SKOS
>
> Hi all,
>
> Here is the proposed modelling of WordNet as lemon and SKOS (using skos:Concept for synsets)
>
> http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements/Linked_Data#Example:_WordNet_as_lemon-SKOS
>
> Any comments?
>
> Regards,
>
> John
>

--
Prof. Dr. Philipp Cimiano
Semantic Computing Group
Excellence Cluster - Cognitive Interaction Technology (CITEC)
University of Bielefeld

Phone: +49 521 106 12249
Fax: +49 521 106 12412
Mail: cimiano@cit-ec.uni-bielefeld.de

Room H-127
Morgenbreede 39
33615 Bielefeld
Received on Tuesday, 16 April 2013 08:03:59 UTC
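
The reply at the top of this message suggests keeping WordNet's sense order on the sense objects, relying on each sense belonging to exactly one word. A minimal Turtle sketch of that idea follows. The ex: namespace, the ex:senseRank property, the entry and sense names, and the rank values are all assumptions made for illustration; none of them is defined by lemon or WordNet. Declaring lemon:sense inverse-functional is one possible way to write the condition hasSense(w_1,s) & hasSense(w_2,s) -> w_1 = w_2, not something the thread agrees on.

```turtle
@prefix lemon: <http://www.lemon-model.net/lemon#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix ex:    <http://example.org/wn#> .   # hypothetical namespace

# A sense is shared by at most one lexical entry:
# hasSense(w_1,s) & hasSense(w_2,s) -> w_1 = w_2
lemon:sense a owl:InverseFunctionalProperty .

# Hypothetical property recording the position of a sense for its word.
ex:senseRank a owl:DatatypeProperty .

ex:cat-v a lemon:LexicalEntry ;
    lemon:sense ex:cat-v-senseA , ex:cat-v-senseB .

# Rank values are illustrative only, not taken from WordNet.
ex:cat-v-senseA a lemon:LexicalSense ;
    ex:senseRank 1 .

ex:cat-v-senseB a lemon:LexicalSense ;
    ex:senseRank 2 .
```

Because each sense hangs off a single entry, the rank on the sense is unambiguous, and the unordered list of lemon:sense values no longer needs to carry the order. Note that under OWL semantics the inverse-functional axiom would lead a reasoner to infer that two entries sharing a sense are the same individual rather than report a violation, so a validation step would be needed if the intent is to reject such data.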