Re: WordNet modelling in Lemon and SKOS from Aldo Gangemi on 2013-04-25 (public-ontolex@w3.org from April 2013)

From: Aldo Gangemi <aldo.gangemi@cnr.it>
Date: Thu, 25 Apr 2013 17:00:16 +0200
To: John McCrae <jmccrae@cit-ec.uni-bielefeld.de>
Cc: Aldo Gangemi <aldo.gangemi@cnr.it>, Armando Stellato <stellato@info.uniroma2.it>, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>, public-ontolex <public-ontolex@w3.org>
Message-Id: <3B660A41-01BF-4E50-BF41-24790284C1CD@cnr.it>
John, surely I can agree to disagree, but I think we still have room for clarification :)

I would like to distinguish between:

1) structural issues: forms that knowledge can take due to syntactic constraints in KR languages
2) "de re" ("about the thing") issues
3) "de dicto" ("about what is said") issues

I am afraid we are mixing them all:

(2) vs. (3): a wnschema:WordSense wn:word-sense-book is a name (de dicto) for a "word sense" in the WordNet ontology (de re)

(1) wnschema:WordSense can be represented as an owl:Class, or as a uml:AssociationClass, or as a fol:Function, or …

From your sentence:

> "it is clear that a lexical sense can be "reduced" to a lexeme (a collection of forms or LexicalEntry)"

I cannot imagine any reduction: we have a lexeme, which is a lexical entry for a collection of forms and therefore still an expression, and we have a word sense, which is the meaning of the lexeme. Two things. As a matter of fact, most dictionaries have one lexeme for multiple word senses: lexemes are only distinguished when the history of the lexeme is different (homographs). Hence a lexeme used as a lexical entry is hardly a meaning: its word senses are meanings. Similarly in WordNet: certain relations hold for expressions, others for lexical senses.

Now, a word sense, e.g. wn:word-sense-book, is a way of asserting the presence of an intensional meaning, but since we have to use symbols to communicate, we give it a name. And that name is a semio:Expression, no surprise … but I could call it "MickeyMouse", and the lexical sense will still be the same, i.e. depending on the lexeme wn:word-book, if I intend to talk about books instead of Mickey Mouse (the comics character).

The same applies to all linguistic games, including lexical resources, ontologies, etc. Let me give a more mundane example.

Let's take the owl:Class "Book". 
It is an extensional entity (its formal interpretation as the class of all books I want to consider as such); this feature is the main interest in OntoLex I suppose: we want to consider it a semio:Reference. 
Certainly, "Book" can be also an intensional entity (my conceptualization of books); in that case, we might want to introduce another entity of type semio:Meaning. 
And certainly, "Book" has also an identifier: <http://www.foo.org/myont/Book>, whose qname is "Book", therefore we might be interested in introducing also an entity of type semio:Expression. 
The reason why we may want to introduce one, two, or three entities for the owl:Class "Book" totally depends on our requirements; still, it's quite easy to distinguish between them if we comply with semiotics.
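
To make the three options concrete, here is a minimal Turtle sketch of the "Book" case. The semio: and ex: namespace IRIs and the properties ex:conceptualizes and ex:qname are placeholders invented for this illustration (the actual semiotics.owl IRI is not given in this thread); only semio:Reference, semio:Meaning, semio:Expression and semio:expressedBy are names used in the discussion.

```turtle
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix semio: <http://example.org/semio#> .     # placeholder IRI (assumption)
@prefix myont: <http://www.foo.org/myont/> .
@prefix ex:    <http://example.org/sketch#> .    # hypothetical helper names

# (a) Extensional reading: the class itself, punned as an individual, is a reference.
myont:Book a owl:Class , semio:Reference .

# (b) Intensional reading: a separate entity for my conceptualization of books.
ex:BookConcept a semio:Meaning ;
    ex:conceptualizes myont:Book ;        # hypothetical property
    semio:expressedBy ex:BookName .

# (c) The identifier as a symbol: an expression whose qname is "Book".
ex:BookName a semio:Expression ;
    ex:qname "Book" .                     # hypothetical property
```

Which of (a), (b) and (c) one actually asserts depends on the requirements; an application may well keep only (a).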

For example, if I want to represent lexica in OWL, I'll use OntoLex with a default assumption of semio:Reference type for elements from an ontology, and a default assumption of semio:Meaning for word senses, synsets, frames, etc. from lexica. 
However, if I use string-matching techniques to align the names of classes in different ontologies, it's the expressions that are used by the algorithm. We may desire or not to make that explicit.

A concrete application: our Tìpalo tool [1] extracts an RDF graph from a definitional text from Wikipedia, and disambiguates it against WordNet, which is in turn aligned to DOLCE so that minimal ambiguity is left when adopting an extensional value for WordNet synsets.
Here, the word sense disambiguation component takes the definition (basically a semio:Meaning) as a string, and associates a word sense or a synset to some of the words from the definition. That is, WSD works on expressions to infer meanings (via appropriate expressions that are vectorized out of relevant text for word senses or synsets, e.g. parallel corpora).
Then the alignment to DOLCE works on meanings (synsets or word senses) to derive ontology references (via mapping synset intension to DOLCE classes intension). 
These assumptions are *stipulated* for this particular application, but I do not want to assume by default that lexical senses are references or expressions.
That's what I call both a rigorous and flexible ontology for semiotics :)
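
As a rough illustration of the two steps above, the following Turtle sketch shows what each step could assert for a single word. Everything in the ex: namespace (the token, the DOLCE-aligned class, and the linking properties) is hypothetical and is not Tìpalo's actual output vocabulary; the semio: and wn: IRIs are placeholders as well.

```turtle
@prefix semio: <http://example.org/semio#> .    # placeholder IRI (assumption)
@prefix wn:    <http://example.org/wordnet/> .  # placeholder IRI (assumption)
@prefix ex:    <http://example.org/sketch#> .   # hypothetical names throughout

# Input: a word occurring in the definition string, i.e. an expression.
ex:token-book a semio:Expression ;
    ex:surfaceForm "book" .

# Step 1 (WSD): expression -> meaning.
ex:token-book ex:disambiguatedAs wn:word-sense-book .
wn:word-sense-book a semio:Meaning .

# Step 2 (alignment to DOLCE): meaning -> ontology reference.
wn:word-sense-book ex:alignedTo ex:DolceAlignedBookClass .
ex:DolceAlignedBookClass a semio:Reference .
```

The typing of each node is exactly the stipulation mentioned above: it holds for this application and is not a default of the model.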

Sorry for the long reply, but our discussions (until now) convince me more and more that adopting a triangular semantics for the ontolex interface is not only logically and linguistically correct, but also very practical.

Aldo

[1] http://wit.istc.cnr.it/stlab-tools/tipalo



On Apr 25, 2013, at 11:22:00 AM , John McCrae <jmccrae@cit-ec.uni-bielefeld.de> wrote:

> Hi Aldo,
> 
> Maybe we will have to agree to disagree here. But I don't get your argument: You say "In all these cases, something stays the same: it's the intensional meaning of "LexicalSense""... however something else also stays the same, the set of forms that can be used to express this concept (this is the distinction between LexicalSense and LexicalConcept). By that token, does it not follow by the same argument that LexicalSense is a subclass of semio:Expression? In fact, it is clear that a lexical sense can be "reduced" to a lexeme (a collection of forms or LexicalEntry).
> 
> Regards,
> John
> 
> 
> On Wed, Apr 24, 2013 at 6:42 PM, Aldo Gangemi <aldo.gangemi@cnr.it> wrote:
> 
> On Apr 24, 2013, at 5:49:02 PM , John McCrae <jmccrae@cit-ec.uni-bielefeld.de> wrote:
> 
>> 
>> 
>> 
>> On Wed, Apr 24, 2013 at 5:29 PM, Aldo Gangemi <aldo.gangemi@gmail.com> wrote:
>> Hi,
>> 
>> 
>> On Wed, Apr 24, 2013 at 11:10 AM, John McCrae <jmccrae@cit-ec.uni-bielefeld.de> wrote:
>> Hi all,
>> 
>> I am glad we are close to an understanding :)
>> 
>> I agree that WordNet's synset could be a subclass of a Lexical Concept class, however might it not make more sense (especially with respect to dissemination) to just call it Synset?
>> 
>> Note: LexicalSense cannot be a subclass of semio:Meaning, it should be a subtype of the tuple (semio:Expression,semio:Meaning)
>> 
>> I do not understand this. A class cannot be a subclass of a tuple, unless the (set of) tuple(s) is reified, and then becomes a class as well, which is what Armando intended (please confirm :)).
>> In all cases, if you mean that a word sense is dependent on a (unique) expression and a (unique) synset, that's easily captured in OWL:
>> 
>> ontolex:LexicalSense rdfs:subClassOf semio:Meaning .
>> (unique expression:)
>> ontolex:LexicalSense rdfs:subClassOf _:restriction .
>> _:restriction rdf:type owl:Restriction .
>> _:restriction owl:onProperty semio:expressedBy .
>> _:restriction owl:onClass :LexiconExpression .
>> _:restriction owl:qualifiedCardinality "1"^^xsd:nonNegativeInteger .
>> (unique synset:)
>> ontolex:LexicalSense rdfs:subClassOf _:restriction1 .
>> _:restriction1 rdf:type owl:Restriction .
>> _:restriction1 owl:onProperty wordnet:inSynset .
>> _:restriction1 owl:onClass wordnet:Synset .
>> _:restriction1 owl:qualifiedCardinality "1"^^xsd:nonNegativeInteger .
>> 
>> An alternative design pattern can be applied by defining new owl:FunctionalProperty(ies) that are subproperties of e.g. semio:expressedBy and wordnet:inSynset.
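>> 
>> A minimal sketch of that alternative pattern follows. The property names ontolex:senseExpressedBy and ontolex:senseInSynset and all namespace IRIs are hypothetical, chosen only for illustration.
>> 
>> ```turtle
>> @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
>> @prefix owl:     <http://www.w3.org/2002/07/owl#> .
>> @prefix semio:   <http://example.org/semio#> .     # placeholder IRI (assumption)
>> @prefix wordnet: <http://example.org/wordnet#> .   # placeholder IRI (assumption)
>> @prefix ontolex: <http://example.org/ontolex#> .   # placeholder IRI (assumption)
>> @prefix :        <http://example.org/lexicon#> .   # placeholder IRI (assumption)
>> 
>> # At most one expression per sense (hypothetical property name).
>> ontolex:senseExpressedBy a owl:ObjectProperty , owl:FunctionalProperty ;
>>     rdfs:subPropertyOf semio:expressedBy ;
>>     rdfs:range :LexiconExpression .
>> 
>> # At most one synset per sense (hypothetical property name).
>> ontolex:senseInSynset a owl:ObjectProperty , owl:FunctionalProperty ;
>>     rdfs:subPropertyOf wordnet:inSynset ;
>>     rdfs:range wordnet:Synset .
>> 
>> # At least one expression per sense.
>> ontolex:LexicalSense rdfs:subClassOf [
>>     a owl:Restriction ;
>>     owl:onProperty ontolex:senseExpressedBy ;
>>     owl:someValuesFrom :LexiconExpression
>> ] .
>> ```
>> 
>> Note that owl:FunctionalProperty only gives "at most one" value; "at least one" still needs an existential restriction, as shown for the first property.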
>>  
>> I meant that if LexicalSense is a reification of a link, its type should be Tuple<semio:Expression,semio:Meaning>. Of course, as OWL does not support any kind of generic typing, this is slightly irrelevant, but in systems that do, it should not in general be the case that:
>> 
>> Tuple<A,B> ⊑ A
>> 
>> Hence my understanding that the LexicalSense is not a semio:Meaning.
> 
> Dear John, 
> 
> I suggest distinguishing structural vs. semantic issues. Being a tuple is just a structural fact: in a tuple I can represent a lot of different creatures: events, facts, relations, situations, truth conditions, functions, … but typically we do not assume that they are the same kind of stuff only because they can be represented as tuples, or because they are all reifications of a link.
> 
> In other words, it's fine to say that a lexical sense is representable as a tuple from the universe <semio:Expression,semio:Meaning>, but it can be representable as well as an individual of a class, or as a function over expressions. In all these cases, something stays the same: it's the intensional meaning of "LexicalSense", which (sorry for talking semiotics about semiotic entities!) is a subclass of Meaning, since it is expressed by expressions, and/or can be the conceptualization of a (collection of) references. As a matter of fact, when we are sure about its extensional interpretation, a lexical sense can be "lifted" as an ontology class or individual.
> 
> Aldo
> 
>>  
>> 
>>  
>> 
>> I would however be strongly in favour of having the following path still in the model:
>> 
>> LexicalEntry --sense--> LexicalSense --reference--> (OntologyEntity)*
>> 
>> The primary reason for this is simply to allow for backwards compatibility with the current lemon model.
>> 
>> Furthermore, I think that the distinction Aldo makes between type A and type B modelling requirements is valid and important. In particular, it seems that type A modelling will involve not using an ontology entity (using a three-element path like below) and type B modelling will not use LexicalConcept (using a path as above). 
>> 
>> LexicalEntry --sense--> LexicalSense --lexConcept--> LexicalConcept
>> 
>> There is another option as well: a type AB modelling, where there is both intensional and extensional modelling, or, more commonly, someone wishes to link a type A resource to a type B resource. So we need a link between the Lexical Concept and the Ontology Entity (as exists in all proposals).
>> 
>> LexicalConcept --conceptualizes--> (OntologyEntity)
>> 
>> However, this has a drawback, in that it allows equivalent paths in the model, namely sense/reference and sense/lexConcept/conceptualizes. This makes the model harder to apply and brings back the discussion of Philipp's shortcut property between LexicalEntries and OntologyEntity. Therefore there are two options:
>> 1. Fix the model as a four-element path (sense/lexConcept/conceptualizes) and drop other properties (e.g., reference)
>> 2. Allow for ambiguity in the expression of the ontology-lexicon connection (in fact Philipp's shortcut would now be 'denotes' in my proposal)
>> While I don't like either option, I would have to admit that 2 is probably better.
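>> 
>> To make the redundancy visible, here is a small Turtle sketch of the two equivalent paths for one entry. The namespace IRIs and the ex: individuals are placeholders; the property names follow the draft names used in this thread. The final property-chain axiom is only one conceivable way to keep the two paths consistent under option 2, and is an assumption of this sketch rather than something proposed here.
>> 
>> ```turtle
>> @prefix owl:     <http://www.w3.org/2002/07/owl#> .
>> @prefix ontolex: <http://example.org/ontolex#> .   # placeholder IRI (assumption)
>> @prefix ex:      <http://example.org/sketch#> .    # hypothetical individuals
>> 
>> # Short path: sense/reference
>> ex:book-entry  ontolex:sense      ex:book-sense .
>> ex:book-sense  ontolex:reference  ex:BookClass .
>> 
>> # Long path: sense/lexConcept/conceptualizes
>> ex:book-sense    ontolex:lexConcept      ex:book-concept .
>> ex:book-concept  ontolex:conceptualizes  ex:BookClass .
>> 
>> # Assumption: derive the short path from the long one (OWL 2 property chain).
>> ontolex:reference owl:propertyChainAxiom ( ontolex:lexConcept ontolex:conceptualizes ) .
>> ```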
>> 
>> The second clear issue that comes from this modelling has to do with the levels of annotation/linking. By this I mean that we need to be clear in the model about which annotations and relationships should be part of the LexicalSense vs. LexicalConcept vs. OntologyEntity.
>> 
>> My guess is the following holds:
>> 
>> LexicalSense
>> ------------
>> 
>> * Register
>> * Translation
>> * Sense examples
>> * (Some) selection restrictions (e.g., 'gehen'/'fahren'@de... 'ageru'/'kureru'/'kudasaru'@ja-Latn)
>> 
>> 
>> The following relations were already assigned domains and ranges based on WordNet assumptions in the WordNet-OWL schema:
>> 
>> wnschema:WordSense (or some subclass) is the domain and range of the following properties:
>> antonymOf
>> derivationallyRelated
>> This should probably be on the word, although WordNet does not differentiate different etymologies of a word, so perhaps it is allowed here.
>> participle 
>> adjectivePertainsTo
>> adverbPertainsTo
>> 
>> the ones you propose are fair enough I think.
>> 
>>  
>> LexicalConcept
>> --------------
>> 
>> * Antonymy
>> * Hypernymy/Hyponymy (?)
>> * Quality models (e.g., 'big' vs 'huge')
>> * Gloss/Definition (?)
>> 
>> 
>> wnschema:Synset (or some subclass) is the domain and range of the following properties:
>> attribute
>> causes
>> classifies
>> entails
>> instanceOf
>> meronymOf
>> hyponymOf
>> sameVerbGroupAs
>> similarTo
>> gloss
>> 
>> Among the ones you propose, "antonymy" is certainly wrong (holds between senses, not synsets), ok for the others.
>>  
>> OntologyEntity
>> --------------
>> 
>> * Formal super/subclassing
>> * Domain/Range restrictions
>> * Axioms
>> * Gloss/Definition (?)
>> 
>> 
>> These ones are ok, but I do not see why we should include them in the OntoLex model, since they are already defined in RDFS, OWL, etc. I imagine there can be requirements for that, e.g. to gather a meta-model of OWL, but they already exist. For example, the NeOn project produced plenty of such meta-models; we should not reinvent the wheel.
>> Sure, I was not proposing to include these in the model but they are just here for comparison.
>> 
>> 
>> Regards,
>> John
>> 
>> PS.
>> * The naming of the OntologyEntity class is technically irrelevant as it cannot be an owl:Class as object properties, data properties and individuals (as well as datatypes and sets) are valid so it is best that formally its name is simply omitted. 
>> 
>> 
>> I do not understand this sentence, maybe some typo. If you mean that any element in the semio:Reference (or at least in the ontolex:OntologyEntity, or in your "omitted" class) class results to be an individual, and therefore is rdf:type owl:Thing, then I can agree; even in case of classes and properties as references, they would be punned as individuals.
>> Yep, that is what I meant
>> 
>> Regards,
>> John 
>> 
>> Ciao
>> Aldo
>>  
>> 
>> On Wed, Apr 24, 2013 at 3:38 AM, Armando Stellato <stellato@info.uniroma2.it> wrote:
>> Hi Aldo,
>> 
>> Fine. Actually since the naming of concepts was still to be assessed, and since in some cases we could have been reusing specific classes from existing vocabularies, I used that informal labeling in the upper part of the boxes for clarifying their role, and an explicit reference to the proposed class in the lower one.
>> Thus "target conceptual model" was intended to capture actually elements of possibly different models (and in fact the least subsuming class is owl:Thing) so I confirm your hypothesis.
>> I must admit I only partially grasp the reason why we should treat type-A and type-B models differently. My perspective, with respect to, for instance, the triangle of Meaning, is that in any case what we formally write are still symbols (progressively richer in their description), which are then translated into references in our mind, which refer to referents in the world.
>> And in this sense a synset, for instance, is still a symbol which, thanks to the set of synonyms in it, the gloss, etc., better drives the access to a reference in our minds than a single word. In terms of Sinn and Bedeutung, an owl:Class has intensional properties as much as a skos:Concept has, plus it may restrict (through a set of formal constraints) its extension, the interpretations of which, however, are still infinite. In this sense, Words, skos:Concepts, owl:Classes are all "expressions", and referents are totally out of our representation game. Thus, any meaning/reference distinction is not really clear to me. In much the same way, how would you consider an owl:Individual with respect to a skos:Concept (well, actually a concept is an individual in OWL terms)? Are they not both purely intensional objects?
>> However, I may be easily wrong in that, and will not delve further in the discussion, so one practical question:
>> Suppose I have a domain concept scheme (e.g. Agrovoc) and a "conceptualized" lexical resource such as WordNet. Beyond any possible linking to meaning/reference etc., would you see it as possible to have some form of "tagging" of the domain concept scheme with WordNet's synsets, where it is clear (in ontolex) that the synsets are not (only) mere skos:Concepts (thus to be mapped through ordinary mapping relations, e.g. from SKOS) and are instead lexical objects (instances of LexicalConcept in particular) which can be used to enrich the domain concepts?
>> 
>> Cheers,
>> Armando
>> 
>> From: Aldo Gangemi
>> Sent: 24/04/2013 00:28
>> 
>> To: Armando Stellato
>> Cc: Aldo Gangemi; 'John McCrae'; 'Philipp Cimiano'; 'public-ontolex'
>> 
>> Subject: Re: WordNet modelling in Lemon and SKOS
>> 
>> Hi Armando, John, all,
>> 
>> On Apr 23, 2013, at 11:19:48 PM , "Armando Stellato" <stellato@info.uniroma2.it> wrote:
>> 
>>> Dear John,
>>>  
>>> After seeing your updated scheme, I think we are almost there. I had a short call with Aldo for checking the only one thing I was a bit uncertain of in his email (the double subclassing he proposed for WordNet’s WordSense/Synset under the ontolex:LexicalSense umbrella).
>>> I’m summarizing a few points here, and I ask Aldo to confirm whether I’m properly reporting what we discussed (obviously I’m cutting most of the conversation and reporting only the main questions and where we ended up).
>> 
>> thanks for the summary :)
>> 
>>>  
>>> Armando: Why both wn:WordSense and wn:Synset subclasses of LexicalSense?
>>> Aldo: they are both a form of Meaning. These can be totally disjoint classes as you said in your email, still being under the same superclass.
>>> Armando: Ok, let’s go back to the linking to semiotics.owl… ok for both wn:WordSense and wn:Synset under semio:Meaning…they are both a form of meaning (thus both rdfs:subClassOf semio:Meaning) and I agree… but then, the engineer in me says: <ok, this is a proper “tagging”, but how can these be used operatively?> I mean, ok for the general Meaning class in semiotics.owl, but LexicalSense cannot be an umbrella for both too…our ontolex model should be general enough to cover different resources, and specific enough to cover in detail the most important aspects of them. To me, I would like WordNet to be opaquely handled by agents as an instance of a Lexical Resource modeled in OntoLex. I’m thinking about some of the use cases, where smart agents covering given tasks (such as Ontology Mapping) may benefit from the implicit perspective on WordNet given through OntoLex glasses (a monolingual resource, with a conceptual structure etc…), and can adapt this sort of “ontolex fingerprint” of the resource into their general mapping strategies (this is also where the metadata part of the language will come into play). “Plugging” another resource should work as well, as long as its content can be seen through a proper mapping inside the OntoLex vocabulary.
>>> So I suggest to make explicit in our model the existence of “Senses of LexicalEntries”, let’s call them LexicalSense or just Sense (e.g. specifically, a superclass of WordSenses in wordnet) and LexicalConcepts (specifically, a superclass of synsets in WordNet). Then I agreed that both Sense and LexicalConcept are tagged (subClassOf) as (different types of) Meanings, for the purpose of properly representing them under the Triad in semiotics.owl
>>> Aldo agrees on having these two distinct elements in OntoLex too, and on binding them under the common umbrella of semio:Meaning.
>> 
>> Confirmed. I have no issue about creating intermediate classes whatsoever, provided we all agree on the intuition about expressions, (intensional) meanings, and (extensional) references.
>> 
>> Concerning the diagram, I'm ok with links and names. 
>> 
>> My only observation is about "TargetConceptualModel" (not really discussed with Armando): if that is a class of conceptual models (as the name suggests), why should it be a subclass of Reference? I'd rather call it OntologyEntity (as Lemon does, as well as LRI, the multilingual ontolex model made in the NeOn project in 2008), and put a link between OntologyEntity and the ontology that defines it.
>> However, maybe you want to talk about arbitrary conceptual models and their elements. For this I think we need some more clarification, because there are two types of conceptual models:
>> 
>> A) purely intensional conceptual models, like SKOS models, classification schemes, thesauri, synsets, lexical frames, etc.
>> B) formally interpreted conceptual models, like ontologies, ER schemas, UML class diagrams (under ER-like semantics), etc.
>> 
>> For type-A conceptual models, I am still recalcitrant to accept their elements as references, since no clear extensional intuition is granted, except under a sort of "stipulation" by which I accept the risks of interpreting them extensionally (old SKOS did that by having skos:Concept as both rdfs:subClassOf owl:Thing and of rdfs:Class). I think no default extensional choice like that should be made. 
>> 
>> For type-B conceptual models, we can safely adopt the extensional interpretation.
>> 
>> Now, since this community group works under the semantic web and linked data umbrella, I do not see the necessity of forcing our model to deal with debatable choices wrt type-A conceptual models, which can instead be interpreted in the context of the Meaning class (that's why I put skos:Concept as a subclass of semio:Meaning).
>> 
>> I won't be able (hopefully for the last time) to attend Friday's telco, but will be active in the email discussion.
>> Ciao
>> Aldo
>> 
>>>  
>>> I’m attaching (and reporting here below) an updated version of the model I sent in my last email, with the mapping to Semiotics.owl which followed the discussion with Aldo. As you may see, it is pretty similar to the last one you sent (modulo naming choices and the double linking to semio:Meaning).
>>> Regarding chosen names, just a couple of comments:
>>>  
>>> 1)      I suggested, as an OntoLex superclass for Synset, the name Lexical Concept (ref. Miller’s paper, where he defines synsets as a form of “Lexical Concepts”). This captures the idea of a given set of LexicalEntries hinting at a (non explicit nor formally defined) concept. Note (not in the figure) that this LexicalConcept may be a subclass of skos:Concept. An alternative could be “LexicalizedConcept”, though the former one surely sounds better :-)
>>> 2)      Conversely, for the other class reifying the sense relationship, I’m not sure about the appropriateness of the name LexicalSense, as in this name “Lexical” seems to be an adjective of “Sense”. But, IMHO, it is not. LexicalSense is more specifically the sense of a given Lexical Entry. Thus the proper name should be LexicalEntrySense (in fact, in WordNet - limiting lexical entries to be words - we have the class WordSense). However, LexicalEntrySense is rather long and ambiguous to parse. Another choice could be SenseOfLexicalEntry (rather ugly), or simply (my preference) Sense. Btw, this is just a small note and the name can absolutely be left as is… but I really cannot grasp the meaning of such an expression. Simply, the step from the expression “LexicalSense” to its intended meaning of “Sense of a Lexical Entry” is not intuitive to me.
>>> 3)      I chose the ontolex:sense property to go from LexicalEntry to LexicalConcept. To me it is intuitive, as (grounding to WordNet, for instance), the sense of a Word lies in its linking to a Synset (or in general, to a unit of meaning). And then we can reify this relation into a Sense class as there can be many important things to say about it. However, I understand that following ontology modelling conventions, one could expect the ontolex:sense property to link to instances of a Sense class… so open to opinions (and proposals) for this property renaming. Even those from John’s last model could be reasonable.
>>> Cheers,
>>> Armando
>>>  
>>> <image005.png>
>>>  
>>>  
>>>  
>>> From: johnmccrae@gmail.com [mailto:johnmccrae@gmail.com] On Behalf Of John McCrae
>>> Sent: Friday, 19 April 2013 10:44
>>> To: Armando Stellato
>>> Cc: Aldo Gangemi; Philipp Cimiano; public-ontolex
>>> Subject: Re: WordNet modelling in Lemon and SKOS
>>>  
>>> Hi,
>>>  
>>> While Aldo's model is very elegant it is not possible to have lexical sense as a subset of skos:Concept for a simple reason: the lexical sense is defined for only a single lexeme, while the skos:Concept can be used for multiple lexemes.
>>>  
>>> For this key reason we need to have a "lexical sense" object that is between the lexical entry and its meaning. If you are uncomfortable with this object then you can view it as a simple reification (although I would contend it is a very real object). In fact this is nothing more than the traditional lexicographic "word sense", see http://en.wikipedia.org/wiki/Word_sense.
>>>  
>>> I rename the "lexical sense" object of Aldo's model to "concept" or following WordNet a "synset"
>>>  
>> 
>> [the original message is not included]
>> 
>> 
>> 
> 
>
Received on Thursday, 25 April 2013 15:00:46 UTC