Re: Meaning and Semiotics - Issues for Modelling

Hi all,

I added Piek's list of features for the lexicon here
http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements/Lexicon-Ontology-Mapping

I think that 1-6 are fairly uncontroversially part of the lexicon, and we
should have an agreement on the modelling here.

7. "probability of packaging" seems to move from Lexicography to some kind
of *quote* meta-lexicography *unquote*... i.e., from "MeSH ID D030361 is
lexicalised as 'HPV'" to "MeSH IDs are frequently lexicalised as
abbreviations"... Not necessarily a bad thing, but it does stretch the
scope of the group. That said, both LMF and lemon have systems for
describing regular inflection (e.g., "English nouns pluralize by adding
s"), which also count as "meta-lexicography", and I would argue that this
falls under the scope of the group.
8. Subjectivity and connotation are something I admit we do not really try
to model in lemon <http://www.lemon-model.net> (and I believe are not
handled by LMF)... I am not sure what the modelling would look like here
(but it would be good to see some proposals, *hint*, *hint*....)
9. Not sure what is meant by "social roles"... are you referring to
something like this
http://en.wikipedia.org/wiki/Honorific_speech_in_Japanese?
10. Again not entirely sure what is intended, mostly gender and aspect for
me count as Morphological issues... however perhaps you mean true semantic
distinctions in syntax such as
http://en.wikipedia.org/wiki/Luganda#Noun_classes or
http://en.wikipedia.org/wiki/Chinese_counter? Examples from Japanese:

Ni*wa* no hato = 2 doves (birds)
Ni*hiki* no hotaru = 2 fireflies (small animals)
Ni*ko* no tane = 2 seeds (small objects)
Ni*hon* no hon = 2 books/scrolls (long objects)
Ni*mai* no sara = 2 plates (flat objects)
*Futari* no onna = 2 women (people)

These form a difficult class, as these distinctions come from the ontology
but affect the syntax, unlike gender in European languages (cf., das Kind =
child (neuter)).
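
To make the dependency concrete, here is a minimal sketch of how a counter
could be selected through the ontology rather than stored per noun. The class
names (Bird, SmallAnimal, ...) and the noun-to-class assignments are
assumptions made up for this illustration; they are not taken from lemon, LMF
or any existing ontology, and the sketch ignores sound changes and irregular
forms such as "futari".

```python
# Hypothetical ontology classes, each paired with the counter it selects.
# Counters are given in citation form; surface forms are not generated here.
COUNTER_FOR_CLASS = {
    "Bird": "wa",
    "SmallAnimal": "hiki",
    "SmallObject": "ko",
    "LongObject": "hon",
    "FlatObject": "mai",
    "Person": "nin",  # realised irregularly in "futari" (2 people)
}

# Toy lexicon: each noun points at one (assumed) ontology class.
ONTOLOGY_CLASS_OF = {
    "hato": "Bird",
    "hotaru": "SmallAnimal",
    "tane": "SmallObject",
    "hon": "LongObject",
    "sara": "FlatObject",
    "onna": "Person",
}


def counter_for(noun: str) -> str:
    """Return the counter chosen by the noun's ontology class."""
    return COUNTER_FOR_CLASS[ONTOLOGY_CLASS_OF[noun]]


for noun in ONTOLOGY_CLASS_OF:
    print(noun, "->", counter_for(noun))
```

The point of the indirection is that the lexical entry only records a
reference to the ontology; the syntactic consequence (which counter to use)
is derived from the class.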

And in response to Anne:

For me, the diminutive is a morphosyntactic property and I have included it
under Piek's point 3. This is based on my guess that we likely don't want
to go much deeper than simple properties for modelling here. The
connotative affect of diminutives in Russian is very interesting, but it
seems quite a narrow case... can anyone perhaps situate it in a more
general linguistic context? (i.e., modelling that is only useful for
Russian gangsters' names is likely not so pressing ;) ).

Modality and aspect are standard morphosyntactic features; I agree that
they should be modelled (but I would again file them under point 3).

Metaphor and Irony... I think this leads to a long discussion. My opinion
is that lexical entries (words + phrases) have some core meaning(s)*, e.g.,
"fressen" means "to eat (like an animal)", and even if the interpretation
in an actual example is different, this can be treated as a phenomenon
external to the ontology-lexicon. That means I am not concerned
about "The White House announced a new policy" even though the "White
House" is a building and can certainly not announce or even speak;
furthermore, I do not feel it is necessary to introduce another sense for
the phrase "The White House" as "the representatives of the U.S.
Government" into the lexicon; rather, it is the duty of the NLP system to
make this "leap" by itself. But I would love to hear other opinions...

Regards,
John

* i.e., a distinction is made between systematic and non-systematic
polysemy. See Paul Buitelaar's or Wim Peters's work on this....

On Mon, Aug 20, 2012 at 10:55 AM, Anne Schumann <anne.schumann@tilde.lv> wrote:

> Dear Ontolex group,
>
> I have been reading this discussion with a lot of interest just after
> returning from my holidays (therefore the late reply). Although I am not
> formally a member of this group, I would like to comment on some of the
> aspects that have been discussed so far, hoping my ideas are not completely
> arbitrary. About myself: I am doing a PhD in terminology at the University
> of Vienna and have therefore spent some time thinking about most of the
> issues touched upon in this discussion.
>
> Although the term “ontology” is often used in terminology just with the
> meaning of “concept system” (without most of the fancy reasoning abilities
> attributed to ontologies in more technical circles), the groundwork was
> already laid in the 1930s by an Austrian engineer called Eugen
> Wüster. Since terminological practice deals mainly with specialized
> vocabularies, mainstream approaches are rather successful in neatly
> separating linguistic and conceptual properties. In practice, however,
> problems related to term variation, term evolution,
> normativity/descriptivity, prototypicality etc. (some of these already
> mentioned in the discussion) are evident.
>
> I do not think that this happens by chance. In fact, as far as I
> understand the scope of the work discussed here, the challenge lies also in
> providing ways to model the nexus between linguistic form and meaning (the
> semiotic triangle mentioned earlier) in a more expressive way than the
> standard approaches that may be appropriate for medical terminology or
> classifying types of steel, but not other areas of language (and for a
> wider range of languages). Therefore, I really liked the list of linguistic
> features provided by Piek Vossen. To his list, I would like to add some
> features that, in my view, lie at the heart of the problem since they are
> mainly pragmatic (?) in nature and therefore not merely “linguistic” (and
> thus, maybe, should not be tucked away in the lexicon), but really connect
> linguistic form and conceptual meaning in a regular fashion:
>
> - Diminutives. Some languages, e.g. Russian, employ diminutives as complex
> pragmatic markers (that is, not just as markers of sth. being small).
> Some diminutives of Russian names have complex sociolinguistic
> functions: expressing social hierarchies (Anja – Anjechka), affection (Anja
> – Anjuta/Anjutka, Sergej – Serjozha) or even the fact of belonging to a
> criminal organization or being at least gangster-like (Sergej – Serjoga).
> Others, however, seem to be devoid of any specific meaning.
>
> - Modality and reported speech (subjonctif, congiuntivo). In
> some languages, the expression of modality is grammaticalised, e.g. compare
> Latvian “Tu eji” (you go) to “Tev ir jāiet” (you have to go). In certain
> Russian constructions, on the other hand, the type of modality (you should
> go, you have to go, you can go) is completely opaque. Romance languages,
> for their part, have special paradigms for the expression of reported
> speech (lui avrebbe detto). Since these features, however, are productive,
> it may be reasonable to model them in some way.
>
> - Metaphoric usage and irony. The German dichotomy of
> “essen” vs. “fressen” leaves ample space for examples (jmd. zum Fressen
> gern haben; du sollst nicht fressen, sondern essen; den hab ich wirklich
> gefressen! ...). I have, at least, some doubts that modeling this on the
> linguistic level (e.g. as different word senses) is the optimal solution
> (according to my intuition, at least, they are not different senses).
>
> It has been pointed out that the decisions regarding these issues may be
> taken in a pragmatic way. I hope, however, that my comments were useful.
>
> Best regards,
>
> Anne-Kathrin Schumann
>
> *From:* Piek Vossen [mailto:piek.vossen@vu.nl]
> *Sent:* Sunday, August 12, 2012 4:08 PM
> *To:* Aldo Gangemi
> *Cc:* public-ontolex@w3.org; Guido Vetere; John McCrae
>
> *Subject:* Re: Meaning and Semiotics - Issues for Modelling
>
> This is a lengthy reply, and thanks for the explanations about the tools
> and approach.
>
> Some think any formalization of a lexical property can also be represented
> in the ontology, and some think that how to do so is at least still
> debated. I guess that nobody has claimed so far that a formalization of a
> linguistic property cannot be expressed in an ontology in OWL.
>
> But let me try to make the discussion more bullet-wise by drawing up a
> (non-comprehensive) list of typical linguistic features that are normally
> not represented in an ontology:
>
> 1. dialect, age-group, formal, informal registers, etc.
>    -> probably not in OWL
>
> 2. relations between different meanings (polysemy relations) such as
>    specialization, generalization, metonymy, metaphor
>    -> probably not in OWL
>
> 3. morphological properties
>    -> probably not in OWL
>
> 4. pronunciation
>    -> probably not in OWL
>
> 5. syntax
>    -> probably not in OWL but:
>    - verb syntax needs to be mapped to argument structure, and argument
>      structure mapped to event structure in the ontology: buy and sell are
>      variants with different syntactic mappings to the same event
>      structure. Other examples are “teach” and “learn” (in some languages
>      expressed by the same verb).
>    - Same for countability, which is partially semantic and partially a
>      form choice.
>
> 6. collocational constraints, e.g. blow your nose and clear your throat
>    -> probably not in OWL
>
> 7. probability of packaging: complex concepts are typically phrased as
>    e.g. adjective noun, compound, prepositional phrase, verb phrase, etc.
>    -> probably not in OWL
>
> 8. subjectivity relations and connotations
>    -> partly this can be reflected in the ontology in the form of a social
>       role, but it is (usually) not done
>
> 9. social roles
>    -> this is a border case: it can and probably should be reflected in the
>       ontology but it is extremely rich and complex
>
> 10. some others: gender in many languages, politeness markers, aspectual
>     properties....
>
> So there is a lot of work to do to flesh this out and to make a proposal
> that reflects best practice.
>
> best
>
> Piek
>
> On Aug 11, 2012, at 8:14 PM, Aldo Gangemi wrote:
>
> Hi all, let me try a contribution from my holidays.
>
> I appreciate Piek's attempt to distinguish three aspects of the
> ontology-lexicon modeling. My answers are (1) that any semantic aspect of a
> lexical unit can be formalized (either in crisp or fuzzy logics), (2) that
> modeling any of those aspects can be useful in at least some task, and (3)
> that if we get more concepts from a lexicon, we should be entitled to reuse
> them in order to evolve ontologies.
>
> Therefore, I am totally liberal about all flows from lexicons to
> ontologies. I am also open to any procedure to derive lexicons
> from (impoverished) ontologies.
>
> The reason why I am so liberal is that I fully accept the consequences of
> adopting semiotics, the only theory that is expressive enough to consider
> linguistic and logical semantics as special cases. Let me explain this as
> briefly as possible.
>
> The core problem we are discussing is the actual nature of meaning as it
> is represented in NL/lexicons or in ontologies. This problem goes back
> deeply in philosophy, linguistics and logic, as Guido said. However, the
> Semantic Web does not claim to go so deeply. Then semiotics should be
> enough to make sense of the problem. (Those who have the time might
> read my chapter in Ontology and the Lexicon, Cambridge University Press.)
>
> Semiotics assumes three aspects/roles of a "sign" relation that is
> instantiated every time there is an ongoing linguistic (or generally
> semiotic) activity: the *expression*, the *meaning*, and the *reference*.
> Any linguistic activity involves the use of some expression that typically
> gets a meaning in context, and usually denotes a referenced individual or
> collection.
>
> A lot of different things can play one of the three roles: I can use the
> string "buy" as an expression in the utterance "would I buy it again?" or
> in "<buy> has three letters", or as a meaning in the discourse: "what does
> the word <comprare> mean? To buy", or even as a reference in the
> sentence "Buy is a dual concept to Sell".
>
> Now, both lexicons and ontologies contain elements that nicely distribute
> into the roles of a semiotic sign relation. E.g. WordNet has words
> (expressions), word senses and synsets (meanings), and instances
> (references). An ontology has labels (expressions), class and property IDs
> (?meanings?), individuals and facts (references), as well as "formal
> interpretations", e.g. class or property extensions, which would be
> references as well in the semiotic framework.
>
> There remain other strange beasts such as comments/glosses/definitions,
> either from lexicons or ontologies, which could be considered expressions
> to be analyzed, or directly as paraphrastic meanings.
>
> At this point, what is it that distinguishes ontologies from lexicons?
> Mainly formal interpretation, it seems, with all the reasoning machinery
> (set-theoretic, model-theoretic, possible worlds, etc.) that comes with it.
> A very important feature indeed. But if we remove that machinery for a
> second, ontologies are just quite structured lexicons, which is often the
> actual meaning of ontologies assumed by linked data people, who usually
> prefer the term "vocabulary" :). We all know that "ontology as controlled
> vocabulary" is a well accepted meaning.
>
> My constructive proposal is that a standard that accommodates any task
> that aggregates lexical and ontological knowledge should be able to express
> *both* the similarities and the differences between lexicons and ontologies.
>
> With the thread example, the synset wordnet3:synset-buy-verb-1 can be a
> meaning just as, e.g., the OWL class http://ontosem.org/#buy can. One may
> reason with wordnet3:synset-buy-verb-1 as a class if the case
> requires it (e.g. if a WSD system is used for ontology learning), and one
> may reason with http://ontosem.org/#buy as a word sense if a
> different case requires it (e.g. if ontology designers discuss the actual
> meaning of the http://ontosem.org/#buy class). I provide here a concrete
> example of the first case.
>
> In a recent tool developed at STLab, called Tipalo [1], we derive OWL
> taxonomies from Wikipedia definitions extracted from page abstracts. In
> doing so we apply deep parsing that creates a DRT logical model from the NL
> definition, then we produce an OWL model from that, disambiguate class
> names to WordNet senses, and resolve as many individuals as possible to
> DBpedia. For example, if we ask Tipalo to produce an OWL model for the
> Wikipedia entity "Wind instrument" [2], the following definition is
> extracted:
>
> "A wind instrument is a musical instrument that contains some type of
> resonator, in which a column of air is set into vibration by the player
> blowing into a mouthpiece set at the end of the resonator."
>
> After the parse is produced, Tipalo extracts the relations that are
> appropriate for a taxonomy, resolves some names to DBpedia entities, and
> disambiguates some to WordNet (by using UKB currently), thereby asserting
> owl:equivalentClass axioms between classes extracted from the logical
> representation of the text, and WordNet synsets. In OWL2 semantics, this
> makes those synsets regular classes, and formal reasoning is enabled on
> them.
>
> An OWL model for the example is produced and visualized in the enclosed
> picture:
>
> <Wind_instrument.png>
>
> In my view then "direct reference" of lexical units to ontology classes is
> fine, provided however that both lexical units and classes can be *equally*
> considered semiotic meanings, and can be made interoperable by doing
> something as simple as what we do in Tipalo.
>
> For Piek, notice that this solution complies with my answers to your
> questions: if an aspect of lexical meaning is useful, integrate it in
> ontology-based models/reasoning; if new meanings are needed/discovered,
> just integrate them.
>
> For Guido, the game we play in Senso Comune is a bit special, because by
> "ontology" we really mean a fully axiomatized foundational ontology, and of
> course we want to be careful in distinguishing meaning coming from a
> dictionary like De Mauro's and meaning coming from DOLCE. However, the
> ground similarity between those meanings is there, and nothing prevents us
> (in principle) from introducing in DOLCE a meaning derived from a dictionary
> word sense. Such provenance distinctions about authoritativeness, formal
> axiomatization etc. can be preserved by adding some punning to classes and
> properties :).
>
> Ehm, now my message has grown substantially, but some time ago I promised
> to clarify my semiotics.owl pattern, so this is a way of doing it.
>
> Ciao
>
> Aldo
>
> [1] http://wit.istc.cnr.it/stlab-tools/tipalo
>
> [2] http://en.wikipedia.org/wiki/Wind_instrument
>
> On 11 Aug 2012, at 09:33, Guido Vetere wrote:
>
> Piek Vossen <piek.vossen@vu.nl> wrote on 10/08/2012 10.05.38:
>
> > Piek Vossen <piek.vossen@vu.nl>
> > 10/08/2012 10.05
> >
> > To
> >
> > Guido Vetere/Italy/IBM@IBMIT
> >
> > cc
> >
> > <public-ontolex@w3.org>
> >
> > Subject
> >
> > Re: Meaning and Semiotics - Issues for Modelling
> >
> > Dear all,
> >
> > I would like to discuss this at another level. We should first
> > answer the question:
> >
> > 1. Is there any semantic aspect of a word sense (I prefer lexical
> > unit) that cannot be represented in an ontological model?
> >
> > It may not be easy but I think you can, if you allow semantics in
> > the ontology that incorporates probabilities and prototypicality.
> > I think that any formalization of lexical meaning can be turned into
> > an ontological meaning, simply because it is a formalization.
> > If it is not a formalization then the lexical meaning is ill-defined
> > and we need to do more (empirical) work to learn about the word and its
> > usage.
> >
>
> As far as we can formalize lexical meanings, we can represent them in a
> formal way, this is true (by definition). But what we can formalize, and
> how, is a very open issue in philosophy of language and logic,
> respectively. Frege and Tarski warned about using formal logic for modeling
> natural language, in vain. As a matter of fact, modern logicians are still
> striving to look at linguistic phenomena under the lens of Truth, which is
> quite problematic in many cases. In fact, we lack a generally agreed
> (and positive) 'theory of meaning', and I'm afraid this is not just a
> problem of 'empirical work'. Of course, we cannot solve philosophical
> puzzles here, but I think that we should take them into account, somehow.
>
> > 2. Do you want to model any semantic aspect that characterizes a
> > word sense also in the ontology?
> >
> > This is another question. If we want to model pure logical
> > reasoning, there may be many lexical aspects (not just the pragmatic
> > knowledge) that we do not need
> > in the ontology. We do not need to represent “buy” and “sell”
> > separately to reason over the financial transaction process.
> >
>
> I agree, for most computational tasks, there would be no need of
> representing any semantic aspect of a word sense, even if it were possible.
>
> > 3. What do we do in situations where lexicons are far
> > richer than any ontology available and thus we cannot provide
> > sufficient ontological labels to model the lexicons?
> >
>
> > This is a more practical and pragmatic question. If the lexicon is
> > so large, complex and rich, why not use a two-layered solution where
> > lexical relations take the burden off the ontology and the ontology
> > takes the burden of deeper reasoning (need to define how deep we
> > need to go). So in the lexicon, I can say that one word is the
> > informal word for “eat” and another word is the neutral label for
> > “eat”. In the ontology, we just have “eat”. Many lexicalized
> > concepts are either pragmatic variants or can be defined using
> > intersecting properties as described by Philipp for “bald”.
> >
>
> I like this idea of the 'two layers' very much: ontology should allow
> reasoning on real world structures (e.g. parts, phases, etc.) while lexica
> should account for linguistic habits and games. By the way, Quine drew a
> line to distinguish 'ontology' (what is there) from 'ideology' (the way we
> conceptualize it through language). Maybe we can start from there.
>
> Regards,
>
> Guido Vetere
> Manager, Center for Advanced Studies IBM Italia
> _________________________________________________
> Rome                                     Trento
> Via Sciangai 53                       Via Sommarive 18
> 00144 Roma, Italy                   38123 Povo in Trento, Italy
> +39 (0)6 59662137                 +39 (0)461 312312
>
> Mobile: +39 3357454658
> _________________________________________________
>
>
> Aldo Gangemi
> Senior Researcher
> Semantic Technology Lab (STLab)
> Institute for Cognitive Science and Technology,
> National Research Council (ISTC-CNR)
> Via Nomentana 56, 00161, Roma, Italy
> Tel: +390644161535
> Fax: +390644161513
> aldo.gangemi@cnr.it
> http://www.stlab.istc.cnr.it
> http://www.istc.cnr.it/people/aldo-gangemi
> skype aldogangemi
> okkam ID: http://www.okkam.org/entity/ok200707031186131660596
>
> Piek Vossen
> Professor Computational Lexicology
> T +31 (0)20 59 86457 | piek.vossen@vu.nl | http://www.vossen.info
> ADDRESS: de Boelelaan 1105, 1081 HV Amsterdam, The Netherlands
> Disclaimer<http://www.vu.nl/nl/over-de-vu/vu-website/e-mail-disclaimer/disclaimer-tekst-e-mail/index.asp>

Received on Monday, 20 August 2012 16:17:24 UTC