Re: relations about lexical entries

Hi John,


> Hi Harry,
(Actually, I'm Lars :-)).


>
> The distinction is motivated as follows: *1b* consists of variation 
> within a lexical entry, and *1c* variation with multiple lexical entries,
Thank you for the elaboration. This is the common definition of the 
distinction between inflectional and derivational morphology: 
Inflectional morphology gives you forms (or relates forms, depending on 
your view of the semantics of the description language) of the same 
lexical entry, while derivational morphology makes lexical entries out 
of other lexical entries (or relates lexical entries). It makes perfect 
sense to distinguish the two in lexical description.

But then the following grouping is not merely a confusing way of 
expressing the distinction; it is incorrect:

>
> *Group 1b. Inflectional Variants*
>
>   * Inflection
>   * Fusional Synthesis (see
>     http://en.wikipedia.org/wiki/Fusional_language)
>
> *Group 1c. Morphosyntactic Variants*
>
>   * Agglutinative variants
>   * Polysynthetic variants (??)
>   * etc.
>

I propose that we relabel these two groups "1b. Inflectional (variants)" 
and "1c. Derivational (variants)" and remove the subcategories (for now).

In brief, this is why:

Both inflection and derivation (like most things in language) have a 
content side (a semantics; lexical or grammatical meanings; grammatical 
functions/categories) and a form side (how is a particular 
meaning/category expressed, by a prefix or a suffix, by a morphophonemic 
alternation like metaphony [Umlaut], by a tonal alternation, by 
truncation, etc.). However, the distinction between inflectional and 
derivational morphology can in the general case be made only on the 
content side. As far as I know, and if we look at it 
cross-linguistically, there are no word-building devices that are 
exclusive to the one or the other (in individual languages there may 
well be, as in English, with no inflectional, but only derivational, 
prefixes). Except for "inflection", the listed subcategories are all 
form-side distinctions: Distinguishing inflection and agglutinative 
variants is like distinguishing round things and red things.

Also, as I said in my previous email, "inflectional (variants)" and 
"morphosyntactic (variants)" are normally considered synonyms.

One reason that one should think carefully before trying to supply 
general subcategories under "Inflectional (variants)" and "Derivational 
(variants)" is this:

There are two stipulations involved here:
(1) The important distinction between inflectional and derivational 
morphology is whether or not more than one lexical entry is involved.
(2) A lexical entry (in a semasiologically organized lexicon) has one 
and only one part of speech. (This can be different in an 
onomasiological lexicon, but e.g. LMF is prejudiced in the direction of 
the semasiological kind.)

It is important to keep in mind that these are stipulations about the 
representation language, not necessarily always corresponding neatly to 
the empirical facts about what is represented. This is one reason for some 
well-known clashes between criteria for determining part of speech of a 
form or group of forms. The same content can be (POS-preserving) 
inflection in one linguistic tradition and (possibly POS-changing) 
derivation in another, or even both:

"Participles have the full inflection for number and case of nominal 
words. They can be considered either inflectional forms or derived 
words. As derived words they are treated as deverbal adjectives but as 
inflected forms as a part of the verb inflection." (Iso suomen kielioppi 
[Large Finnish grammar], p. 487: 
<http://scripta.kotus.fi/visk/sisallys.php?p=490>, my translation).

In practice, inflectional and derivational categories in any individual 
language are determined by some kind of prototype reasoning, involving 
considerations like "is it completely productive?", "does it bring with 
it an added layer of morphology and/or different syntactic behavior?", 
"how is this traditionally described in older works on this language or 
in works on related languages?", etc. Sometimes these lead to 
conflicting answers, and then the linguists either leave it at that (as 
in the Finnish reference grammar), or make a decision on sometimes less 
than completely clear grounds.

> I think the ontology-lexicon model shouldn't care which distinction is 
> made but must change its modelling according to whether we are 
> talking about inter-entry and intra-entry relations.

I agree completely, but the content aspect and the form aspect should be 
modeled separately, and most of the terms that appear under headings 1b 
and 1c refer to the form-side aspects of morphology, which do not in the 
general case serve to distinguish inter-entry and intra-entry relations.
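
To make the separation concrete, here is a minimal sketch in Python. It is only an illustration of the point above, not a proposal for the model: the class names, the `entry_id` field and the entry identifiers are all hypothetical. The form side is recorded as a technique, while the content side is derived purely from stipulation (1), i.e. from whether one or two lexical entries are involved.

```python
from dataclasses import dataclass
from enum import Enum


class Technique(Enum):
    """Form side: how a meaning or category is expressed (hypothetical list)."""
    PREFIX = "prefix"
    SUFFIX = "suffix"
    STEM_ALTERNATION = "stem alternation"  # e.g. umlaut


@dataclass(frozen=True)
class Form:
    written: str
    entry_id: str  # hypothetical identifier of the lexical entry the form belongs to


@dataclass(frozen=True)
class MorphRelation:
    source: Form
    target: Form
    technique: Technique  # form side, stored independently of the content side

    @property
    def scope(self) -> str:
        """Content side, by stipulation (1): one entry means inflection."""
        if self.source.entry_id == self.target.entry_id:
            return "intra-entry (inflectional)"
        return "inter-entry (derivational)"


# Both relations use the same technique (suffixation) but differ in scope.
plural = MorphRelation(Form("cat", "cat_n"), Form("cats", "cat_n"), Technique.SUFFIX)
agent = MorphRelation(Form("do", "do_v"), Form("doer", "doer_n"), Technique.SUFFIX)

print(plural.scope, "|", plural.technique.value)
print(agent.scope, "|", agent.technique.value)
```

In this sketch the technique never enters into the decision between inflection and derivation, which is why form-side terms such as "agglutinative" cannot serve as subcategories on only one side of the distinction.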
>
> NB I assume all derivation leads to new lexical entries and so is 
> "inter-entry"

As I said, this is so by stipulation.

Best
Lars

>
> On Tue, Aug 21, 2012 at 6:44 AM, Lars Borin <lars.borin@svenska.gu.se> wrote:
>
>     Dear John and others,
>
>     As someone with a background in linguistic morphology (although
>     it's been a while now) and computational lexicography (more
>     recently), I feel that the proposed distinction between 1b and 1c
>     will be very hard to make in an unambiguous way. Partly this may
>     be due to the terminology, which is different from what I am used
>     to from linguistics, where "morphosyntactic" properties are
>     expressed by "inflectional" morphology, and "synthesis" is a
>     technique (a way of assembling words) which in principle is neutral
>     wrt the distinction between inflectional and derivational/lexical
>     morphology. Since in the general case the boundary between
>     inflectional and derivational/lexical morphology is far from
>     clear, it will probably be more effort than it's worth to keep
>     categories 1b and 1c separate. Individual languages (if they have
>     a codified grammatical tradition, which most languages don't) will
>     often have determined on more or less clear grounds which
>     morphological categories belong under which label (inflection or
>     derivation), but cross-linguistically there are at the most
>     tendencies, so that, e.g., number if expressed morphologically
>     will tend to be an inflectional category.
>
>     The inclusion of inflectional and derivational morphology in the
>     list also raises the vexed question of the nature of grammatical
>     meaning. The use of "variant" for the number or case or other
>     inflectional distinction implies that "in my houses" (as in
>     Finnish inflected form _taloissani_ 'house plural inessive 1st
>     person singular possessive') means more or less the same thing as
>     "house", or am I reading too much into this? How about derivations
>     like, e.g., "doer", "detainee", "eatery"? In other words: When you
>     say "preserved semantic properties", is this also taken to imply
>     "no added semantic properties"?
>
>     Best
>     Lars Borin
>
>
>     2012-08-20 18:34, John McCrae wrote:
>>     Hi all,
>>
>>     Can I suggest we merge the following requirements on "Lexical
>>     Variant and Paraphrases" and "Lexical and linguistic properties
>>     of lexical entries"?
>>
>>     My reasoning is that it seems that what Lupe is suggesting relies
>>     heavily on the definition of properties. i.e., to model
>>     geographical variants, register variants or diachronic variants,
>>     we need to be able to state the geographical, register or
>>     diachronic properties of the two variants. As such _we can think
>>     of variation in terms of the properties that vary_ and those that
>>     do not. Put more clearly, variants are entries that are similar
>>     (have the same property values) except for some property, e.g.,
>>     translation is variation in language, pluralization is variation
>>     in number, etc.
>>
>>     Considering the list of variants above, the following properties
>>     are preserved by the type of variance
>>
>>       * Orthographic variants. Preserved: Pronunciation, syntax,
>>         most pragmatic, semantic properties. Differs: Generally
>>         context or geographic usage
>>       * Inflectional variants. Preserved: part-of-speech, pragmatic
>>         and semantic properties.
>>       * Morphosyntactic variants. Preserved: semantic properties,
>>         most pragmatic properties.
>>       * Stylistic+Register variants. Preserved: semantics.
>>       * Diachronic variants. Preserved: semantics
>>       * Dialectical variants. Preserved: semantics
>>       * Explicative variants. Preserved: extensional semantics (not
>>         intensional)
>>       * Semantic variants. Preserved: partial semantics
>>
>>     As such I would go for splitting up the categories as follows
>>
>>     *Group 1a. Orthographic variants*
>>
>>       * Historical Orthographic variants. e.g., different scripts
>>         such as for Azeri
>>         (http://en.wikipedia.org/wiki/Azerbaijani_alphabet)
>>       * Geo-orthographic variants. e.g., "localize" vs. "localise"
>>       * Semantic-orthographic variants. e.g., "取る" (toru - "to
>>         take (remove from a location)") vs "撮る" (toru - "to take (a
>>         photo)")
>>
>>     *Group 1b. Inflectional variants*
>>
>>       * Pluralization, verb form inflection, comparatives and
>>         superlatives
>>       * Synthesis (see http://en.wikipedia.org/wiki/Synthetic_language)
>>
>>     *Group 1c. Morphosyntactic Variants*
>>
>>       * Rephrasing: e.g., "cancer of the mouth" vs. "mouth cancer"
>>       * Derivation (e.g., Nominalization): e.g., "lexicon",
>>         "lexical", "lexicalize"
>>       * Pleonasm: "tuna" vs "tuna fish"
>>       * Abbreviation: e.g., AIDS..... Philipp> Any variation has some
>>         (slight) pragmatic implication, abbreviation for me is
>>         morphosyntactic as the motivation is brevity rather than
>>         connotation.
>>
>>     *Group 2a. Pragmatic Variants*
>>
>>       * stylistic or connotative variants (man and bloke)
>>       * diachronic variants (tuberculosis and phthisis)
>>       * dialectal variants (gasoline vs. petrol)
>>       * pragmatic or register variants (headache and cephalalgia;
>>         swine flu and pig flu and H1N1 and Mexican pandemic flu)
>>
>>     *Group 2b. Circumlocutive variants*
>>
>>       * explicative variants (immigration law and law for regulating
>>         and controlling immigration)
>>
>>     *Group 3. Non-synonymous variants*
>>
>>       * Modification: "MRSA", vs "hospital-acquired MRSA"
>>       * Hypernym/Hyponymy/Antonymy
>>       * Cross-lingual narrowing/broadening: "river" vs "rivière/fleuve"
>>
>>
>>     Does this sound sensible or did I miss something?
>>
>>     Regards,
>>     John
>>
>>     On Sun, Aug 5, 2012 at 9:56 PM, lupe aguado <gac280771@gmail.com> wrote:
>>
>>         Dear Ontolex members
>>
>>         With this message we would like to start the discussion about
>>         the requirements on “Relations between lexical entries”. I
>>         put the message as a draft in the Ontolex community Group and
>>         forgot to send it to you. Sorry!
>>
>>         In our opinion, two types of relations need to be taken into
>>         account in an ontology-lexicon model:
>>
>>          1. *relations between labels in different natural
>>             languages,* and
>>          2. *relations between labels within the same natural language.*
>>
>>         Before continuing, we would like to define the two scenarios
>>         that we envisage:
>>
>>         *A. Multilingual labeling approach*
>>
>>         In a multilingual labeling approach, we have a single
>>         conceptual structure, and we provide alternative labeling
>>         information in the ontology-lexicon model for each of the
>>         languages covered (in the same language or in different
>>         languages). This is possible whenever the languages covered
>>         share a single view on a certain domain. In this case, there
>>         will always be one or several labels in each natural language
>>         for naming or terming the concepts in the ontology.
>>
>>         *B. Cross-lingual linking or mapping approach*
>>
>>         In this second scenario, there exist two independent
>>         monolingual ontologies, defined in different languages, but
>>         covering the same or similar subject domain. We aim at
>>         establishing links between the labels that describe the two
>>         ontologies. The establishment of these cross-lingual links
>>         could result in cross-lingual ontology mappings. In this
>>         scenario, the conceptual structure of each ontology is
>>         modeled independently, and “linguistic links” or “mappings”
>>         can be established between the two.
>>
>>         ---------
>>
>>         Now, in a *multilingual labeling approach*, we will usually
>>         refer to “cross-lingual equivalents”.  Let us take for
>>         example an ontology of medical conditions. In such an
>>         ontology we can find terms such as menopause in English, and
>>         its cross-lingual equivalents: menopause in French, menopause
>>         in Danish, vaihdevuodet in Finnish or Menopause in German.
>>         This means that the “same” concept exists in the involved
>>         cultures and has an equivalent term in the corresponding
>>         language.
>>
>>         On the contrary, in a *cross-lingual linking or mapping
>>         approach*, we could come across several types of relations
>>         among lexical entries due to the following reasons:
>>
>>           * conceptualization mismatches
>>           * different levels of granularity
>>
>>         In fact, granularity or viewpoint differences may also come
>>         up in a “monolingual” linking or mapping approach. However,
>>         conceptualization mismatches will be more common in a
>>         cross-lingual scenario. In this sense, we could account for
>>         several types of relations
>>
>>         1. *Cross-lingual equivalence relations*, as in the
>>         multilingual labeling scenario. These would establish a
>>         relation between concepts that are not exactly the same (do
>>         not have the same intension and/or extension), but are close
>>         equivalents, because no exact equivalent exists. Example:
>>         full professor in English – catedrático in Spanish –
>>         Professor in German. In order to distinguish them from the
>>         cross-lingual equivalents in the multilingual labeling
>>         scenario, we could term them: *cross-lingual close
>>         equivalents*? *Cross-lingual near equivalents*? Suggestions
>>         are welcome!!
>>
>>         2. *Cross-lingual broad (narrow) equivalence relations*.
>>         These would establish a relation between concepts with
>>         different levels of granularity. This usually happens when
>>         one culture understands a concept or phenomenon with a higher
>>         granularity than the other, i.e., one culture has two or more
>>         concepts (and in its turn, terms for naming them) to describe
>>         the same phenomenon. Example: river in English – rivière and
>>         fleuve in French; Tötung in German – asesinato and homicidio
>>         in Spanish. Here again, suggestions for better examples are
>>         welcome.
>>
>>         In the case no equivalent exists, we could still provide a
>>         term or description, using for this a mixed scenario, i.e.,
>>         providing some labels or lexical entries for the concept we
>>         do not find an equivalent term in the other ontology, as in
>>         the multilingual labeling approach. For this, we consider two
>>         options:
>>
>>         3. *Literal translation relations*. These are translations of
>>         terms that describe concepts that do not exist in the target
>>         language, and for which a literal or “word for word
>>         translation” is provided so that the concept is understood by
>>         the target language. Example: École normale in French –
>>         (French) Normal School in English; Presidente del Gobierno in
>>         Spanish – President of the Government in English.
>>
>>         4. *Descriptive translation relations.* These are
>>         translations of terms that describe concepts that do not
>>         exist in the target language, and for which a description or
>>         definition (and not a term) is provided in the target
>>         language. Example: Panettone in Italian – bizcocho italiano
>>         que se consume en Nochevieja in Spanish. In this case, we
>>         could also opt for repeating the Italian Word plus the gloss.
>>
>>         In the latter two cases, we could also provide a link to the
>>         closest equivalent or superclass (by means of the
>>         cross-lingual broad equivalence relation), and additionally
>>         provide a literal or descriptive translation.
>>
>>         -------
>>
>>         As for the *relations* *between labels within the same
>>         language*, we propose to talk about “term variation”.  For
>>         example:  what is the difference between Advertising and
>>         Publicity, if any? And between Contamination and Pollution?,
>>         or between Assisted conception, Artificial insemination and
>>         in vitro Fertilization? In a SKOS Thesaurus, Assisted
>>         conception is the main label, and the rest are alternative
>>         labels. However, we think that we could be more specific
>>         regarding the type of variants pointing to one and the same
>>         concept in the ontology, and that this should be accounted
>>         for in our ontology-lexicon model. Sometimes, the difference
>>         is a consequence of the contextual (pragmatic) usage, and we
>>         have to decide whether to represent this in our model.
>>
>>         Based on previous classifications of terminology variation,
>>         we have identified three main groups of term variants that
>>         include the following types (see also [1] and [2]):
>>
>>         *Group 1*. Synonyms or terminological units that totally
>>         correspond to the same concept:
>>
>>           * graphical and orthographical variants (/localization/
>>             and /localisation/);
>>           * inflectional variants (/cat/ and /cats/);
>>           * morphosyntactic variants (/nitrogen fixation/ and
>>             /fixation of nitrogen/).
>>
>>         *Group 2*. Partial synonyms or terminological units that
>>         highlight different aspects of the same concept:
>>
>>           * stylistic or connotative variants (/man/ and /bloke/)
>>           * diachronic variants (/tuberculosis/ and /phthisis/)
>>           * dialectal variants (/gasoline/ vs. /petrol/)
>>           * pragmatic or register variants (/headache/ and
>>             /cephalalgia/; /swine flu/ and /pig flu/ and /H1N1/ and
>>             /Mexican pandemic flu/)
>>           * explicative variants (/immigration law/ and /law for
>>             regulating and controlling immigration/)
>>
>>         So, we would be very grateful for your suggestions and
>>         comments on this proposal.
>>
>>         Best regards,
>>
>>         Lupe and Elena
>>
>>         [1] Montiel-Ponsoda, E., Aguado de Cea, G., McCrae, J.
>>         (2011). Representing term variation in /lemon/. In
>>         Proceedings of the /WS 2Ontology and lexicon: new insights,
>>         TIA 2011 - 9th International Conference on Terminology and
>>         Artificial Intelligence/, pp. 47–50.
>>
>>         [2] Aguado de Cea, G., and Montiel-Ponsoda, E. (2012).  Term
>>         variants in ontologies. In Proceedings of the AESLA
>>         (/Asociación Española de Lingüística Aplicada/) Conference.
>>
>>
>>
>>
>>         2012/7/18 Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
>>
>>             Dear all,
>>
>>              and just to clarify what the description of the
>>             requirements should include:
>>
>>             Under "Description", there should be a general
>>             description of the requirement, its implications, etc. It
>>             is important that we think here in terms of requirements
>>             on the general model, not on particular data categories,
>>             properties, etc. but on requirements at the meta-model level.
>>
>>             Under "Relevant Use Cases": here we should just list the
>>             IDs of the use cases touched by this requirement. Maybe
>>             this should be called "Affected Use Cases" ???
>>
>>             "Relation to Use Case": here we should give detailed
>>             examples from the use cases where the requirement is
>>             important, thus grounding our requirements in the use
>>             cases we have collected.
>>
>>             If there are any questions on this, just shoot.
>>
>>             Best regards,
>>
>>             Philipp.
>>
>>
>>
>>             On 18.07.12 14:24, Philipp Cimiano wrote:
>>
>>                 Dear ontolex members,
>>
>>                  during our last meeting on the 6th of July, we
>>                 discussed my condensed list of requirements on the
>>                 model and agreed that it looks promising to work on
>>                 the basis of these from now on.
>>
>>                 See here:
>>                 http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements
>>
>>                 The older list of unstructured requirements is linked
>>                 from the bottom of the page.
>>
>>                 We assigned the following people to produce a
>>                 first draft of each requirement and kick off the
>>                 discussions on this mailing list. (We really need to
>>                 start the discussion on the relevant issues!)
>>
>>                 - Express Meaning with respect to ontology:
>>                 John/Philipp/Aldo/Guido
>>                 - Lexical Variation and Paraphrases: Philipp
>>                 - Relation between lexical entries: Lupe/Elena
>>                 - Lexical and linguistic properties of lexical
>>                 entries: John/Philipp
>>                 - Valence and Ontological Mapping: John/Philipp
>>                 - High-Order Predicate Mapping: John/Philipp
>>                 - Lexico-Syntactic Patterns: Elena/Dagmar
>>                 - Metadata about lexicon: Armando
>>                 - Modelling lexical resources: John/Aldo
>>
>>                 The goal would be to have a detailed specification
>>                 and an ongoing discussion on this mailinglist by end
>>                 of August.
>>
>>                 The next teleconference will be on September 6th,
>>                 15:00 - 17:00 (CET). It will be two hours as we
>>                 decided to skip the one in August due to holiday period.
>>
>>                 We also decided to have biweekly teleconferences from
>>                 September on. I think it is important to keep things
>>                 moving quickly. Otherwise I have the feeling that not
>>                 much happens in between our monthly teleconferences.
>>
>>                 I am now on holidays for two weeks and will then
>>                 start working on the requirements assigned to me.
>>                 Needless to say, everyone should feel free to start
>>                 working on their requirements as soon as possible.
>>
>>                 If you think that an important requirement is
>>                 missing, please post it on the list and we will
>>                 discuss it.
>>
>>                 Best regards,
>>
>>                 Philipp.
>>
>>
>>
>>             -- 
>>             Prof. Dr. Philipp Cimiano
>>             Semantic Computing Group
>>             Excellence Cluster - Cognitive Interaction Technology (CITEC)
>>             University of Bielefeld
>>
>>             Phone: +49 521 106 12249
>>             Fax: +49 521 106 12412
>>             Mail: cimiano@cit-ec.uni-bielefeld.de
>>
>>             Room H-127
>>             Morgenbreede 39
>>             33615 Bielefeld
>>
>>
>>
>>
>
>     -- 
>     «Null hull,» sa Harry    | – Bögga? sagði Erlendur. Er það orð?
>     (Jo Nesbø: Kakerlakkene) | (Arnaldur Indriðason: Mýrin)
>     --
>     Lars Borin
>     Språkbanken • Centre for Language Technology
>     Institutionen för svenska språket
>     Göteborgs universitet
>     Box 200
>     SE-405 30 Göteborg
>     Sweden
>
>     office +46 (0)31 786 4544
>     mobile +46 (0)70 747 8386
>
>     <http://språkbanken.gu.se/personal/lars/>
>
>

-- 
«Null hull,» sa Harry    | – Bögga? sagði Erlendur. Er það orð?
(Jo Nesbø: Kakerlakkene) | (Arnaldur Indriðason: Mýrin)
--
Lars Borin
Språkbanken • Centre for Language Technology
Institutionen för svenska språket
Göteborgs universitet
Box 200
SE-405 30 Göteborg
Sweden

office +46 (0)31 786 4544
mobile +46 (0)70 747 8386

<http://språkbanken.gu.se/personal/lars/>

Received on Thursday, 23 August 2012 08:06:28 UTC