Re: relations about lexical entries

Dear John, all,


Since we won’t be able to attend Friday’s meeting, we are sending you some
suggestions regarding the classification of relations included in the
“*Specification of Requirements/Properties-and-Relations-of-Entries*”.

We read John's classification, which is very complete, and we have some
comments and suggestions for discussion/modification.

1.      Regarding *orthographic variants*, we find the classification very
exhaustive, and we wonder whether that level of detail is really needed.
Also, what you call Historical Orthographic variants have traditionally
been considered Diachronic variants, and Geo-orthographic variants have
been considered Diatopic variants (as we suggest under Terminological
variants). However, we agree that we could have a property describing the
orthography in a more general sense.

2.      Regarding the classification of *Comparatives and superlatives*
within Single lexeme morphological variation, it is not always the case
that comparatives and superlatives consist of one lexeme (English: more
difficult; Spanish: más guapo/a). Moreover, if those cases are the ones you
had in mind, as the adjective is a separate lexeme, then better, longer,
richer and easiest are bound lexemes.

3.      Moreover, you include Affixial (shouldn’t it be Affixal?)
Derivation under Multi-lexeme variation, but the examples given are
single lexemes.

For these reasons, we would suggest talking about *morphological variation*
or *affixation* in general, and following a more canonical definition of
affixes that classifies them into *derivational affixes* (lexicon, lexical,
lexicalize) and *inflexional affixes* (-s for the English plural, or -ed
for the past tense), which would include what you call Pluralization and
verb form inflection.

4.      On the other hand, what you call Rephrasing can be understood as an
explicative variant (because to rephrase is to explain in other words,
right?), so we would suggest removing it from Lexical variants and
including it under Terminological variants because, as we show below,
there are more examples.

5.      And regarding Pleonasm, we find many examples that are considered
pleonasms but cannot be considered lexical variants, such as “subir
arriba”, “bajar abajo” and “entrar dentro” in Spanish.

Therefore, the classification so far would be as follows:

   1. Orthographic variants
      a. Diatopic variants
      b. Diachronic variants
      c. Semantic-orthographic variants
   2. Morphological variants
      a. Affixal variants
         i. Derivational variants
         ii. Inflexional variants
      b. Compounds
      c. Abbreviations (including acronyms, among others; examples: peer to
      peer/P2P, WYSIWYG, FAO, UNESCO, etc.)

6. Moving now to the *terminological variants*, we would not make a
distinction between Pragmatic and Circumlocutive variants, but consider
them as terminological variants.

According to the traditional linguistic classification of variants, as
presented in
http://www.christianlehmann.eu/ling/variation/dimensions_of_variation.html,
following Coseriu (1981), stylistic variants are called diastratic and
depend on the user, whereas register variants are termed diaphasic and
depend on the usage (colloquial usage, familiar language, formal language);
they include professional jargon, which in fact deals with terminology.

Therefore, the classification of the *terminological variants* would be as
follows:

   1. Terminological variants
      1. Diastratic variants: stylistic or connotative variants (man and
      bloke)
      2. Diachronic variants (tuberculosis and phthisis)
      3. Diatopic variants: dialectal variants (gasoline vs. petrol)
      4. Diaphasic variants: pragmatic or register variants (headache and
      cephalalgia; swine flu and pig flu and H1N1 and Mexican pandemic flu)
      5. Rephrasing variants (immigration law and law for regulating and
      controlling immigration)
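
As a rough illustration only, the five subtypes above could be written down
as a small Python enumeration. The class name, the member names and the
short glosses are our own hypothetical choices and are not part of lemon;
the example pairs are taken from the list above.

```python
from enum import Enum


class TerminologicalVariant(Enum):
    """Hypothetical labels for the five proposed subtypes (not lemon vocabulary)."""
    DIASTRATIC = "stylistic or connotative"
    DIACHRONIC = "historical"
    DIATOPIC = "dialectal"
    DIAPHASIC = "pragmatic or register"
    REPHRASING = "explicative"


# One example pair per subtype, copied from the classification above.
EXAMPLES = {
    TerminologicalVariant.DIASTRATIC: ("man", "bloke"),
    TerminologicalVariant.DIACHRONIC: ("tuberculosis", "phthisis"),
    TerminologicalVariant.DIATOPIC: ("gasoline", "petrol"),
    TerminologicalVariant.DIAPHASIC: ("headache", "cephalalgia"),
    TerminologicalVariant.REPHRASING: (
        "immigration law",
        "law for regulating and controlling immigration",
    ),
}

for kind, (first, second) in EXAMPLES.items():
    print(f"{kind.name.lower()}: {first} / {second}")
```

A flat enumeration is enough here because the five subtypes sit at the same
level; whether they should be modelled as properties or as relation types
is left open.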

Finally, as for the *semantic variants,* we do not see these relations as
being at the same level as the rest of the variants, because we would not
consider hyponyms, for example, as being really “variants”. In this sense,
we would say that terminological variants would be relations between
Lexical Entries in lemon, and the so-called semantic variants would be
relations between Lexical Senses, as would be the case of translations.
(The example given for “Modification” could also be seen as a
Hypernym-hyponym relation). So, maybe, it is as simple as not referring to
them as “variants”, but as lexico-semantic “relations” (or simply
relations).
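
The distinction we have in mind could be sketched roughly as follows. This
is only an illustration under our own assumptions: the class names, the
fields and the "ex:" identifiers are hypothetical and do not reproduce the
lemon vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalSense:
    """One meaning of an entry; `reference` is a hypothetical ontology identifier."""
    reference: str


@dataclass(frozen=True)
class LexicalEntry:
    """A word or term in one language, with the senses it can express."""
    written_form: str
    language: str
    senses: tuple = ()


@dataclass(frozen=True)
class EntryRelation:
    """Terminological variation: a link between two lexical entries."""
    kind: str
    source: LexicalEntry
    target: LexicalEntry


@dataclass(frozen=True)
class SenseRelation:
    """Lexico-semantic relation: a link between two lexical senses."""
    kind: str
    source: LexicalSense
    target: LexicalSense


# A diatopic variant: two entries that share one and the same sense.
petrol_sense = LexicalSense("ex:Petrol")
gasoline = LexicalEntry("gasoline", "en", (petrol_sense,))
petrol = LexicalEntry("petrol", "en", (petrol_sense,))
diatopic = EntryRelation("diatopic", gasoline, petrol)

# A hyponymy relation: two distinct senses, so not a "variant" in our view.
mrsa = LexicalSense("ex:MRSA")
hospital_mrsa = LexicalSense("ex:HospitalAcquiredMRSA")
hyponym = SenseRelation("hyponym", hospital_mrsa, mrsa)

print(diatopic.source.senses == diatopic.target.senses)
print(hyponym.source != hyponym.target)
```

The point of keeping two relation types is that a variant relation never
changes the sense, whereas a sense relation always connects different
senses.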

After suggesting this new classification, we have come across very
interesting examples, and we are not sure where to include them... Do you
want to give it a try? :-)

General collector vs. Common collector

Electric equipment vs. Electric fittings

Normal running vs. Right working

Safety report vs. safety analysis

Water height vs. level of water

Control panel vs. control board

Total capacity vs. total volume

Limit of solubility vs. solubility threshold

Maybe they should be included under “Terminological variant” in general,
because we do not know what the motivation behind the term was, or because
it is not relevant (it can be simply trendy). In fact, from the
terminological viewpoint, it is sometimes clear that the term highlights
some salient feature (*impresora de margarita, impresora de tinta,
impresora láser*, etc.; i.e., daisy-wheel printer, inkjet printer, laser
printer), but we’re not sure whether this is interesting for this purpose.

Kind regards

Elena & Lupe



2012/8/20 John McCrae <jmccrae@cit-ec.uni-bielefeld.de>

> Hi all,
>
> Can I suggest we merge the following requirements on "Lexical Variant and
> Paraphrases" and "Lexical and linguistic properties of lexical entries"?
>
> My reasoning is that it seems that what Lupe is suggesting relies heavily
> on the definition of properties. i.e., to model geographical variants,
> register variants or diachronic variants, we need to be able to state the
> geographical, register or diachronic properties of the two variants. As
> such *we can think of variation in terms of the properties that vary* and
> those that do not. Put more clearly, variants are entries that are similar
> (have the same property values) except for some property, e.g., translation
> is variation in language, pluralization is variation in number, etc.
>
> Considering the list of variants above, the following properties are
> preserved by each type of variation:
>
>    - Orthographic variants. Preserved: Pronunciation, syntax, most
>    pragmatic, semantic properties. Differs: Generally context or geographic
>    usage
>    - Inflectional variants. Preserved: part-of-speech, pragmatic and
>    semantic properties.
>    - Morphosyntactic variants. Preserved: semantic properties, most
>    pragmatic properties.
>    - Stylistic+Register variants. Preserved: semantics.
>    - Diachronic variants. Preserved: semantics
>    - Dialectal variants. Preserved: semantics
>    - Explicative variants. Preserved: extensional semantics (not
>    intensional)
>    - Semantic variants. Preserved: partial semantics
>
> As such I would go for splitting up the categories as follows
>
> *Group 1a. Orthographic variants*
>
>    - Historical Orthographic variants. e.g., different scripts such as
>    for Azeri (http://en.wikipedia.org/wiki/Azerbaijani_alphabet)
>    - Geo-orthographic variants. e.g., "localize" vs. "localise"
>    - Semantic-orthographic variants. e.g., "取る" (toru - "to take (remove
>    from a location)") vs "撮る" (toru - "to take (a photo)")
>
> *Group 1b. Inflectional variants*
>
>    - Pluralization, verb form inflection, comparatives and superlatives
>    - Synthesis (see http://en.wikipedia.org/wiki/Synthetic_language)
>
> *Group 1c. Morphosyntactic Variants*
>
>    - Rephrasing: e.g., "cancer of the mouth" vs. "mouth cancer"
>    - Derivation (e.g., Nominalization): e.g., "lexicon", "lexical",
>    "lexicalize"
>    - Pleonasm: "tuna" vs "tuna fish"
>    - Abbreviation: e.g., AIDS..... Philipp> Any variation has some
>    (slight) pragmatic implication, abbreviation for me is morphosyntactic as
>    the motivation is brevity rather than connotation.
>
> *Group 2a. Pragmatic Variants*
>
>
>    - stylistic or connotative variants (man and bloke)
>    - diachronic variants (tuberculosis and phthisis)
>    - dialectal variants (gasoline vs. petrol)
>    - pragmatic or register variants (headache and cephalalgia; swine flu
>    and pig flu and H1N1 and Mexican pandemic flu)
>
> *Group 2b. Circumlocutive variants*
>
>
>    - explicative variants (immigration law and law for regulating and
>    controlling immigration)
>
> *Group 3. Non-synonymous variants*
>
>    - Modification: "MRSA", vs "hospital-acquired MRSA"
>    - Hypernym/Hyponymy/Antonymy
>    - Cross-lingual narrowing/broadening: "river" vs "rivière/fleuve"
>
>
> Does this sound sensible or did I miss something?
>
> Regards,
> John
>
> On Sun, Aug 5, 2012 at 9:56 PM, lupe aguado <gac280771@gmail.com> wrote:
>
>> Dear Ontolex members
>>
>> With this message we would like to start the discussion about the
>> requirements on “Relations between lexical entries”. I put the message as a
>> draft in the Ontolex community Group and forgot to send it to you. Sorry!
>>
>> In our opinion, two types of relations need to be taken into account in
>> an ontology-lexicon model:
>>
>>    1. *relations between labels in different natural languages,* and
>>    2. *relations between labels within the same natural language.*
>>
>> Before continuing, we would like to define the two scenarios that we
>> envisage:
>>
>>    A. *Multilingual labeling approach*
>>
>> In a multilingual labeling approach, we have a single conceptual
>> structure, and we provide alternative labeling information in the
>> ontology-lexicon model for each of the languages covered (in the same
>> language or in different languages). This is possible whenever the
>> languages covered share a single view on a certain domain. In this case,
>> there will always be one or several labels in each natural language for
>> naming or terming the concepts in the ontology.
>>
>>    B. *Cross-lingual linking or mapping approach*
>>
>> In this second scenario, there exist two independent monolingual
>> ontologies, defined in different languages, but covering the same or
>> similar subject domain. We aim at establishing links between the labels
>> that describe the two ontologies. The establishment of these cross-lingual
>> links could result in cross-lingual ontology mappings. In this scenario,
>> the conceptual structure of each ontology is modeled independently, and
>> “linguistic links” or “mappings” can be established between the two.
>>
>> ---------
>>
>> Now, in a *multilingual labeling approach*, we will usually refer to
>> “cross-lingual equivalents”.  Let us take for example an ontology of
>> medical conditions. In such an ontology we can find terms such as menopause
>> in English, and its cross-lingual equivalents: ménopause in French,
>> menopause in Danish, vaihdevuodet in Finnish or Menopause in German. This
>> means that the “same” concept exists in the involved cultures and has an
>> equivalent term in the corresponding language.
>>
>> On the contrary, in a *cross-lingual linking or mapping approach*, we
>> could come across several types of relations among lexical entries due to
>> the following reasons:
>>
>>    - conceptualization mismatches
>>    - different levels of granularity
>>
>> In fact, granularity or viewpoint differences may also come up in a
>> “monolingual” linking or mapping approach. However, conceptualization
>> mismatches will be more common in a cross-lingual scenario. In this sense,
>> we could account for several types of relations
>>
>> 1.            *Cross-lingual equivalence relations*, as in the
>> multilingual labeling scenario. These would establish a relation between
>> concepts that are not exactly the same (do not have the same intension
>> and/or extension), but are close equivalents, because no exact equivalent
>> exists.  Example: full professor in English – catedrático in Spanish –
>> Professor in German. In order to distinguish them from the cross-lingual
>> equivalents in the multilingual labeling scenario, we could term them: *cross-lingual
>> close equivalents*? *Cross-lingual near equivalents*? Suggestions are
>> welcome!!
>>
>> 2.            *Cross-lingual broad (narrow) equivalence relations*.
>> These would establish a relation between concepts with different levels of
>> granularity. This usually happens when one culture understands a concept or
>> phenomenon with a higher granularity than the other, i.e., one culture has
>> two or more concepts (and in its turn, terms for naming them) to describe
>> the same phenomenon. Example: river in English – rivière and fleuve in
>> French; Tötung in German – asesinato and homicidio in Spanish. Here again,
>> suggestions for better examples are welcome.
>>
>> If no equivalent exists, we could still provide a term or description,
>> using for this a mixed scenario, i.e., providing some labels or lexical
>> entries for the concept for which we do not find an equivalent term in the
>> other ontology, as in the multilingual labeling approach. For this, we
>> consider two options:
>>
>> 3.            *Literal translation relations*. These are translations of
>> terms that describe concepts that do not exist in the target language, and
>> for which a literal or “word for word” translation is provided so that the
>> concept can be understood in the target language. Example: École normale
>> in French – (French) Normal School in English; Presidente del Gobierno in
>> Spanish – President of the Government in English.
>>
>> 4.            *Descriptive translation relations.* These are
>> translations of terms that describe concepts that do not exist in the
>> target language, and for which a description or definition (and not a term)
>> is provided in the target language. Example: Panettone in Italian – bizcocho
>> italiano que se consume en Nochevieja (“Italian sponge cake eaten on New
>> Year’s Eve”) in Spanish. In this case, we could also opt for repeating the
>> Italian word plus the gloss.
>>
>> In the latter two cases, we could also provide a link to the closest
>> equivalent or superclass (by means of the cross-lingual broad equivalence
>> relation), and additionally provide a literal or descriptive translation.
>>
>> -------
>>
>> As for the *relations* *between labels within the same language*, we
>> propose to talk about “term variation”.  For example:  what is the
>> difference between Advertising and Publicity, if any? And between
>> Contamination and Pollution?, or between Assisted conception, Artificial
>> insemination and in vitro Fertilization? In a SKOS Thesaurus, Assisted
>> conception is the main label, and the rest are alternative labels. However,
>> we think that we could be more specific regarding the type of variants
>> pointing to one and the same concept in the ontology, and that this should
>> be accounted for in our ontology-lexicon model. Sometimes, the difference
>> is a consequence of the contextual (pragmatic) usage, and we have to decide
>> whether to represent this in our model.
>>
>> Based on previous classifications of terminology variation, we have
>> identified three main groups of term variants that include the following
>> types (see also [1] and [2]):
>>
>> *Group 1*. Synonyms or terminological units that totally correspond to
>> the same concept:
>>
>>    - graphical and orthographical variants (*localization* and
>>    *localisation*);
>>    - inflectional variants (*cat* and *cats*);
>>    - morphosyntactic variants (*nitrogen fixation* and *fixation of
>>    nitrogen*).
>>
>> *Group 2*. Partial synonyms or terminological units that highlight
>> different aspects of the same concept:
>>
>>    - stylistic or connotative variants (*man* and *bloke*)
>>    - diachronic variants (*tuberculosis* and *phthisis*)
>>    - dialectal variants (*gasoline* vs. *petrol*)
>>    - pragmatic or register variants (*headache* and *cephalalgia*; *swine
>>    flu* and *pig flu* and *H1N1* and *Mexican pandemic flu*)
>>    - explicative variants (*immigration law* and *law for regulating and
>>    controlling immigration*)
>>
>> So, we would be very grateful for your suggestions and comments on this
>> proposal.
>>
>> Best regards,
>>
>> Lupe and Elena
>>
>> [1] Montiel-Ponsoda, E., Aguado de Cea, G., McCrae, J. (2011).
>> Representing term variation in *lemon*. In Proceedings of the *WS 2:
>> Ontology and lexicon: new insights, TIA 2011 - 9th International
>> Conference on Terminology and Artificial Intelligence*, pp. 47–50.
>>
>> [2] Aguado de Cea, G., and Montiel-Ponsoda, E. (2012).  Term variants in
>> ontologies. In Proceedings of the AESLA (*Asociación Española de
>> Lingüística Aplicada*) Conference.
>>
>>
>>
>> 2012/7/18 Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
>>
>>> Dear all,
>>>
>>>  and just to clarify what the description of the requirements should
>>> include:
>>>
>>> Under "Description", there should be a general description of the
>>> requirement, its implications, etc. It is important that we think here in
>>> terms of requirements on the general model, not on particular data
>>> categories, properties, etc. but on requirements at the meta-model level.
>>>
>>> Under "Relevant Use Cases": here we should just list the IDs of the use
>>> cases touched by this requirement. Maybe this should be called "Affected
>>> Use Cases" ???
>>>
>>> "Relation to Use Case": here we should give detailed examples from the
>>> use cases where the requirement is important, thus grounding our
>>> requirements in the use cases we have collected.
>>>
>>> If there are any questions on this, just shoot.
>>>
>>> Best regards,
>>>
>>> Philipp.
>>>
>>>
>>>
>>> Am 18.07.12 14:24, schrieb Philipp Cimiano:
>>>
>>>  Dear ontolex members,
>>>>
>>>>  during our last meeting on the 6th of July, we discussed my condensed
>>>> list of requirements on the model and agreed that it looks promising to
>>>> work on the basis of these from now on.
>>>>
>>>> See here:
>>>> http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements
>>>>
>>>> The older list of unstructured requirements is linked from the bottom
>>>> of the page.
>>>>
>>>> We assigned the following people to produce a first draft of each
>>>> requirement and kick off the discussions on this mailing list. (We really
>>>> need to start the discussion on the relevant issues!)
>>>>
>>>> - Express Meaning with respect to ontology: John/Philipp/Aldo/Guido
>>>> - Lexical Variation and Paraphrases: Philipp
>>>> - Relation between lexical entries: Lupe/Elena
>>>> - Lexical and linguistic properties of lexical entries: John/Philipp
>>>> - Valence and Ontological Mapping: John/Philipp
>>>> - High-Order Predicate Mapping: John/Philipp
>>>> - Lexico-Syntactic Patterns: Elena/Dagmar
>>>> - Metadata about lexicon: Armando
>>>> - Modelling lexical resources: John/Aldo
>>>>
>>>> The goal would be to have a detailed specification and an ongoing
>>>> discussion on this mailing list by the end of August.
>>>>
>>>> The next teleconference will be on September 6th, 15:00 - 17:00 (CET).
>>>> It will be two hours as we decided to skip the one in August due to the
>>>> holiday period.
>>>>
>>>> We also decided to have biweekly teleconferences from September on. I
>>>> think it is important to keep things moving quickly. Otherwise I have the
>>>> feeling that not much happens in between our monthly teleconferences.
>>>>
>>>> I am now on holidays for two weeks and will then start working on the
>>>> requirements assigned to me.
>>>> Needless to say, everyone should feel free to start working on their
>>>> requirements as soon as possible.
>>>>
>>>> If you think that an important requirement is missing, please post it
>>>> on the list and we will discuss it.
>>>>
>>>> Best regards,
>>>>
>>>> Philipp.
>>>>
>>>>
>>>
>>> --
>>> Prof. Dr. Philipp Cimiano
>>> Semantic Computing Group
>>> Excellence Cluster - Cognitive Interaction Technology (CITEC)
>>> University of Bielefeld
>>>
>>> Phone: +49 521 106 12249
>>> Fax: +49 521 106 12412
>>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>>
>>> Room H-127
>>> Morgenbreede 39
>>> 33615 Bielefeld
>>>
>>>
>>>
>>
>

Received on Wednesday, 10 October 2012 17:06:11 UTC