relations about lexical entries

Dear Ontolex members

With this message we would like to start the discussion about the
requirements on “Relations between lexical entries”. I put the message as a
draft in the Ontolex community Group and forgot to send it to you. Sorry!

In our opinion, two types of relations need to be taken into account in an
ontology-lexicon model:

   1. *relations between labels in different natural languages,* and
   2. *relations between labels within the same natural language.*

Before continuing, we would like to define the two scenarios that we
envisage:

*A. Multilingual labeling approach*

In a multilingual labeling approach, we have a single conceptual structure,
and we provide alternative labeling information in the ontology-lexicon
model for each of the languages covered (in the same language or in
different languages). This is possible whenever the languages covered share
a single view on a certain domain. In this case, there will always be one
or several labels in each natural language for naming or terming the
concepts in the ontology.

*B. Cross-lingual linking or mapping approach*

In this second scenario, there exist two independent monolingual
ontologies, defined in different languages, but covering the same or
similar subject domain. We aim at establishing links between the labels
that describe the two ontologies. The establishment of these cross-lingual
links could result in cross-lingual ontology mappings. In this scenario,
the conceptual structure of each ontology is modeled independently, and
“linguistic links” or “mappings” can be established between the two.

---------

Now, in a *multilingual labeling approach*, we will usually refer to
“cross-lingual equivalents”.  Let us take for example an ontology of
medical conditions. In such an ontology we can find terms such as menopause
in English, and its cross-lingual equivalents: ménopause in French,
menopause in Danish, vaihdevuodet in Finnish or Menopause in German. This
means that the “same” concept exists in the involved cultures and has an
equivalent term in the corresponding language.
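
As a minimal sketch of the multilingual labeling scenario, the following
Python fragment attaches labels in several languages to one concept. The
class name `Concept`, its fields and the concept identifier are assumptions
made for illustration only; they are not part of any agreed Ontolex model.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A single node of the shared conceptual structure (hypothetical class)."""
    concept_id: str
    # Maps a language tag to the labels that name this concept in that language.
    labels: dict = field(default_factory=dict)

    def add_label(self, lang, term):
        """Register one more label for the given language."""
        self.labels.setdefault(lang, []).append(term)

    def equivalents(self, lang):
        """Return the labels for one language (empty list if there are none)."""
        return list(self.labels.get(lang, []))


# One concept, several cross-lingual equivalents (examples from the text above).
menopause = Concept("menopause")  # identifier is illustrative only
menopause.add_label("en", "menopause")
menopause.add_label("fi", "vaihdevuodet")
menopause.add_label("de", "Menopause")

print(menopause.equivalents("fi"))
```

The point of the sketch is that all labels hang off one concept, so no
relation between the labels themselves has to be stated explicitly.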

On the contrary, in a *cross-lingual linking or mapping approach*, we could
come across several types of relations among lexical entries due to the
following reasons:

   - conceptualization mismatches
   - different levels of granularity

In fact, granularity or viewpoint differences may also come up in a
“monolingual” linking or mapping approach. However, conceptualization
mismatches will be more common in a cross-lingual scenario. In this sense,
we could account for several types of relations:

1. *Cross-lingual equivalence relations*, as in the multilingual
labeling scenario. These would establish a relation between concepts that
are not exactly the same (do not have the same intension and/or extension),
but are close equivalents, because no exact equivalent exists.  Example:
full professor in English – catedrático in Spanish – Professor in German.
In order to distinguish them from the cross-lingual equivalents in the
multilingual labeling scenario, we could term them: *cross-lingual close
equivalents*? *Cross-lingual near equivalents*? Suggestions are welcome!!

2. *Cross-lingual broad (narrow) equivalence relations*. These
would establish a relation between concepts with different levels of
granularity. This usually happens when one culture understands a concept or
phenomenon with a higher granularity than the other, i.e., one culture has
two or more concepts (and in its turn, terms for naming them) to describe
the same phenomenon. Example: river in English – rivière and fleuve in
French; Tötung in German – asesinato and homicidio in Spanish. Here again,
suggestions for better examples are welcome.

If no equivalent exists, we could still provide a term or description
through a mixed scenario, i.e., by providing labels or lexical entries for
the concept that has no equivalent term in the other ontology, as in the
multilingual labeling approach. For this, we consider two options:

3. *Literal translation relations*. These are translations of
terms that describe concepts that do not exist in the target language, and
for which a literal or “word for word” translation is provided so that the
concept is understood in the target language. Example: École normale in
French – (French) Normal School in English; Presidente del Gobierno in
Spanish – President of the Government in English.

4. *Descriptive translation relations.* These are translations
of terms that describe concepts that do not exist in the target language,
and for which a description or definition (and not a term) is provided in
the target language. Example: Panettone in Italian – bizcocho italiano que
se consume en Nochevieja in Spanish. In this case, we could also opt for
repeating the Italian word plus the gloss.

In the latter two cases, we could also provide a link to the closest
equivalent or superclass (by means of the cross-lingual broad equivalence
relation), and additionally provide a literal or descriptive translation.
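
To make the four proposed relation types easier to compare, here is a small
Python sketch that represents them as typed links between lexical entries of
two ontologies. All names (`CrossLingualRelation`, `Entry`, `Link`,
`targets_of`) and the choice of "near equivalent" as a label are assumptions
for illustration; the terminology is still open for discussion.

```python
from dataclasses import dataclass
from enum import Enum


class CrossLingualRelation(Enum):
    """Hypothetical names for the relation types proposed above."""
    NEAR_EQUIVALENT = "near equivalent"
    # Read from source to target: the target concept is narrower / broader.
    HAS_NARROWER_EQUIVALENT = "has narrower equivalent"
    HAS_BROADER_EQUIVALENT = "has broader equivalent"
    LITERAL_TRANSLATION = "literal translation"
    DESCRIPTIVE_TRANSLATION = "descriptive translation"


@dataclass(frozen=True)
class Entry:
    """A lexical entry: a written form plus its language tag."""
    written_form: str
    lang: str


@dataclass(frozen=True)
class Link:
    """A typed, directed link between two lexical entries."""
    source: Entry
    target: Entry
    relation: CrossLingualRelation


R = CrossLingualRelation
links = [
    Link(Entry("full professor", "en"), Entry("catedrático", "es"), R.NEAR_EQUIVALENT),
    Link(Entry("river", "en"), Entry("rivière", "fr"), R.HAS_NARROWER_EQUIVALENT),
    Link(Entry("river", "en"), Entry("fleuve", "fr"), R.HAS_NARROWER_EQUIVALENT),
    Link(Entry("Presidente del Gobierno", "es"),
         Entry("President of the Government", "en"), R.LITERAL_TRANSLATION),
]


def targets_of(all_links, written_form, relation):
    """Return the target written forms linked from a source form by one relation."""
    return [
        link.target.written_form
        for link in all_links
        if link.source.written_form == written_form and link.relation == relation
    ]


print(targets_of(links, "river", R.HAS_NARROWER_EQUIVALENT))
```

Keeping the relation type on the link, rather than on either entry, lets the
same entry take part in several relations, for instance a broad equivalence
plus a descriptive translation.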

-------

As for the *relations between labels within the same language*, we
propose to talk about “term variation”. For example: what is the
difference between Advertising and Publicity, if any? And between
Contamination and Pollution? Or between Assisted conception, Artificial
insemination and in vitro Fertilization? In a SKOS Thesaurus, Assisted
conception is the main label, and the rest are alternative labels. However,
we think that we could be more specific regarding the type of variants
pointing to one and the same concept in the ontology, and that this should
be accounted for in our ontology-lexicon model. Sometimes, the difference
is a consequence of the contextual (pragmatic) usage, and we have to decide
whether to represent this in our model.

Based on previous classifications of terminology variation, we have
identified three main groups of term variants that include the following
types (see also [1] and [2]):

*Group 1*. Synonyms or terminological units that totally correspond to the
same concept:

   - graphical and orthographical variants (*localization* and
   *localisation*);
   - inflectional variants (*cat* and *cats*);
   - morphosyntactic variants (*nitrogen fixation* and *fixation of
   nitrogen*).

*Group 2*. Partial synonyms or terminological units that highlight
different aspects of the same concept:

   - stylistic or connotative variants (*man* and *bloke*)
   - diachronic variants (*tuberculosis* and *phthisis*)
   - dialectal variants (*gasoline* vs. *petrol*)
   - pragmatic or register variants (*headache* and *cephalalgia*; *swine
   flu* and *pig flu* and *H1N1* and *Mexican pandemic flu*)
   - explicative variants (*immigration law* and *law for regulating and
   controlling immigration*)
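
The variant types listed in Groups 1 and 2 could be captured along the
following lines. This is only a sketch: the names `VariantType`, `GROUP_OF`,
`Variant` and `is_full_synonym` are hypothetical, and the grouping simply
mirrors the two groups listed above.

```python
from dataclasses import dataclass
from enum import Enum


class VariantType(Enum):
    """Variant types from Groups 1 and 2 above (hypothetical identifiers)."""
    ORTHOGRAPHIC = "graphical or orthographical"
    INFLECTIONAL = "inflectional"
    MORPHOSYNTACTIC = "morphosyntactic"
    STYLISTIC = "stylistic or connotative"
    DIACHRONIC = "diachronic"
    DIALECTAL = "dialectal"
    REGISTER = "pragmatic or register"
    EXPLICATIVE = "explicative"


# Group 1: variants that fully correspond to the same concept.
# Group 2: partial synonyms that highlight different aspects of it.
GROUP_OF = {
    VariantType.ORTHOGRAPHIC: 1,
    VariantType.INFLECTIONAL: 1,
    VariantType.MORPHOSYNTACTIC: 1,
    VariantType.STYLISTIC: 2,
    VariantType.DIACHRONIC: 2,
    VariantType.DIALECTAL: 2,
    VariantType.REGISTER: 2,
    VariantType.EXPLICATIVE: 2,
}


@dataclass(frozen=True)
class Variant:
    """A term recorded as a variant of another term in the same language."""
    term: str
    variant_of: str
    kind: VariantType


def is_full_synonym(variant):
    """True if the variant type belongs to Group 1."""
    return GROUP_OF[variant.kind] == 1


spelling = Variant("localisation", "localization", VariantType.ORTHOGRAPHIC)
dialect = Variant("petrol", "gasoline", VariantType.DIALECTAL)

print(is_full_synonym(spelling), is_full_synonym(dialect))
```

Recording the variant type explicitly is what distinguishes this from a
plain list of alternative labels, where all variants look alike.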

So, we would be very grateful for your suggestions and comments on this
proposal.

Best regards,

Lupe and Elena

[1] Montiel-Ponsoda, E., Aguado de Cea, G., McCrae, J. (2011). Representing
term variation in *lemon*. In Proceedings of the *WS 2 Ontology and lexicon:
new insights, TIA 2011 - 9th International Conference on Terminology and
Artificial Intelligence*, pp. 47–50.

[2] Aguado de Cea, G., and Montiel-Ponsoda, E. (2012).  Term variants in
ontologies. In Proceedings of the AESLA (*Asociación Española de
Lingüística Aplicada*) Conference.



2012/7/18 Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>

> Dear all,
>
>  and just to clarify what the description of the requirements should
> include:
>
> Under "Description", there should be a general description of the
> requirement, its implications, etc. It is important that we think here in
> terms of requirements on the general model, not on particular data
> categories, properties, etc. but on requirements at the meta-model level.
>
> Under "Relevant Use Cases": here we should just list the IDs of the use
> cases touched by this requirement. Maybe this should be called "Affected
> Use Cases" ???
>
> "Relation to Use Case": here we should give detailed examples from the use
> cases where the requirement is important, thus grounding our requirements
> in the use cases we have collected.
>
> If there are any questions on this, just shoot.
>
> Best regards,
>
> Philipp.
>
>
>
> On 18.07.12 14:24, Philipp Cimiano wrote:
>
>  Dear ontolex members,
>>
>>  during our last meeting on the 6th of July, we discussed my condensed
>> list of requirements on the model and agreed that it looks promising to
>> work on the basis of these from now on.
>>
>> See here: http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements
>>
>> The older list of unstructured requirements is linked from the bottom of
>> the page.
>>
>> We assigned the following people to produce a first draft of each
>> requirement and kick off the discussions on this mailing list. (We really
>> need to start the discussion on the relevant issues!)
>>
>> - Express Meaning with respect to ontology: John/Philipp/Aldo/Guido
>> - Lexical Variation and Paraphrases: Philipp
>> - Relation between lexical entries: Lupe/Elena
>> - Lexical and linguistic properties of lexical entries: John/Philipp
>> - Valence and Ontological Mapping: John/Philipp
>> - High-Order Predicate Mapping: John/Philipp
>> - Lexico-Syntactic Patterns: Elena/Dagmar
>> - Metadata about lexicon: Armando
>> - Modelling lexical resources: John/Aldo
>>
>> The goal would be to have a detailed specification and an ongoing
>> discussion on this mailinglist by end of August.
>>
>> The next teleconference will be on September 6th, 15:00 - 17:00 (CET). It
>> will be two hours as we decided to skip the one in August due to holiday
>> period.
>>
>> We also decided to have biweekly teleconferences from September on. I
>> think it is important to keep things moving quickly. Otherwise I have the
>> feeling that not much happens in between our monthly teleconferences.
>>
>> I am now on holidays for two weeks and will then start working on the
>> requirements assigned to me.
>> Needless to say, everyone should feel free to start working on their
>> requirements as soon as possible.
>>
>> If you think that an important requirement is missing, please post it on
>> the list and we will discuss it.
>>
>> Best regards,
>>
>> Philipp.
>>
>>
>
> --
> Prof. Dr. Philipp Cimiano
> Semantic Computing Group
> Excellence Cluster - Cognitive Interaction Technology (CITEC)
> University of Bielefeld
>
> Phone: +49 521 106 12249
> Fax: +49 521 106 12412
> Mail: cimiano@cit-ec.uni-bielefeld.de
>
> Room H-127
> Morgenbreede 39
> 33615 Bielefeld
>
>
>

Received on Sunday, 5 August 2012 19:57:06 UTC