- From: John McCrae <jmccrae@cit-ec.uni-bielefeld.de>
- Date: Fri, 28 Sep 2012 15:35:32 +0200
- To: public-ontolex@w3.org
- Message-ID: <CAC5njqp4qtYBAsWp+MvY+O8=-u3-DYUFQ6cQfRuzkvwGLNtpxQ@mail.gmail.com>
Hi,

I summarized the discussion of this thread here:

http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements/Properties-and-Relations-of-Entries

Can you please check that it is a good summary?

Regards,
John

On Fri, Aug 24, 2012 at 1:01 PM, John McCrae <jmccrae@cit-ec.uni-bielefeld.de> wrote:

> Sorry, I think in that case I meant to merge "Relations between lexical
> entries" and "Lexical and linguistic properties of lexical entries".
>
> Regards,
> John
>
>
> On Fri, Aug 24, 2012 at 12:09 PM, Philipp Cimiano <
> cimiano@cit-ec.uni-bielefeld.de> wrote:
>
>> John, all,
>>
>> concerning the "lexical variants and paraphrases" I had something quite
>> different in mind. The requirement I had in mind here is that the lexicon
>> should capture different (lexicalized) constructions for expressing one and
>> the same concept or property.
>>
>> Take the property leaderOfGroup(X,Y). This can be expressed in very
>> different ways:
>>
>> X leads Y
>> X heads Y
>> X is the leader of Y
>> X is the head of Y
>> X is the boss of Y
>> Y's leader X
>> Y's leader/head is X
>>
>> The same holds for a property like artist(X,Y), which can be expressed as:
>>
>> Y created X
>> Y is the creator of X
>> Y painted X (if it is a painting)
>> X's painter/creator is Y
>>
>> etc. etc.
>>
>> This is what I meant by lexical variants and paraphrases. And I think
>> this should not be conflated with your list of variants 1-3 (which I agree
>> with, actually ;-)
>>
>> Best regards,
>>
>> Philipp.
>>
>>
>> On 20.08.12 18:34, John McCrae wrote:
>>
>> Hi all,
>>
>> Can I suggest we merge the following requirements on "Lexical Variant and
>> Paraphrases" and "Lexical and linguistic properties of lexical entries"?
>>
>> My reasoning is that it seems that what Lupe is suggesting relies heavily
>> on the definition of properties.
>> That is, to model geographical variants,
>> register variants or diachronic variants, we need to be able to state the
>> geographical, register or diachronic properties of the two variants. As
>> such, *we can think of variation in terms of the properties that vary*
>> and those that do not. Put more clearly, variants are entries that are
>> similar (have the same property values) except for some property, e.g.,
>> translation is variation in language, pluralization is variation in number,
>> etc.
>>
>> Considering the list of variants above, the following properties are
>> preserved by each type of variation:
>>
>>    - Orthographic variants. Preserved: pronunciation, syntax, most
>>    pragmatic, semantic properties. Differs: generally context or geographic
>>    usage
>>    - Inflectional variants. Preserved: part-of-speech, pragmatic and
>>    semantic properties.
>>    - Morphosyntactic variants. Preserved: semantic properties, most
>>    pragmatic properties.
>>    - Stylistic+Register variants. Preserved: semantics.
>>    - Diachronic variants. Preserved: semantics
>>    - Dialectal variants. Preserved: semantics
>>    - Explicative variants. Preserved: extensional semantics (not
>>    intensional)
>>    - Semantic variants. Preserved: partial semantics
>>
>> As such, I would go for splitting up the categories as follows:
>>
>> *Group 1a. Orthographic variants*
>>
>>    - Historical orthographic variants. e.g., different scripts such as
>>    for Azeri (http://en.wikipedia.org/wiki/Azerbaijani_alphabet)
>>    - Geo-orthographic variants. e.g., "localize" vs. "localise"
>>    - Semantic-orthographic variants. e.g., "取る" (toru - "to take
>>    (remove from a location)") vs "撮る" (toru - "to take (a photo)")
>>
>> *Group 1b. Inflectional variants*
>>
>>    - Pluralization, verb form inflection, comparatives and superlatives
>>    - Synthesis (see http://en.wikipedia.org/wiki/Synthetic_language)
>>
>> *Group 1c. Morphosyntactic Variants*
>>
>>    - Rephrasing: e.g., "cancer of the mouth" vs.
>>    "mouth cancer"
>>    - Derivation (e.g., Nominalization): e.g., "lexicon", "lexical",
>>    "lexicalize"
>>    - Pleonasm: "tuna" vs "tuna fish"
>>    - Abbreviation: e.g., AIDS. Philipp> Any variation has some
>>    (slight) pragmatic implication; abbreviation for me is morphosyntactic, as
>>    the motivation is brevity rather than connotation.
>>
>> *Group 2a. Pragmatic Variants*
>>
>>    - stylistic or connotative variants (man and bloke)
>>    - diachronic variants (tuberculosis and phthisis)
>>    - dialectal variants (gasoline vs. petrol)
>>    - pragmatic or register variants (headache and cephalalgia; swine flu
>>    and pig flu and H1N1 and Mexican pandemic flu)
>>
>> *Group 2b. Circumlocutive variants*
>>
>>    - explicative variants (immigration law and law for regulating and
>>    controlling immigration)
>>
>> *Group 3. Non-synonymous variants*
>>
>>    - Modification: "MRSA" vs "hospital-acquired MRSA"
>>    - Hypernymy/Hyponymy/Antonymy
>>    - Cross-lingual narrowing/broadening: "river" vs "rivière/fleuve"
>>
>>
>> Does this sound sensible or did I miss something?
>>
>> Regards,
>> John
>>
>> On Sun, Aug 5, 2012 at 9:56 PM, lupe aguado <gac280771@gmail.com> wrote:
>>
>>> Dear Ontolex members
>>>
>>> With this message we would like to start the discussion about the
>>> requirements on “Relations between lexical entries”. I put the message as a
>>> draft in the Ontolex community Group and forgot to send it to you. Sorry!
>>>
>>> In our opinion, two types of relations need to be taken into account in
>>> an ontology-lexicon model:
>>>
>>>    1. *relations between labels in different natural languages,* and
>>>    2. *relations between labels within the same natural language.*
>>>
>>> Before continuing, we would like to define the two scenarios that we
>>> envisage:
>>>
>>> A. *Multilingual labeling approach*
>>>
>>> In a multilingual labeling approach, we have a single conceptual
>>> structure, and we provide alternative labeling information in the
>>> ontology-lexicon model for each of the languages covered (in the same
>>> language or in different languages). This is possible whenever the
>>> languages covered share a single view on a certain domain. In this
>>> case, there will always be one or several labels in each natural language
>>> for naming or terming the concepts in the ontology.
>>>
>>> B. *Cross-lingual linking or mapping approach*
>>>
>>> In this second scenario, there exist two independent monolingual
>>> ontologies, defined in different languages, but covering the same or
>>> similar subject domain. We aim at establishing links between the labels
>>> that describe the two ontologies. The establishment of these cross-lingual
>>> links could result in cross-lingual ontology mappings. In this scenario,
>>> the conceptual structure of each ontology is modeled independently, and
>>> “linguistic links” or “mappings” can be established between the two.
>>>
>>> ---------
>>>
>>> Now, in a *multilingual labeling approach*, we will usually refer to
>>> “cross-lingual equivalents”. Let us take for example an ontology of
>>> medical conditions. In such an ontology we can find terms such as menopause
>>> in English, and its cross-lingual equivalents: ménopause in French,
>>> menopause in Danish, vaihdevuodet in Finnish or Menopause in German. This
>>> means that the “same” concept exists in the involved cultures and has an
>>> equivalent term in the corresponding language.
>>>
>>> By contrast, in a *cross-lingual linking or mapping approach*, we
>>> could come across several types of relations among lexical entries due to
>>> the following reasons:
>>>
>>>    - conceptualization mismatches
>>>    - different levels of granularity
>>>
>>> In fact, granularity or viewpoint differences may also come up in a
>>> “monolingual” linking or mapping approach. However, conceptualization
>>> mismatches will be more common in a cross-lingual scenario. In this sense,
>>> we could account for several types of relations:
>>>
>>>    1. *Cross-lingual equivalence relations*, as in the
>>>    multilingual labeling scenario. These would establish a relation between
>>>    concepts that are not exactly the same (do not have the same intension
>>>    and/or extension), but are close equivalents, because no exact equivalent
>>>    exists. Example: full professor in English – catedrático in Spanish –
>>>    Professor in German. In order to distinguish them from the cross-lingual
>>>    equivalents in the multilingual labeling scenario, we could term them
>>>    *cross-lingual close equivalents*? *Cross-lingual near equivalents*?
>>>    Suggestions are welcome!!
>>>
>>>    2. *Cross-lingual broad (narrow) equivalence relations*.
>>>    These would establish a relation between concepts with different levels of
>>>    granularity. This usually happens when one culture understands a concept or
>>>    phenomenon with a higher granularity than the other, i.e., one culture has
>>>    two or more concepts (and, in turn, terms for naming them) to describe
>>>    the same phenomenon. Example: river in English – rivière and fleuve in
>>>    French; Tötung in German – asesinato and homicidio in Spanish. Here again,
>>>    suggestions for better examples are welcome.
>>>
>>> In case no equivalent exists, we could still provide a term or
>>> description, using for this a mixed scenario, i.e., providing some labels
>>> or lexical entries for the concept for which we do not find an equivalent
>>> term in the other ontology, as in the multilingual labeling approach. For
>>> this, we consider two options:
>>>
>>>    3. *Literal translation relations*. These are translations
>>>    of terms that describe concepts that do not exist in the target language,
>>>    and for which a literal or “word for word translation” is provided so that
>>>    the concept is understood in the target language. Example: École normale in
>>>    French – (French) Normal School in English; Presidente del Gobierno in
>>>    Spanish – President of the Government in English.
>>>
>>>    4. *Descriptive translation relations.* These are
>>>    translations of terms that describe concepts that do not exist in the
>>>    target language, and for which a description or definition (and not a term)
>>>    is provided in the target language. Example: Panettone in Italian – bizcocho
>>>    italiano que se consume en Nochevieja in Spanish. In this case, we could
>>>    also opt for repeating the Italian word plus the gloss.
>>>
>>> In the latter two cases, we could also provide a link to the closest
>>> equivalent or superclass (by means of the cross-lingual broad equivalence
>>> relation), and additionally provide a literal or descriptive translation.
>>>
>>> -------
>>>
>>> As for the *relations between labels within the same language*, we
>>> propose to talk about “term variation”. For example: what is the
>>> difference between Advertising and Publicity, if any? And between
>>> Contamination and Pollution, or between Assisted conception, Artificial
>>> insemination and in vitro Fertilization? In a SKOS Thesaurus, Assisted
>>> conception is the main label, and the rest are alternative labels.
>>> However, we think that we could be more specific regarding the type of
>>> variants pointing to one and the same concept in the ontology, and that
>>> this should be accounted for in our ontology-lexicon model. Sometimes, the
>>> difference is a consequence of the contextual (pragmatic) usage, and we
>>> have to decide whether to represent this in our model.
>>>
>>> Based on previous classifications of terminology variation, we have
>>> identified three main groups of term variants that include the following
>>> types (see also [1] and [2]):
>>>
>>> *Group 1*. Synonyms or terminological units that totally correspond to
>>> the same concept:
>>>
>>>    - graphical and orthographical variants (*localization* and
>>>    *localisation*);
>>>    - inflectional variants (*cat* and *cats*);
>>>    - morphosyntactic variants (*nitrogen fixation* and *fixation of
>>>    nitrogen*).
>>>
>>> *Group 2*. Partial synonyms or terminological units that highlight
>>> different aspects of the same concept:
>>>
>>>    - stylistic or connotative variants (*man* and *bloke*)
>>>    - diachronic variants (*tuberculosis* and *phthisis*)
>>>    - dialectal variants (*gasoline* vs. *petrol*)
>>>    - pragmatic or register variants (*headache* and *cephalalgia*; *swine
>>>    flu* and *pig flu* and *H1N1* and *Mexican pandemic flu*)
>>>    - explicative variants (*immigration law* and *law for regulating
>>>    and controlling immigration*)
>>>
>>> So, we would be very grateful for your suggestions and comments on this
>>> proposal.
>>>
>>> Best regards,
>>>
>>> Lupe and Elena
>>>
>>> [1] Montiel-Ponsoda, E., Aguado de Cea, G., McCrae, J. (2011).
>>> Representing term variation in *lemon*. In Proceedings of WS 2,
>>> *Ontology and lexicon: new insights*, *TIA 2011 - 9th International
>>> Conference on Terminology and Artificial Intelligence*, pp. 47–50.
>>>
>>> [2] Aguado de Cea, G., and Montiel-Ponsoda, E. (2012). Term variants in
>>> ontologies.
>>> In Proceedings of the AESLA (*Asociación Española de
>>> Lingüística Aplicada*) Conference.
>>>
>>>
>>> 2012/7/18 Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
>>>
>>>> Dear all,
>>>>
>>>> and just to clarify what the description of the requirements should
>>>> include:
>>>>
>>>> Under "Description", there should be a general description of the
>>>> requirement, its implications, etc. It is important that we think here in
>>>> terms of requirements on the general model: not on particular data
>>>> categories, properties, etc., but on requirements at the meta-model level.
>>>>
>>>> Under "Relevant Use Cases": here we should just list the IDs of the use
>>>> cases touched by this requirement. Maybe this should be called "Affected
>>>> Use Cases"?
>>>>
>>>> "Relation to Use Case": here we should give detailed examples from the
>>>> use cases where the requirement is important, thus grounding our
>>>> requirements in the use cases we have collected.
>>>>
>>>> If there are any questions on this, just shoot.
>>>>
>>>> Best regards,
>>>>
>>>> Philipp.
>>>>
>>>>
>>>>
>>>> On 18.07.12 14:24, Philipp Cimiano wrote:
>>>>
>>>> Dear ontolex members,
>>>>>
>>>>> during our last meeting on the 6th of July, we discussed my condensed
>>>>> list of requirements on the model and agreed that it looks promising to
>>>>> work on the basis of these from now on.
>>>>>
>>>>> See here:
>>>>> http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements
>>>>>
>>>>> The older list of unstructured requirements is linked from the bottom
>>>>> of the page.
>>>>>
>>>>> We assigned the following people to produce a first draft of each
>>>>> requirement and kick off the discussions on this mailing list. (We really
>>>>> need to start the discussion on the relevant issues!)
>>>>>
>>>>> - Express Meaning with respect to ontology: John/Philipp/Aldo/Guido
>>>>> - Lexical Variation and Paraphrases: Philipp
>>>>> - Relation between lexical entries: Lupe/Elena
>>>>> - Lexical and linguistic properties of lexical entries: John/Philipp
>>>>> - Valence and Ontological Mapping: John/Philipp
>>>>> - High-Order Predicate Mapping: John/Philipp
>>>>> - Lexico-Syntactic Patterns: Elena/Dagmar
>>>>> - Metadata about lexicon: Armando
>>>>> - Modelling lexical resources: John/Aldo
>>>>>
>>>>> The goal would be to have a detailed specification and an ongoing
>>>>> discussion on this mailing list by the end of August.
>>>>>
>>>>> The next teleconference will be on September 6th, 15:00 - 17:00 (CET).
>>>>> It will be two hours as we decided to skip the one in August due to the
>>>>> holiday period.
>>>>>
>>>>> We also decided to have biweekly teleconferences from September on. I
>>>>> think it is important to keep things moving quickly. Otherwise I have the
>>>>> feeling that not much happens in between our monthly teleconferences.
>>>>>
>>>>> I am now on holiday for two weeks and will then start working on the
>>>>> requirements assigned to me.
>>>>> Needless to say, everyone should feel free to start working on their
>>>>> requirements as soon as possible.
>>>>>
>>>>> If you think that an important requirement is missing, please post it
>>>>> on the list and we will discuss it.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Philipp.
>>>>>
>>>>>
>>>>
>>>> --
>>>> Prof. Dr. Philipp Cimiano
>>>> Semantic Computing Group
>>>> Excellence Cluster - Cognitive Interaction Technology (CITEC)
>>>> University of Bielefeld
>>>>
>>>> Phone: +49 521 106 12249
>>>> Fax: +49 521 106 12412
>>>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>>>
>>>> Room H-127
>>>> Morgenbreede 39
>>>> 33615 Bielefeld
>>>>
>>>
>>
>> --
>> Prof. Dr. Philipp Cimiano
>> Semantic Computing Group
>> Excellence Cluster - Cognitive Interaction Technology (CITEC)
>> University of Bielefeld
>>
>> Phone: +49 521 106 12249
>> Fax: +49 521 106 12412
>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>
>> Room H-127
>> Morgenbreede 39
>> 33615 Bielefeld
>>
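
The idea in the thread that variants are entries with the same property values except for some property can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Entry` class, the helper names, and the property names and values (`language`, `pos`, `number`, `region`) are hypothetical and are not part of any model proposed on this list.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A lexical entry: a written form plus named property values (hypothetical)."""
    form: str
    properties: dict = field(default_factory=dict)


def varying_properties(a: Entry, b: Entry) -> set:
    """Return the names of the properties whose values differ between two entries.

    A property stated on only one of the entries counts as differing.
    """
    names = a.properties.keys() | b.properties.keys()
    return {n for n in names if a.properties.get(n) != b.properties.get(n)}


def is_variant_in(a: Entry, b: Entry, prop: str) -> bool:
    """True if the two entries agree on every property except `prop`."""
    return varying_properties(a, b) == {prop}


# Example entries; the property values are illustrative assumptions.
cat = Entry("cat", {"language": "en", "pos": "noun", "number": "singular"})
cats = Entry("cats", {"language": "en", "pos": "noun", "number": "plural"})
localize = Entry("localize", {"language": "en", "pos": "verb", "region": "US"})
localise = Entry("localise", {"language": "en", "pos": "verb", "region": "GB"})

print(varying_properties(cat, cats))                # {'number'}
print(is_variant_in(localize, localise, "region"))  # True
```

Under this reading, pluralization shows up as variation in `number` only, and a geo-orthographic pair such as "localize" vs. "localise" as variation in `region` only; the written form itself is kept outside the compared properties.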
Received on Friday, 28 September 2012 13:36:16 UTC