Re: multilingual lexical entries?

Dear all,

just for clarification, the following is what I would like to do:

:sze_le a ontolex:LexicalEntry;
ontolex:canonicalForm [
ontolex:writtenRep "𒊺"; # or: ontolex:writtenRep "𒊺"@sux-Xsux,
# ontolex:writtenRep "𒊺"@akk-Xsux
ontolex:writtenRep "sze"; # transliteration
ontolex:writtenRep "sze"@sux-Latn; # transcription
ontolex:writtenRep "uţţatu"@akk-Latn # transcription
]; ontolex:sense [ rdfs:comment "unit of weight, approx 0.04 g" ].

The alternative with lexicog:Entry (and without duplicating LexicalEntries)
would be

:sze_le a lexicog:Entry;
lexicog:describes [ a ontolex:Form;
ontolex:writtenRep "𒊺";
ontolex:writtenRep "sze"; # transliteration
ontolex:writtenRep "sze"@sux; # transcription
ontolex:writtenRep "uţţatu"@akk # transcription ... IMHO different language
# tags should be unproblematic for forms
]; lexicog:describes [ a ontolex:LexicalSense;
rdfs:comment "unit of weight, approx 0.04 g" ].

The latter way of modelling should be in line with the documentation, but
it makes large parts of OntoLex-Lemon redundant and others (e.g.,
canonicalForm) inapplicable, so I would prefer to avoid it.
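
For comparison, the variant that does duplicate lexical entries (one per
language under a shared lexicog:Entry) might look roughly like the sketch
below. The example namespace and the names :sze_entry, :sze_sux and :sze_akk
are hypothetical, and attaching the sense to both entries is only an
assumption for illustration: deciding which information belongs to which
entry is exactly what the data does not support.

```turtle
@prefix :        <http://example.org/glossary#> .  # hypothetical namespace
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix lexicog: <http://www.w3.org/ns/lemon/lexicog#> .
@prefix lime:    <http://www.w3.org/ns/lemon/lime#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .

# One dictionary entry that points to two language-specific lexical entries.
:sze_entry a lexicog:Entry ;
    lexicog:describes :sze_sux, :sze_akk .

# Sumerian reading of the sign.
:sze_sux a ontolex:LexicalEntry ;
    lime:language "sux" ;
    ontolex:canonicalForm [
        ontolex:writtenRep "𒊺"@sux-Xsux ;
        ontolex:writtenRep "sze"@sux-Latn
    ] ;
    ontolex:sense [ rdfs:comment "unit of weight, approx 0.04 g" ] .

# Akkadian reading of the same sign; the sense is repeated for this entry.
:sze_akk a ontolex:LexicalEntry ;
    lime:language "akk" ;
    ontolex:canonicalForm [
        ontolex:writtenRep "𒊺"@akk-Xsux ;
        ontolex:writtenRep "uţţatu"@akk-Latn
    ] ;
    ontolex:sense [ rdfs:comment "unit of weight, approx 0.04 g" ] .
```

This keeps canonicalForm and lime:language usable, at the cost of repeating
the cuneiform writtenRep and the sense for every candidate language.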

Best,
Christian

On Tue, 7 Dec 2021 at 16:32, Christian Chiarcos <
christian.chiarcos@gmail.com> wrote:

> Dear all,
>
> for different use cases, I came across the need to provide one lexical
> entry for multiple languages.
>
> In one group of cases (esp. etymological dictionaries), this can be
> circumvented by using lexicog:Entry instead and then pointing to
> language-specific lexical entries. (This is very inelegant,
> unnecessarily verbose and clearly a departure from/obfuscation of the
> original structure of the lexical resource, but technically, it is a
> possibility.)
>
> However, in another case (dictionaries/glossaries for cuneiform
> languages), we have the problem that we cannot always tell what language a
> text (and thus, a word) is in. This is because of the multilingual
> situation of Sumerian and Akkadian during the 3rd millennium BC, because
> of the use of ideographic signs, because scribes often did not write out
> all morphemes, but just the stem of a word, and because of the habit of
> Akkadian and Hittite scribes to write Sumerian (or Akkadian) words
> instead of words in their native tongue, as these were more established
> in the writing tradition. Although there are phonological or morphological
> complements that can reveal the language, these are not systematically
> used, so that we have uncertainties about the language of individual words
> or even entire texts. However, if these texts form the basis for a glossary
> or dictionary, these uncertainties percolate to the glossary, especially if
> it is corpus-based. The electronic Pennsylvania Sumerian Dictionary (ePSD)
> thus does not distinguish Sumerian and Akkadian forms; it groups
> everything under the same headword and provides both Sumerian and Akkadian
> readings of the same sign. (The selection of texts is such that a Sumerian
> reading is more likely, but not always certain.) In some cases in this dictionary,
> it is even marked that there are doubts that a word is Sumerian in the
> first place (http://oracc.museum.upenn.edu/epsd2/cbd/sux/o0023151.html).
>
> Such data do not allow us to create distinct lexical entries for both
> (or, in the case of Hittite texts, all three) languages that would just go
> under the same lexicog:Entry, because we cannot decide which information
> (other than the possible Sumerian and Akkadian interpretations of the same
> cuneiform writtenRep) belongs to which lexical entry.
>
> For this reason, we are currently considering language-agnostic
> lexical entries for a future CDLI glossary (https://cdli.ucla.edu/),
> where language information is provided only at the form (or even only
> within the writtenRep), but not at the lexical entry. Note that there is no
> constraint in the OntoLex core model that requires a single language per
> lexical entry.
>
> What OntoLex says about language is not in the core model, but in Lime:
> "note that all entries in the same lexicon should be in the same language
> and that the language of the lexicon and entry should be consistent with
> the language tags used on all forms". This is a comment (in parentheses, in
> accompanying text, and, if assumed to be relevant for the definition of
> ontolex:LexicalEntry, in the wrong place), formulated as a recommendation
> and not part of any definition.
>
> If we consider this statement to be nevertheless binding, the CDLI
> solution would be to create a dictionary with senses and lexicog:Entry
> instances, but without any ontolex:LexicalEntry. I would prefer not to.
> (I would still prefer to avoid multilingual lexical entries in cases in
> which language-specific information is provided, and thus to keep the
> recommendation in place as is, but this is not the case here.)
>
> Best,
> Christian
>

Received on Wednesday, 8 December 2021 13:53:29 UTC