- From: Jorge Gracia del Río <jogracia@unizar.es>
- Date: Wed, 5 Jan 2022 10:38:39 +0100
- To: Christian Chiarcos <christian.chiarcos@gmail.com>
- Cc: public-ontolex <public-ontolex@w3.org>
- Message-ID: <CAMe8T+s8cFH2Gpi8CeStOofTKY4nwmhTpMsPr+g3Yj7Coqsn8w@mail.gmail.com>
Dear Christian,

What about this other approach? That is, creating a "language-agnostic" lexicog:Entry per known record in the dictionary, and then instantiating lexical entries to account for the language-specific information:

:sze_concept a ontolex:LexicalConcept ;
    skos:definition "unit of weight, approx 0.04 g" .

:sze_sux a ontolex:LexicalEntry ;
    ontolex:canonicalForm [
        ontolex:writtenRep "𒊺"@sux-Xsux ;
        ontolex:writtenRep "sze"@sux-Latn
    ] .

:sze_akk a ontolex:LexicalEntry ;
    ontolex:canonicalForm [
        ontolex:writtenRep "𒊺"@akk-Xsux ;
        ontolex:writtenRep "uţţatu"@akk-Latn
    ] .

:sze_concept ontolex:isEvokedBy :sze_sux, :sze_akk .

:sze_entry a lexicog:Entry ;
    lexicog:describes :sze_sux, :sze_akk .

Best regards,
Jorge

On Wed, 8 Dec 2021 at 14:54, Christian Chiarcos (<christian.chiarcos@gmail.com>) wrote:

> Dear all,
>
> just for clarification, the following is what I would like to do:
>
> :sze_le a ontolex:LexicalEntry;
>   ontolex:canonicalForm [
>     ontolex:writtenRep "𒊺"; # or: ontolex:writtenRep "𒊺"@sux-Xsux, ontolex:writtenRep "𒊺"@akk-Xsux
>     ontolex:writtenRep "sze"; # transliteration
>     ontolex:writtenRep "sze"@sux-Latn; # transcription
>     ontolex:writtenRep "uţţatu"@akk-Latn # transcription
>   ];
>   ontolex:sense [ rdfs:comment "unit of weight, approx 0.04 g" ].
>
> The alternative with lexicog:Entry (and without duplicating LexicalEntries) would be
>
> :sze_le a lexicog:Entry;
>   lexicog:describes [ a ontolex:Form;
>     ontolex:writtenRep "𒊺";
>     ontolex:writtenRep "sze"; # transliteration
>     ontolex:writtenRep "sze"@sux; # transcription
>     ontolex:writtenRep "uţţatu"@akk # transcription ... IMHO different language tags should be unproblematic for forms
>   ];
>   lexicog:describes [ a ontolex:LexicalSense; rdfs:comment "unit of weight, approx 0.04 g" ].
>
> The latter way of modelling should be in line with the documentation, but it makes large parts of OntoLex-Lemon redundant and others (e.g., canonicalForm) inapplicable; I would prefer to avoid that.
>
> Best,
> Christian
>
> On Tue, 7 Dec 2021 at 16:32, Christian Chiarcos <christian.chiarcos@gmail.com> wrote:
>
>> Dear all,
>>
>> for different use cases, I came across the need to provide one lexical entry for multiple languages.
>>
>> In one group of cases (esp. etymological dictionaries), this can be circumvented by using lexicog:Entry instead and then pointing to language-specific lexical entries. (Though this is very inelegant, unnecessarily verbose and clearly a departure from/obfuscation of the original structure of the lexical resource, technically it is a possibility.)
>>
>> However, in another case (dictionaries/glossaries for cuneiform languages), we have the problem that we cannot always tell what language a text (and thus, a word) is in. This is because of the multilingual situation of Sumerian and Akkadian during the 3rd millennium BC, because of the use of ideographic signs, because scribes often did not write morphemes but just the stem of a word, and because of the habit of Akkadian and Hittite scribes to just write Sumerian (or Akkadian) words instead of their native tongue because these were more established in the writing tradition. Although there are phonological or morphological complements that can reveal the language, these are not systematically used, so that we have uncertainties about the language of individual words or even entire texts. However, if these texts form the basis for a glossary or dictionary, these uncertainties percolate to the glossary, especially if it is corpus-based. The Electronic Penn Sumerian dictionary thus does not distinguish Sumerian and Akkadian forms and just groups everything under the same head word and just provides Sumerian and Akkadian readings of the same sign. (The selection of texts is such that a Sumerian reading is more likely, but it is not always necessary.)
>> In some cases in this dictionary, it is even marked that there are doubts that a word is Sumerian in the first place (http://oracc.museum.upenn.edu/epsd2/cbd/sux/o0023151.html).
>>
>> Such data does not allow us to create distinct lexical entries for both (or, in case of Hittite texts, three) languages that would just go under the same lexicog:Entry, because we cannot decide which information (other than the possible Sumerian and Akkadian interpretations of the same Cuneiform writtenRep) belongs to which lexical entry.
>>
>> For this reason, we are currently considering having language-agnostic lexical entries for a future CDLI glossary (https://cdli.ucla.edu/), where language information is provided only at the form (or even, within the writtenRep), but not at the lexical entry. Note that there is no constraint in the OntoLex core model that requires a single language per lexical entry.
>>
>> What OntoLex says about language is not in the core model, but in Lime: "note that all entries in the same lexicon should be in the same language and that the language of the lexicon and entry should be consistent with the language tags used on all forms". This is a comment (in parentheses, in accompanying text, and if assumed to be relevant for the definition of ontolex:LexicalEntry, in the wrong place), formulated as a recommendation and not part of any definition.
>>
>> If we consider this statement to be nevertheless binding, the CDLI solution would be to create a dictionary with senses and lexicog:Entrys, but without ontolex:Entrys. I would prefer not to. (I would still prefer to avoid multilingual lexical entries in cases in which language-specific information is provided, and thus to keep the recommendation in place, as is, but this is not the case here.)
>>
>> Best,
>> Christian
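
As a rough illustration of the Lime recommendation quoted in the original message (the entry language "should be consistent with the language tags used on all forms"), the following Python sketch collects the primary language subtags of an entry's written representations and reports whether they agree. Comparing only the primary subtag (e.g. "sux" in "sux-Latn") is an assumption about how to read "consistent"; Lime does not prescribe such a check. The function names are hypothetical, and the sample data is copied from the `:sze_le` and `:sze_sux` examples in this thread.

```python
from typing import Iterable, Optional, Set, Tuple

# A written representation as (text, language tag); the tag is None when untagged.
WrittenRep = Tuple[str, Optional[str]]


def primary_language(tag: str) -> str:
    """Return the lowercased primary language subtag, e.g. 'sux' for 'sux-Latn'."""
    return tag.split("-", 1)[0].lower()


def form_languages(written_reps: Iterable[WrittenRep]) -> Set[str]:
    """Collect the primary languages of all tagged written representations.

    Untagged representations are skipped (an assumption of this sketch).
    """
    return {primary_language(tag) for _, tag in written_reps if tag}


def is_monolingual(written_reps: Iterable[WrittenRep]) -> bool:
    """True if the tagged written representations use at most one primary language."""
    return len(form_languages(written_reps)) <= 1


# Language-agnostic entry (:sze_le): language appears only on some writtenReps.
sze_le = [
    ("𒊺", None),
    ("sze", None),
    ("sze", "sux-Latn"),
    ("uţţatu", "akk-Latn"),
]

# Language-specific entry (:sze_sux): all tags share the primary subtag 'sux'.
sze_sux = [
    ("𒊺", "sux-Xsux"),
    ("sze", "sux-Latn"),
]

print(sorted(form_languages(sze_le)))
print(is_monolingual(sze_le))
print(is_monolingual(sze_sux))
```

Under this reading, the language-agnostic entry carries both Sumerian and Akkadian forms and so would not satisfy the recommendation, while each of the language-specific entries would.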
Received on Wednesday, 5 January 2022 09:40:05 UTC