- From: Christian Chiarcos <christian.chiarcos@gmail.com>
- Date: Tue, 4 Jul 2023 13:25:53 +0200
- To: Jorge Gracia del Río <jogracia@unizar.es>
- Cc: Fahad Khan <fahad.khan@ilc.cnr.it>, public-ontolex <public-ontolex@w3.org>
- Message-ID: <CAC1YGdhaP2RJsWSUFBQ5srLXpOxRt=OdvPrcN-iNj-Wz0ggxCg@mail.gmail.com>
Dear Jorge,

On Tue, 4 July 2023 at 12:19, Jorge Gracia del Río <jogracia@unizar.es> wrote:

> From my side, I fully support Ilan's view on this. Trying to adapt the
> model to the restrictions and needs of every single dictionary is not
> feasible.

Of course not. This example is a particularly nasty one, indeed, because there is no single structural unit (lexicog:Component) that we could directly identify with a sub-entry for a particular POS (instead, there are multiple such structural units). But multiple or underspecified POSes are a frequently recurring issue. In RDF semantics, underspecified POSes are actually not a problem because of the open world assumption, and Lexicog can handle multiple POSes. The nasty part here is that when using Lexicog, we simply cannot automatically create ontolex:LexicalEntry instances for the POS-specific entries, because it is hard for a converter to tell which part of the description applies to one, the other, or both.

> Of course I am in favour of adaptations to the model and to work on its
> evolution, but we need to be cautious and not to re-interpret the model to
> adapt it to any possible legacy dictionary (e.g., by moving POS from the
> Lexical Entry to the Form), due to the risk of hampering interoperability
> across a plethora of existing and future lemon-based lexical data.

Indeed. But note that there is nothing in OntoLex that ties lexinfo:partOfSpeech exclusively to lexical entries. https://www.w3.org/2016/05/ontolex/#linguistic-description merely states for lexinfo:partOfSpeech and other subproperties of lexinfo:morphosyntacticProperty that "By default, it should be assumed that a property of a lexical entry also holds for all its forms." I take this to mean that lexinfo:partOfSpeech is optional for forms, but not forbidden. In LexInfo, neither morphosyntacticProperty nor its subproperty partOfSpeech is given a domain.
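To make that reading concrete, here is a minimal sketch in Python of how a consumer might resolve the part of speech of a form: use the value stated on the form if there is one, otherwise fall back to the value on its lexical entry. This is only one possible interpretation of the "by default" sentence quoted above, not something the specification prescribes. The triples are plain tuples rather than a real RDF library, the `ex:` identifiers are hypothetical, and the helper names `objects` and `effective_pos` are made up for illustration.

```python
# Sketch only: triples are plain (subject, predicate, object) tuples,
# and all ex: identifiers are hypothetical illustration data.
POS = "lexinfo:partOfSpeech"
FORM_PROPS = {
    "ontolex:lexicalForm",
    "ontolex:canonicalForm",
    "ontolex:otherForm",
}


def objects(triples, subject, predicate):
    """Return all objects of (subject, predicate, ?o)."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def effective_pos(triples, form):
    """POS of a form: its own value if stated, else inherited from its entries.

    Assumption: a POS stated on the form takes precedence over the entry's.
    """
    own = objects(triples, form, POS)
    if own:
        return own
    # Find every lexical entry that points at this form.
    entries = {s for s, p, o in triples if o == form and p in FORM_PROPS}
    inherited = set()
    for entry in entries:
        inherited |= objects(triples, entry, POS)
    return inherited


triples = {
    ("ex:entry", "ontolex:canonicalForm", "ex:form1"),
    ("ex:entry", "ontolex:otherForm", "ex:form2"),
    ("ex:entry", POS, "lexinfo:adjective"),
    ("ex:form2", POS, "lexinfo:noun"),  # POS stated directly on the form
}

print(effective_pos(triples, "ex:form1"))
print(effective_pos(triples, "ex:form2"))
```

With this data, `ex:form1` has no POS of its own and takes the entry's value, while `ex:form2` keeps the value stated on the form itself.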
> In your particular case, I'd go for the lexicog solution with one lexical
> entry per POS, and duplicating lexical senses if needed (which actually
> won't be duplicated since they will be connecting different things). Or,
> curating the source data to avoid existing imprecisions.

I think manual curation is not an option, because we'd try to have a faithful representation first, before reinterpreting it (which needs a fair command of French). And that would be needed for basically every entry, so we are talking about weeks to months of work (weeks in this case, since it's small).

Without that manual curation, the lexicog solution with one lexical entry per POS means having one lexicog:Entry without any lexical entry (which is OK from a modelling perspective: just incomplete data, as incomplete as the original data). Right now, that would be my preference, too.

In this particular case, the nominal sense has no definition at all. So, "Qui témoigne d’un manque d’intelligence, de jugement, de savoir-faire, qui est niais, bête, idiot", as given with the adjective, is expected to apply here as well. (I think "idiot" must be a noun, so that's actually a nominal definition, isn't it?) So, that part needs to be duplicated, at least, and likewise the list of synonyms given in bold (so, from one link between innocent-innocente and epais-epaisse, we generate 18 = 2 x 9 lexinfo:synonym links between the different senses; we have some polynomial growth here). But I don't see a better alternative either.

Thank you,
Christian
Received on Tuesday, 4 July 2023 11:26:11 UTC