Re: One lexical entry with multiple POSes from Fahad Khan on 2023-07-03 (public-ontolex@w3.org from July 2023)

From: Fahad Khan <fahad.khan@ilc.cnr.it>
Date: Mon, 3 Jul 2023 18:51:07 +0200
To: Christian Chiarcos <christian.chiarcos@gmail.com>
Cc: public-ontolex <public-ontolex@w3.org>
Message-ID: <CAK+N+9irExE9ED8wOnjqpegjP7p2WW6n4BpRK_T9VNa49q7QOw@mail.gmail.com>

Dear Christian,

The best solution would obviously be to get rid of the one POS per lexical
entry constraint (and I know of no convincing reason as to why we should
keep to this any longer). But since there is some reluctance to update the
guidelines except to correct minor typos, this is probably not going to
happen (and also if it did then that would remove one of the big
motivations for developing lexicog in the first place). However IMO there
is an ambiguity as to whether lexical entries are supposed to have *exactly* one
POS or *at most* one POS. This is especially the case since as we discussed
in a previous OntoLex call, affixes are also classed as lexical entries in
the model and these usually aren't associated with POSs. So a third
potential solution to your modelling dilemma would be indeed to assume that
a lexical entry can have zero or one POS values, and not to associate any
POSs with your lexical entry using lexinfo:partOfSpeech, but rather to use
some other property to specify that the categories noun and adjective are
relevant to your lexical entry (this solution has the benefit that you can
continue using lexical entry with its associated axioms).
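
To make this third option a bit more concrete, here is a minimal sketch in plain Python (triples as tuples, no RDF library). It is only an illustration under assumptions: ex:relevantCategory is a hypothetical placeholder for the "some other property" above and is not part of OntoLex or LexInfo, the example.org IRIs are made up, and the LexInfo namespace shown is assumed to be the 2.0 one. The two checks contrast the "exactly one" and "at most one" readings of the constraint.

```python
# Namespaces (the LexInfo version is an assumption; EX is purely illustrative).
ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
LEXINFO = "http://www.lexinfo.net/ontology/2.0/lexinfo#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"

# Hypothetical property standing in for "some other property".
RELEVANT_CATEGORY = EX + "relevantCategory"
PART_OF_SPEECH = LEXINFO + "partOfSpeech"

entry = EX + "innocent"

# One lexical entry, no lexinfo:partOfSpeech triple at all;
# noun and adjective are attached via the hypothetical property instead.
triples = {
    (entry, RDF_TYPE, ONTOLEX + "LexicalEntry"),
    (entry, RELEVANT_CATEGORY, LEXINFO + "noun"),
    (entry, RELEVANT_CATEGORY, LEXINFO + "adjective"),
}


def objects(graph, subject, predicate):
    """Return the set of objects for a given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def has_at_most_one_pos(graph, subject):
    """'At most one POS' reading: zero or one partOfSpeech values."""
    return len(objects(graph, subject, PART_OF_SPEECH)) <= 1


def has_exactly_one_pos(graph, subject):
    """'Exactly one POS' reading: precisely one partOfSpeech value."""
    return len(objects(graph, subject, PART_OF_SPEECH)) == 1


at_most_one = has_at_most_one_pos(triples, entry)
exactly_one = has_exactly_one_pos(triples, entry)
categories = objects(triples, entry, RELEVANT_CATEGORY)
```

Under the "at most one" reading the entry passes; under the "exactly one" reading it does not, which is precisely the ambiguity in question.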

Cheers
Fahad

PS. Given the capabilities of ChatGPT I wouldn't be so sure the task you
refer to couldn't be automated.

On Mon, 3 Jul 2023 at 16:52, Christian Chiarcos <
christian.chiarcos@gmail.com> wrote:

> Dear all,
>
> TL;DR: How to model the multi-POS entry
> https://www.dhfq.org/article/innocent-innocente with OntoLex?
>
> Long: OntoLex postulates a constraint for having one part-of-speech per
> lexical entry. For ontology lexicalization, this makes a lot of sense, but
> in the past, there also were some controversies because it partially
> clashes with the structure of real-world dictionaries (i.e., dictionary
> entries with more than one part of speech).
>
> The lexicog module introduced a possible solution, i.e., that multiple
> lexical entries can be grouped together into a single lexicog:Entry. This
> still has some downsides (e.g., sense definitions applicable to multiple
> parts of speech must be duplicated, because there must not be more than one
> lexical entry per sense), but works pretty well, as long as the original
> entry is actually structured in accordance with these parts of speech.
>
> In a course with students, we are currently exploring the applicability of
> OntoLex to a number of reference dictionaries from different Romance
> languages, and I would like to share one example that I consider critical,
> because it does not provide the neat partitioning of sub-entries into parts
> of speech, but conflates/switches between them several times. The entry
> "innocent-innocente" in the *Dictionnaire historique du français
> québécois* [1][2] describes both an adjective and a noun, with some
> portions applying to both POSes, others applying to only one or the other
> POS, and while a human may be trained to disentangle them, I see no way
> this could be automated.
>
> I think OntoLex, if understood as the reference vocabulary for
> machine-readable lexical data in RDF, should be capable of representing
> such data without requiring human re-interpretation. Given the vocabularies
> we currently have, what would be your preferences? My current approach
> would be to create a lexicog:Entry, to link it with forms and senses, but
> to *not create an ontolex:LexicalEntry* at all. This is in line with the
> open world assumption, and if this is a practice we can agree upon, we
> should probably add that as a clarification note to the Lexicog definition.
> Note that this entails that lexicog:Entry can also organize ontolex:Forms
> (this is not prohibited right now in lexicog, but also not mentioned as a
> possibility).
>
> (I'm personally more in favour of lifting the one-POS-per-entry
> constraint, because it leads to a more efficient modelling, but that hasn't
> found much support so far, and using lexicog:Entry instead of
> ontolex:LexicalEntry wouldn't contradict any existing vocabulary.)
>
> (NB: TEI provides something like a solution here, by the markable
> <entryFree> [3], but IMHO this would not be a good idea in the context of
> OntoLex, because the definition "a single unstructured entry" basically
> states that we leave the realm of well-defined semantics ... but semantics
> are actually very clear here, just not expressible with OntoLex core
> vocabulary.)
>
> Any ideas?
>
> All the best,
> Christian
>
> [1] https://www.dhfq.org/article/innocent-innocente (French)
> [2] English:
> https://www-dhfq-org.translate.goog/article/innocent-innocente?_x_tr_sl=auto&_x_tr_tl=de&_x_tr_hl=de&_x_tr_pto=wapp
> [3] https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-entryFree.html
>
Received on Monday, 3 July 2023 16:51:25 UTC