- From: Christian Chiarcos <christian.chiarcos@web.de>
- Date: Wed, 24 Jun 2020 19:32:09 +0200
- To: public-ontolex <public-ontolex@w3.org>, "Thierry Declerck" <declerck@dfki.de>
- Cc: "Bajcetic, Lenka" <Lenka.Bajcetic@oeaw.ac.at>, "chiarcos@informatik.uni-frankfurt.de" <chiarcos@informatik.uni-frankfurt.de>
Hi Thierry, dear Lenka, dear all,

actually, the semantics of function words is an extremely complex area -- at least the moment you go beyond Germanic and Romance -- and there is a lot of related work in discourse studies, semantics, psycholinguistics, and (pre-neural) computational linguistics that has accumulated over decades. Computational approaches to function word semantics include a hierarchy of preposition supersenses (see https://github.com/nert-nlp/streusle -- prepositions only), efforts to align (and decompose) various theories of discourse relations (http://www.textlink.ii.metu.edu.tr/ -- this is for adverbs and conjunctions), etc. The trouble is that the complexity of these approaches exceeds even that of resources such as FrameNet (because they build on it), and that the cross-linguistic dimensions have barely been tackled so far.

As for the (formal) semantics of determiners and pronouns, this is a *very* complicated matter, as a unified theory of reference and deixis has not even emerged at a theoretical level (with promising approaches in Levelt 1989, Ariel 1991, Gundel et al. 1993, Grosz et al. 1995 -- and a drop of interest from the computational/semantic side since the mid-2000s). In the 1990s, von Heusinger developed an appealing approach by formalising definiteness in terms of a selection function that operates over an underlying salience metric, which basically provides a contextually determined ranking of possible antecedent candidates. The point here is, however, that this ranking is contextually determined, so it is very hard to formalize it in a context-independent fashion (we would need to agree on *one* metric and a threshold, or another selection criterion).

The semantics of determiners is, however, not limited to definiteness: determiners can also be sensitive to distance (e.g., in Macedonian, if I'm not mistaken) or to specificity (rather than definiteness, this is what Farsi determiners mark).
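As a rough illustration of the selection-function idea mentioned above, here is a minimal Python sketch. It is not von Heusinger's actual formalisation: the recency-based metric, the threshold, and all names (`recency_salience`, `select_referent`, `dog_1`, ...) are hypothetical and chosen only to show why the outcome depends on the context.

```python
from typing import Callable, Optional, Sequence

# Assumption: a "context" is simply the list of discourse referents
# mentioned so far, in order of mention (most recent last).
Context = Sequence[str]
SalienceMetric = Callable[[str, Context], float]


def recency_salience(candidate: str, context: Context) -> float:
    """Toy salience metric (assumption): later last mention = more salient."""
    positions = [i for i, referent in enumerate(context) if referent == candidate]
    if not positions:
        return 0.0  # never mentioned, hence not salient at all
    return (positions[-1] + 1) / len(context)


def select_referent(
    candidates: Sequence[str],
    context: Context,
    salience: SalienceMetric = recency_salience,
    threshold: float = 0.0,
) -> Optional[str]:
    """Return the most salient candidate, or None if none exceeds the threshold."""
    ranked = sorted(candidates, key=lambda c: salience(c, context), reverse=True)
    if not ranked or salience(ranked[0], context) <= threshold:
        return None
    return ranked[0]


# "the dog" with two dogs in the discourse: same candidates, different contexts.
dogs = ["dog_1", "dog_2"]
first = select_referent(dogs, ["dog_1", "cat_1", "dog_2"])
second = select_referent(dogs, ["dog_2", "cat_1", "dog_1"])
print(first, second)
```

The same candidate set yields a different referent in each context, and swapping in another metric or threshold changes the result again, which is the reason a context-independent lexical entry for a definite determiner is hard to state.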
And of course you'll find all kinds of grammaticalizations where the underlying semantic or discourse function is barely recognizable (think of Slavic full and reduced adjective inflections, e.g., horosho vs. horoshee in Russian; the full forms originate from a clitic determiner as still preserved in Lithuanian, but this is not their function anymore). And then there are languages that just mark different things, e.g., topic and focus markers in African or Asian languages, classifiers, reference tracking in languages without "proper" pronouns.

In essence, the enterprise of formalizing the meaning (function words are semantically bleached, so this is not actually lexical semantics) would amount to the development of a "universal" computational theory of grammar. I would very much welcome an open discussion and would participate in it -- in parts at least -- and I am working on this (this is what brought me into linked data, because linguistic research / language technology at that level of complexity requires a level of language resource interoperability that no other technology seems capable of establishing). In fact, the OLiA Discourse Extensions (http://www.acoli.informatik.uni-frankfurt.de/resources/discourse/) were a proposal to develop something in this direction, but they remain too shallow and are focused solely on existing annotation schemes rather than on formalizing the underlying functions.

On the practical side, I am, however, very much convinced that this goes *WAAY* beyond the scope of OntoLex (as the phenomena involved exceed the scope of lexical semantics and include grammar and context). I think the most likely place to discuss this would be in the context of the SemAF specifications, but this is ISO and not an open discussion. The ACL SIGs SIGDial and SIGSEM would be less restricted. There are workshops and shared tasks, SemEval, StarSem, etc.
These could be a starting point for an open discussion, maybe with a position paper [actually, I would like to work on something like that ...]. The trouble is that there are some fundamental issues that still need to be solved (and are being worked on by several communities) before the modelling aspect can even be thought about.

Best,
Christian

On .06.2020, at 14:51, Thierry Declerck <declerck@dfki.de> wrote:

> Dear All,
>
> Lenka (in cc, in case she is not yet included in the mailing list) and
> I recently had a discussion on the topic of entries of the so-called
> "closed classes" type, meaning determiners, prepositions, auxiliary
> verbs, pronouns, ...
>
> We were wondering if there have already been discussions on how to
> encode those in OntoLex-Lemon. The question is mainly about the
> "semantics by reference". It *seems* straightforward to encode the
> semantics of nouns this way in OntoLex, but what about determiners and
> pronouns (and the like)? Are there some pointers on this topic, and do
> you think it would be worth opening a discussion on this, if there are
> not already solutions (that I am not aware of; in that case, sorry)?
>
> Thanks
>
> Thierry
Received on Wednesday, 24 June 2020 17:32:23 UTC