Re: OntoLex FrAC module telco tomorrow, Nov 26, 12:00 CET

From: Fahad Khan <anasfkhan81@gmail.com>
Date: Wed, 25 Nov 2020 19:28:11 +0100
Message-ID: <CAK+N+9hcXAPTequQqeMtA2d-jev8g60YQkLkPXmBQNz3T8OyaQ@mail.gmail.com>
To: Max Ionov <max.ionov@gmail.com>
Cc: public-ontolex <public-ontolex@w3.org>, WG1 mailing list for the Nexus project <nexus-wg1@listas.fi.upm.es>
Hi Everyone,
As suggested in the last meeting, I did some background reading and looked
for examples of colligation for the telco. From what I can see, it is
probably a bit too complicated to model in the framework of the OntoLex
FrAC module, and I don't know how commonly colligation information is
featured in actual dictionaries/lexicons (I look forward to hearing from
someone with more lexicographic expertise on this topic).
Below are my notes, including some relevant points on the definition of
collocations too (the source is given after each quotation/summary):

*Collocation + Colligation*: likelihood of co-occurrence of (two or more)
lexical items and grammatical categories, respectively.

*Collocation*: 'refers to the syntagmatic attraction between two (or more)
lexical items: morphemes, words, phrases or utterances. Most often,
however, collocation analyses have been conducted on the word-level'...'The
strength of this kind of attraction between words can be measured through
the statistical analysis of corpus data'...'Thus we can establish the most
significant collocates of any given word in the language variety that the
data represents'

*Collocation Strength* between a node *n* and its collocate *c* is based on
four observed absolute frequencies in the data: i) # of tokens in the corpus,
ii) # of *n* tokens, iii) # of *c* tokens, iv) # of tokens where *n* and *c*
occur within a collocation window (i.e., within a certain # of words of each
other). There are different kinds of definitions of collocation: *purely
statistical/frequency based*, without taking meaning into consideration;
the phraseological tradition *defines collocations as being lexicalised*
(empirical v lexical collocations). Collocations can also be defined
as *multiword expressions* in computational linguistics.
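
To make the four frequencies concrete, here is a minimal sketch of one
common association measure, pointwise mutual information (PMI). The source
does not say which measure is used, so the choice of PMI, the function name
`pmi`, the `span` parameter and the counts below are all my own assumptions
for illustration.

```python
import math


def pmi(corpus_size, node_freq, collocate_freq, cooc_freq, span=1):
    """Pointwise mutual information (base 2) for a node/collocate pair.

    corpus_size    -- i)   number of tokens in the corpus
    node_freq      -- ii)  number of tokens of the node n
    collocate_freq -- iii) number of tokens of the collocate c
    cooc_freq      -- iv)  number of times n and c co-occur in the window
    span           -- assumed number of window positions; it scales the
                      expected frequency (span=1 ignores the window size)
    """
    if min(corpus_size, node_freq, collocate_freq, cooc_freq, span) <= 0:
        raise ValueError("all frequencies and the span must be positive")
    # Co-occurrence frequency expected if n and c were independent.
    expected = node_freq * collocate_freq * span / corpus_size
    return math.log2(cooc_freq / expected)


# Invented counts, purely illustrative: expected = 32 * 32 / 1024 = 1,
# observed = 8, so the pair co-occurs 8 times more often than chance.
score = pmi(corpus_size=1024, node_freq=32, collocate_freq=32, cooc_freq=8)
print(score)  # 3.0
```

A positive score means the pair co-occurs more often than expected by
chance, a negative one less often. Other measures (log-likelihood, t-score,
etc.) can be computed from the same four counts.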

*Colligation*. Term is more polysemous than even collocation. Can *'describe
syntagmatic attraction between grammatical categories'*. '[M]ost common use
of the term colligation today...is to designate *the attraction between a
lexical item and a grammatical category*'.
- BUDGE is attracted to the construction [modal auxiliary verb + *budge*],
e.g., will/won't budge
- The English phrase *naked eye* is often preceded by a preposition and a
definite article, e.g., *to the naked eye, for the naked eye*

Source: Collocation and colligation, Tomas Lehecka
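
As a rough illustration of how the *naked eye* observation could be checked
on a POS-tagged corpus, here is a minimal sketch. The function name
`colligation_rate`, the toy sentence and the use of Universal POS tags
(ADP for prepositions, DET for determiners) are my own assumptions, not
something taken from the source. Note that DET is broader than "definite
article", so this over-approximates the pattern described above.

```python
def colligation_rate(tagged, phrase, left_pattern):
    """Share of occurrences of `phrase` that are immediately preceded by
    the POS-tag sequence `left_pattern`.

    tagged       -- list of (word, pos_tag) pairs
    phrase       -- tuple of lower-cased words, e.g. ("naked", "eye")
    left_pattern -- tuple of POS tags expected directly before the phrase
    """
    words = [w.lower() for w, _ in tagged]
    tags = [t for _, t in tagged]
    n, k = len(phrase), len(left_pattern)
    hits = total = 0
    for i in range(len(words) - n + 1):
        if words[i:i + n] == list(phrase):
            total += 1
            # Compare the k tags directly to the left of the phrase.
            if i >= k and tags[i - k:i] == list(left_pattern):
                hits += 1
    return hits / total if total else 0.0


# Invented toy data, hand-tagged for illustration only.
toy = [
    ("visible", "ADJ"), ("to", "ADP"), ("the", "DET"),
    ("naked", "ADJ"), ("eye", "NOUN"), (",", "PUNCT"),
    ("a", "DET"), ("naked", "ADJ"), ("eye", "NOUN"),
    ("sees", "VERB"), ("little", "ADJ"),
]
print(colligation_rate(toy, ("naked", "eye"), ("ADP", "DET")))  # 0.5
```

On a real corpus, the resulting proportion would be compared against the
same proportion for comparable phrases to judge whether the attraction is
stronger than normal.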

Hoey: ‘Every word is primed to occur in (or avoid) certain grammatical
functions; these are its colligations.’


'For example, verbs of perception, such as *hear*, *notice*, *see*, *watch*,
tend to be followed by an object and an -ing clause: *I heard you coming in
late last night*. *I saw him playing live when I was in Belgrade*.'
*EXAMPLE* (given in source): Colligational differences between *select* and
'Every word is primed to occur in (or avoid) certain grammatical positions,
and to occur in (or avoid) certain grammatical functions; these are its
colligations.'

*EXAMPLE*: The word *consequence*
'We find that [the word *consequence*] has a very low likelihood of
appearing as the object of a clause (i.e. following an action or possession
verb) unlike other abstract nouns such as *preference* or *use*. We do not
(perhaps surprisingly) encounter many examples of sentences like the
following: *Unfortunately it also had this tragic consequence that the baby
became grossly bloated.* whereas sentences like *The homeless are asked if
they have a preference.* and *The minister called on schools to make more
use of the colleges’ vocational experience* … are very common.
'*Consequence* occurs as (part of) the object of a clause only four per
cent of the time, whereas *preference* and *use* both occur in this
grammatical position over a third of the time. On the other hand,
consequence occurs as (part of) the complement (i.e. following the verb *be*
or a closely related verb) much more often than is normal for abstract
nouns. In fact it occurs in this grammatical position almost a quarter of
the time, whereas *preference* and *use* occur with this function in less
than one in 14 clauses. So a sentence on the pattern of *It is the natural
consequence of a deep recession* … is extremely common, but sentences such
as *The main one was his preference for force.* or *This is an improper use
of executive power.* are very much the exception rather than the rule. The
aversion of *consequence* for occurring as an object is an example of
negative colligation; its liking for complement position is an example of
positive colligation. Colligations are particularly important to learners
of the language because they explain why it is that a learner may feel he
or she knows a word and yet produce a sentence that is grammatical but ‘not



On Wed, 25 Nov 2020 at 12:22, Max Ionov <max.ionov@gmail.com> wrote:

> Dear all,
> this is a gentle reminder for the OntoLex FrAC/Nexus T1.1 lexicon telco
> this Thursday, Nov 26, at 12:00 CET.
> The primary goal of the meeting is to continue elaborating on embeddings,
> collocations and similarity.
> The Google Meet link for this telco is
> https://meet.google.com/rsx-mbkr-oxi
> As usual, the agenda/minutes document is:
> https://docs.google.com/document/d/1N2w_r6WLhFGESSMSUkG5FSROorXscDMQuB77qg9uDIA/edit#
> .
> Best regards,
> Max
