Re: Bootstrapping cognitive NLP

Hi Dave,

> This points to the importance of phrase structure with respect to
> co-occurrence statistics, and opportunities for exploring different ways to
> capture them. I am also intrigued by the possibilities for combining
> statistical approaches with inference over meanings with a view to
> “explaining” the intent of each word in a given utterance. This assumes
> eschewing formal semantics in favour of informal approaches to meaning in
> terms of operations over graphs.
>

There is a lot of fascinating work on formal semantics in natural language,
or there has been, at least. I think I mentioned SDRT on some occasions,
but there are more linguistically oriented approaches such as generative
semantics, and of course, the seminal work of Montague. Regardless of the
formalism, the problem is to get it to scale, as we lack the necessary
background knowledge for the reasoning tasks. So, for the foreseeable
future, it will work nicely on small-scale, closed domains for which the
necessary resources can be constructed in realistic time. For everything
else we need something more robust. Hence the preference for graphs over
formulas (and later, the preference for neural networks over *any* kind of
symbolic representation). The alternative is to take the shallow analyses
that SOTA tools give us and transform them into something that looks like a
formal representation. This has been done with Boxer (
https://aclanthology.org/W15-1841.pdf). But note that despite looking like
logical formulas, the resulting analyses are disambiguated only within the
sentence, not against the wider text context. So, they represent a possible
logical interpretation, but not necessarily the correct one.
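As a minimal sketch of that "shallow analysis -> formal-looking
representation" step (an illustration only, not Boxer's method; Boxer works
from CCG parses and produces Discourse Representation Structures), one could
emit neo-Davidsonian-style predicates from a dependency parse, e.g. with
spaCy:

    # Hypothetical sketch: map a dependency parse to predicate notation.
    # Assumes: pip install spacy && python -m spacy download en_core_web_sm
    import spacy

    nlp = spacy.load("en_core_web_sm")

    def pseudo_logical_form(sentence):
        doc = nlp(sentence)
        preds = []
        for i, tok in enumerate(doc):
            if tok.pos_ == "VERB":
                e = f"e{i}"  # one event variable per verb
                preds.append(f"{tok.lemma_}({e})")
                for child in tok.children:
                    if child.dep_ in ("nsubj", "nsubjpass"):
                        preds.append(f"agent({e}, {child.lemma_})")
                    elif child.dep_ in ("dobj", "obj"):
                        preds.append(f"patient({e}, {child.lemma_})")
        return " & ".join(preds)

    print(pseudo_logical_form("The dog chased the cat."))
    # -> chase(e2) & agent(e2, dog) & patient(e2, cat)

Like Boxer's output, this yields one possible reading; nothing here resolves
senses or scope against the wider context.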


> Do you know how WordNet was created?  Was it a mainly manual effort by
> skilled lexicographers or did they make use of machine learning?
>

Princeton WN: Manually by skilled lexicographers (and psychologists,
because it was meant to be used for studying priming).
Other languages: Varies, but many started with a copy of the Princeton
design and sense inventories.

Best,
Christian


> Best regards,
>
> Dave
>
> On 14 Jul 2021, at 12:06, Christian Chiarcos <christian.chiarcos@gmail.com>
> wrote:
>
> Hi Dave, dear all,
>
> apologies for not following up too closely. I've been having some
> administrative trouble for a few months, and until this is resolved, I'll
> mostly be in lurking mode. (Well, I already am.)
>
> By mainstream NLP researchers, Word Sense Disambiguation is considered a
> hard but largely artificial problem, because there is too little agreement
> on sense definitions across resources and too few sense-annotated
> resources available to apply machine learning in a meaningful way. The
> classical Lesk algorithm seems to be reminiscent of your ideas, and it
> works nicely -- as long as examples and definitions provided in the sense
> inventory are sufficiently representative (which they are not). Anyway, you
> might want to replicate Lesk as proof of principle. It's still considered a
> seminal work: https://dl.acm.org/doi/10.1145/318723.318728. This uses
> word overlap and suffers from data sparsity. A more modern approach
> following Lesk's spirit would probably be to induce embeddings for word
> senses (cf. https://aclanthology.org/P15-1173/, they call word senses
> "lexemes"), and then to compare them with the (aggregate) context
> embeddings. This operates on word embeddings; not sure how to scale this to
> contextualized embeddings such as those produced by BERT etc. -- BERT would be
> great to derive "real" sense embeddings if we had a significant corpus
> annotated for word senses. Well, we don't really have that. (OntoNotes [
> https://catalog.ldc.upenn.edu/LDC2013T19] is the closest thing, but they
> had to simplify WordNet sense distinctions in order to annotate them in a
> reliable way.)
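>
> As a concrete starting point, here is a minimal sketch of the word-overlap
> idea, with NLTK's WordNet standing in as the sense inventory (a
> substitution on my part -- Lesk's original experiments used machine-readable
> dictionaries such as the Oxford Advanced Learner's Dictionary):
>
>     # Simplified Lesk: pick the sense whose gloss (and examples) share
>     # the most words with the surrounding context.
>     # Assumes: pip install nltk; nltk.download('wordnet')
>     from nltk.corpus import wordnet as wn
>
>     def simplified_lesk(word, context_words):
>         context = {w.lower() for w in context_words}
>         best, best_score = None, -1
>         for sense in wn.synsets(word):
>             gloss = set(sense.definition().lower().split())
>             for example in sense.examples():
>                 gloss |= set(example.lower().split())
>             score = len(gloss & context)
>             if score > best_score:
>                 best, best_score = sense, score
>         return best
>
>     sense = simplified_lesk("bank", "I deposited money at the bank".split())
>     print(sense, "->", sense.definition())
>
> The embedding variant mentioned above would keep the same loop but replace
> the overlap count with cosine similarity between sense and context vectors.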
>
> As for cognitive plausibility, Lesk isn't incremental, so its way of
> processing is different from what humans do. But the underlying mechanism
> follows a similar intuition to yours, and it would be possible to make it
> incremental by just looking into the preceding context. However, it doesn't
> have a backtracking mechanism, and that would be needed unless we're happy
> with all text-initial (context-free!) words being misclassified.
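>
> A naive way to add that (purely illustrative, and quadratic in sentence
> length) is to re-disambiguate every word seen so far whenever a new word
> arrives, so early decisions get revised as context accumulates:
>
>     # Incremental Lesk with brute-force backtracking: as each token
>     # arrives, rescore all previous tokens against the grown context.
>     from nltk.corpus import wordnet as wn
>
>     def overlap(sense, context):
>         gloss = set(sense.definition().lower().split())
>         return len(gloss & context)
>
>     def incremental_senses(tokens):
>         senses = {}
>         for i in range(len(tokens)):
>             seen = [t.lower() for t in tokens[:i + 1]]
>             for j, word in enumerate(seen):
>                 candidates = wn.synsets(word)
>                 if candidates:
>                     context = set(seen) - {word}
>                     senses[j] = max(candidates,
>                                     key=lambda s: overlap(s, context))
>         return senses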
>
> As for machine-readable dictionaries, there is very limited data
> available with proper sense definitions. WordNets are one option (
> http://compling.hss.ntu.edu.sg/omw/). Maybe the Apertium data would work
> for you (
> https://github.com/acoli-repo/acoli-dicts/tree/master/stable/apertium/apertium-rdf-2020-03-18).
> It doesn't have sense definitions, but just assumes one sense per
> translation pair.
>
> Best,
> Christian
>
> On Mon, 12 Jul 2021 at 15:50, Dave Raggett <dsr@w3.org> wrote:
>
>> If anyone has time today, I would like to chat about ideas for working
>> on cognitive natural language understanding (NLU).
>>
>> There has been a lot of coverage around BERT and GPT-3 for NLP, with
>> their impressive ability to generate text as a continuation of a passage
>> provided by the user. Unfortunately the hype is overblown, as the lack of
>> real semantics soon becomes apparent when you ask for the sum of two large
>> numbers, or who the US President was in 1650 (before the United States was
>> founded). GPT-3 doesn't know the limits of its own knowledge and fails to
>> say when it doesn't know the answer to a question.
>>
>> I am interested in ways to bootstrap NLU using statistical analysis of
>> text corpora in conjunction with machine readable natural language
>> dictionaries, WordNet’s thesaurus, and manually provided taxonomic
>> knowledge.
>>
>> The starting point is to be able to tag words with their part of speech,
>> e.g. adjective, noun, verb. This enables loose parsing to identify phrase
>> structures, which in turn can be used for co-occurrence statistics. By
>> matching the statistics for a given text passage against dictionary
>> definitions, we can predict word senses in context.
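>>
>> As a rough sketch of this pipeline (with NLTK as one possible toolkit, and
>> a single loose noun-phrase pattern standing in for a real chunk grammar):
>>
>>     # Loose parsing: POS-tag, chunk noun phrases, then count
>>     # co-occurrences between the heads of phrases in a sentence.
>>     # Assumes: nltk.download('punkt'), nltk.download('averaged_perceptron_tagger')
>>     from collections import Counter
>>     from itertools import combinations
>>     import nltk
>>
>>     chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")
>>
>>     def phrase_cooccurrence(sentence):
>>         tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
>>         tree = chunker.parse(tagged)
>>         # take the last noun of each NP chunk as its head (a simplification)
>>         heads = [chunk[-1][0].lower()
>>                  for chunk in tree.subtrees(lambda t: t.label() == "NP")]
>>         return Counter(combinations(sorted(set(heads)), 2))
>>
>>     print(phrase_cooccurrence("The old dog wore a leather collar."))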
>>
>> This can be considerably improved by introducing knowledge about the
>> relationship between words with related meanings from thesauri and
>> taxonomies, e.g. knowing that dogs are animals helps with a dictionary
>> definition for “collar” expressed in terms of animals, as it explains the
>> use of “dog collar” etc.
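>>
>> For instance, that kind of taxonomic fact can be read straight off
>> WordNet's hypernym hierarchy (a small illustration):
>>
>>     from nltk.corpus import wordnet as wn
>>
>>     dog, animal = wn.synset("dog.n.01"), wn.synset("animal.n.01")
>>     # True: 'animal' lies on one of dog's hypernym paths
>>     print(any(animal in path for path in dog.hypernym_paths()))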
>>
>> My hunch is that combining multiple kinds of information in this way can
>> support semantic understanding, provided that it is expressed in terms of
>> word senses and human-like reasoning. It may leave ambiguities where the
>> agent is unsure, e.g. how do you know that dog is a subclass of animal
>> rather than a related peer concept? However, prior knowledge still speeds
>> up learning even when some ambiguity remains.
>>
>> Researchers have found that we learn associations between concepts whose
>> labels directly co-occur, and subsequently between taxonomically related
>> concepts whose labels share patterns of co-occurrence. Children are good at
>> the former, but poor at the latter, whilst adults are good at both.
>>
>> The challenge is to turn these high-level ideas into concrete experiments
>> with running code. A related challenge is to obtain machine interpretable
>> natural language dictionaries.
>>
>> Updated call details are given at:
>>
>> https://lists.w3.org/Archives/Member/internal-cogai/2021Jun/0000.html
>>
>> Looking forward to talking with you!
>>
>> Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
>> W3C Data Activity Lead & W3C champion for the Web of things
>>
> Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
> W3C Data Activity Lead & W3C champion for the Web of things
>

Received on Wednesday, 14 July 2021 16:17:29 UTC