Re: Bootstrapping cognitive NLP

Hi Christian,

Thanks for your feedback. The evidence is that humans don’t reason using formal semantics, but rather with examples, prior experience and analogies; see e.g. the work by Philip Johnson-Laird. Moreover, humans don’t need to read vast numbers of books to master natural language. Symbols are an abstraction above the level of neural networks. See Chris Eliasmith’s work on semantic pointers and spiking neural networks, and the role of circular convolution over high-dimensional spaces. The question is how to combine co-occurrence statistics and symbolic inference, and what is the best way to implement this. That is for experimental work to reveal. Following David Marr’s insights, we can distinguish functional requirements from the underlying implementations.
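
To make the binding operation concrete, here is a minimal numpy sketch of circular convolution binding a role vector to a filler vector in a high-dimensional space, with an approximate inverse recovering the filler. This is my own toy example, not Eliasmith’s implementation, and the names are placeholders:

    import numpy as np

    def bind(a, b):
        # circular convolution via the FFT (convolution theorem)
        return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

    def unbind(c, a):
        # approximate inverse: bind with the involution of a
        a_inv = np.concatenate(([a[0]], a[:0:-1]))
        return bind(c, a_inv)

    d = 1024                                   # dimensionality of the space
    rng = np.random.default_rng(0)
    role = rng.normal(0, 1 / np.sqrt(d), d)    # e.g. a vector for "agent"
    filler = rng.normal(0, 1 / np.sqrt(d), d)  # e.g. a vector for "dog"

    pair = bind(role, filler)                  # one vector encoding the pair
    recovered = unbind(pair, role)             # noisy approximation of the filler

    cosine = recovered @ filler / (np.linalg.norm(recovered) * np.linalg.norm(filler))
    print(cosine)                              # clearly above chance, well below 1.0

The same binding and superposition operations are what allow a single high-dimensional vector to stand in for a structured symbolic expression.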

I completely agree that manual knowledge engineering won’t scale, which is why I want to explore how to mimic human learning. This can be broken down into a sequence of research questions. A good starting point is to explore ways to collect and process statistics for thematic and taxonomic knowledge, guided by the way that humans seek to make sense of every utterance they hear or read. I look forward to reporting my progress, for better or worse.
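
To give a flavour of that starting point, here is a rough Python sketch that accumulates word co-occurrence counts within a sliding window and weights them with positive pointwise mutual information as a crude measure of thematic relatedness. The function names, window size and normalisation are placeholders of my own, not a proposal for the final design:

    from collections import Counter
    import math

    def cooccurrence(sentences, window=4):
        # count how often word pairs occur within a fixed-size window
        word, pair, tokens = Counter(), Counter(), 0
        for sent in sentences:
            for i, w in enumerate(sent):
                word[w] += 1
                tokens += 1
                for v in sent[i + 1:i + 1 + window]:
                    pair[tuple(sorted((w, v)))] += 1
        return word, pair, tokens

    def ppmi(word, pair, tokens):
        # positive PMI as a relatedness score (deliberately rough normalisation)
        scores = {}
        for (w, v), c in pair.items():
            pmi = math.log2((c / tokens) / ((word[w] / tokens) * (word[v] / tokens)))
            scores[(w, v)] = max(0.0, pmi)
        return scores

    sentences = [["the", "dog", "chased", "the", "cat"],
                 ["the", "cat", "sat", "on", "the", "mat"]]
    word, pair, tokens = cooccurrence(sentences)
    for (w, v), s in sorted(ppmi(word, pair, tokens).items(), key=lambda kv: -kv[1])[:5]:
        print(w, v, round(s, 2))

Taxonomic knowledge would presumably need something richer, e.g. statistics over lexico-syntactic patterns rather than bare windows, but the same counting machinery applies.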

P.S. In case I wasn’t clear, the idea is for semantic processing of utterances to produce graphs that reflect the meaning intended by the speaker. These graphs are created using a combination of statistics and inference, with concurrent processing at each stage of the NLU pipeline.
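
For example, an utterance like "the dog chased the cat" might come out as a small set of triples. The identifiers and relation names below are invented purely to illustrate the shape of the output, not a commitment to a particular vocabulary:

    # hand-built illustration of the kind of graph the pipeline might produce
    meaning = {
        ("chase1", "instance-of", "chase"),
        ("chase1", "agent", "dog1"),
        ("chase1", "theme", "cat1"),
        ("chase1", "tense", "past"),
        ("dog1", "instance-of", "dog"),
        ("cat1", "instance-of", "cat"),
    }

    def edges_from(graph, node):
        # all relations leaving a given node
        return {(rel, obj) for subj, rel, obj in graph if subj == node}

    print(edges_from(meaning, "chase1"))

Statistics would be used to pick the most plausible senses and roles, and inference to add what the speaker left implicit.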

Best regards,

Dave

> On 14 Jul 2021, at 17:15, Christian Chiarcos <christian.chiarcos@gmail.com> wrote:
> 
> Hi Dave,
> 
> This points to the importance of phrase structure with respect to co-occurrence statistics, and opportunities for exploring different ways to capture them. I am also intrigued by the possibilities for combining statistical approaches with inference over meanings, with a view to “explaining” the intent of each word in a given utterance. This assumes eschewing formal semantics in favour of informal approaches to meaning in terms of operations over graphs.
> 
> There is a lot of fascinating work on formal semantics in natural language, or there has been, at least. I think I mentioned SDRT on some occasions, but there are more linguistically oriented approaches such as generative semantics, and of course, the seminal work of Montague. Regardless of the formalism, the problem is to get it to scale, as we lack the necessary background knowledge for the reasoning tasks. So, for the foreseeable future, it will work nicely on small-scale, closed domains for which the necessary resources can be constructed in realistic time. For everything else we need something more robust. Hence the preference for graphs over formulas (and later, the preference for neural networks over *any* kind of symbolic representation). The alternative is to take the shallow analyses that SOTA tools give us and transform them into something that looks like a formal representation. This has been done with Boxer (https://aclanthology.org/W15-1841.pdf). But note that despite looking like a logical formula, the resulting analyses are not disambiguated against the text context but only within the sentence. So, they represent a possible logical interpretation but not necessarily the correct one.
>  
> Do you know how WordNet was created?  Was it a mainly manual effort by skilled lexicographers or did they make use of machine learning?
> 
> Princeton WN: Manually by skilled lexicographers (and psychologists, because it was meant to be used for studying priming).
> Other languages: Varying, but many starting with a copy of Princeton design and sense inventories.
> 
> Best,
> Christian

Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
W3C Data Activity Lead & W3C champion for the Web of Things

Received on Saturday, 17 July 2021 21:06:36 UTC