Re: Beyond Transformers ...

I never said that there is a need to invent more biological intelligence than is already available in nature. Not sure where you got that from. Continual learning would be a major advance on today’s Generative AI. A better understanding of the brain can help us achieve that. Take a look at the papers I cited for more details.

> On 16 Sep 2024, at 10:29, Paola Di Maio <paoladimaio10@gmail.com> wrote:
> 
> Thank you Dave. Okay -
> I suppose I can start by challenging the assumptions for conversation's sake.
> 
> 
> 1. We already have, on this planet, plenty of biological intelligence, everywhere. Why do you think there is a need to invent
> more biological intelligence than is already available in nature?
> 2. In the bullet points, which summarise machine learning techniques, it is assumed that these techniques can help to achieve Sentient AI, but perhaps you could point to the evidence that this is so, to start with?
> 3. For those unfamiliar with such concepts, it could help if you could introduce everyone to these terms
> (people may not have time to study the subject in depth before attempting to answer your questions :-)
> Maybe a mini lecture?
> 
> Uh?
> Retrieval with degraded or noisy cues
> Stochastic selection: if there are multiple memories with very similar cues, the probability of retrieving any one of them depends on its level of activation
> Single-shot storage rather than requiring repeated presentations of each cue/value pair
> Short and long term memory with a model for boosting and decay
> Minimisation of interference to ensure effective use of memory capacity, e.g. using sparse coding
> 
> On Mon, Sep 16, 2024 at 11:27 AM Dave Raggett <dsr@w3.org> wrote:
>> Hi Paola,
>> 
>> The email was pretty clear: what scientific papers can help shed light on biologically plausible computational models of associative memory? The motivation is to replace Transformers with solutions that enable continual learning, something that is key to realising Sentient AI.
>> 
>>> On 16 Sep 2024, at 10:11, Paola Di Maio <paola.dimaio@gmail.com> wrote:
>>> 
>>> Dave, thanks for sharing what seems to be an important topic
>>> 
>>> I try to read posts to see how they relate to other things I may be working on
>>> (betweenness).
>>> I am afraid I cannot offer much by way of commentary without studying the subject in more depth.
>>> I can see that you are asking a question; I suspect it would be easier to help answer it if you could illustrate
>>> each point a bit more. Where are you coming from? Where does the motivation for each point lie? What problems are you trying to solve? What questions are still not answered?
>>> 
>>> Maybe there is something in the papers that we are reading that could be pertinent to answering your questions, but as they stand in isolation I cannot connect them immediately to what I have in hand.
>>> P
>>> 
>>> I am looking for the means to enable associative memory to support:
>>> Retrieval with degraded or noisy cues
>>> Stochastic selection: if there are multiple memories with very similar cues, the probability of retrieving any one of them depends on its level of activation
>>> Single-shot storage rather than requiring repeated presentations of each cue/value pair
>>> Short and long term memory with a model for boosting and decay
>>> Minimisation of interference to ensure effective use of memory capacity, e.g. using sparse coding
>>> 
>>> What other papers should we be looking at?
>>> 
>>> 
>>> On Mon, Sep 16, 2024 at 11:04 AM Dave Raggett <dsr@w3.org> wrote:
>>>> Generative AI assumes that AI models need to be trained upon a representative dataset before being deployed. The AI models are thus based upon a frozen moment in time. By contrast, humans and other animals learn continually, and this is thought to be based upon continual prediction. With respect to language, this amounts to predicting the next word based upon the preceding words.
>>>> 
>>>> Transformer based language models use an explicit context that contains many thousands of preceding words. A promising alternative is to instead hold the context in an associative memory that maps cues to data. My hunch is that each layer in the abstraction stack can use its own associative memory for attention along with local learning rules based upon continual prediction in each layer, avoiding the need for biologically implausible back propagation across the layers.
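>>>> 
>>>> To make the idea of a local prediction rule concrete, here is a toy NumPy sketch, purely illustrative rather than a worked-out architecture: a single layer predicts its next input from its current input and adjusts only its own weights from the local prediction error, with nothing propagated across layers.
>>>> 
>>>> import numpy as np
>>>> 
>>>> rng = np.random.default_rng(1)
>>>> dim = 16
>>>> 
>>>> # A hidden dynamic generates the input stream seen by this layer.
>>>> A = rng.standard_normal((dim, dim)) / np.sqrt(dim)
>>>> x0 = rng.standard_normal(dim)
>>>> xs = [x0 / np.linalg.norm(x0)]
>>>> for _ in range(500):
>>>>     nxt = A @ xs[-1] + 0.01 * rng.standard_normal(dim)
>>>>     xs.append(nxt / np.linalg.norm(nxt))      # keep the stream bounded
>>>> 
>>>> W = np.zeros((dim, dim))                      # this layer's local weights
>>>> errors = []
>>>> for x_t, x_next in zip(xs[:-1], xs[1:]):
>>>>     pred = W @ x_t                            # the layer predicts its next input
>>>>     err = x_next - pred                       # local prediction error
>>>>     W += 0.1 * np.outer(err, x_t)             # delta rule, local to this layer
>>>>     errors.append(np.linalg.norm(err))
>>>> 
>>>> # Mean prediction error early vs. late in the stream (it should fall).
>>>> print(round(float(np.mean(errors[:50])), 3), round(float(np.mean(errors[-50:])), 3))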
>>>> 
>>>> Associative memory is ubiquitous in the brain, yet we still don't have a full understanding of how it is implemented. In principle, this could use one or more layers to map the cues to probability distributions for the associated data vectors, enabling the use of argmax to determine the index into a table of data vectors. That suffers from the reduction to a one-hot encoding, i.e. each data vector is selected by a single neuron, which sounds error-prone and very unlikely from a biological perspective.
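>>>> 
>>>> For concreteness, a minimal NumPy sketch of that in-principle scheme, assuming the probability distribution is a softmax over the similarities between the cue and a table of stored keys (the softmax and every name here are illustrative assumptions):
>>>> 
>>>> import numpy as np
>>>> 
>>>> rng = np.random.default_rng(0)
>>>> 
>>>> # Table of stored key (cue) vectors and their associated data vectors.
>>>> num_items, dim = 8, 16
>>>> keys = rng.standard_normal((num_items, dim))
>>>> data = rng.standard_normal((num_items, dim))
>>>> 
>>>> def softmax(x):
>>>>     e = np.exp(x - x.max())
>>>>     return e / e.sum()
>>>> 
>>>> def retrieve(cue, beta=5.0):
>>>>     # Map the cue to a probability distribution over stored items,
>>>>     # then use argmax to index the table of data vectors.
>>>>     scores = keys @ cue                 # similarity of cue to each stored key
>>>>     probs = softmax(beta * scores)      # distribution over items
>>>>     index = int(np.argmax(probs))       # the one-hot reduction criticised above
>>>>     return data[index], probs
>>>> 
>>>> # Retrieval with a degraded cue: a noisy version of a stored key.
>>>> noisy_cue = keys[3] + 0.3 * rng.standard_normal(dim)
>>>> value, probs = retrieve(noisy_cue)
>>>> print(np.allclose(value, data[3]), probs.round(2))
>>>> 
>>>> A softer alternative would be to return the probability-weighted mixture of the data vectors rather than the argmax winner.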
>>>> 
>>>> Some interesting papers on this are:
>>>> 
>>>> Biological constraints on neural network models of cognitive function (2021), Pulvermüller et al.
>>>>  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7612527/pdf/EMS142176.pdf
>>>> 
>>>> Recurrent predictive coding models for associative memory employing covariance learning (2023), Tang et al. 
>>>>  https://journals.plos.org/ploscompbiol/article?id=10.1371%2Fjournal.pcbi.1010719
>>>> 
>>>> I am looking for the means to enable associative memory to support the following (a toy sketch follows the list below):
>>>> 
>>>> Retrieval with degraded or noisy cues
>>>> Stochastic selection: if there are multiple memories with very similar cues, the probability of retrieving any one of them depends on its level of activation
>>>> Single-shot storage rather than requiring repeated presentations of each cue/value pair
>>>> Short and long term memory with a model for boosting and decay
>>>> Minimisation of interference to ensure effective use of memory capacity, e.g. using sparse coding
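>>>> 
>>>> Here is the toy sketch; every design choice in it (top-k sparse coding, softmax sampling, the boost and decay constants) is an illustrative assumption rather than a proposal:
>>>> 
>>>> import numpy as np
>>>> 
>>>> rng = np.random.default_rng(2)
>>>> 
>>>> class ToyAssociativeMemory:
>>>>     """A toy key/value memory touching on the properties listed above:
>>>>     single-shot storage, retrieval with degraded cues, stochastic selection
>>>>     by activation, boost/decay of item strength, and sparse cue coding."""
>>>> 
>>>>     def __init__(self, dim, capacity=64, k=8):
>>>>         self.keys = np.zeros((capacity, dim))
>>>>         self.values = np.zeros((capacity, dim))
>>>>         self.strength = np.zeros(capacity)      # boosted on use, decays over time
>>>>         self.k = k                              # sparse code: keep top-k components
>>>>         self.n = 0
>>>> 
>>>>     def _sparse(self, x):
>>>>         code = np.zeros_like(x)
>>>>         top = np.argsort(np.abs(x))[-self.k:]   # sparse coding limits interference
>>>>         code[top] = x[top]
>>>>         return code
>>>> 
>>>>     def store(self, cue, value):
>>>>         # Single-shot storage: one presentation writes one slot.
>>>>         i = self.n % len(self.keys)
>>>>         self.keys[i] = self._sparse(cue)
>>>>         self.values[i] = value
>>>>         self.strength[i] = 1.0
>>>>         self.n += 1
>>>> 
>>>>     def retrieve(self, cue, beta=8.0, decay=0.99):
>>>>         # Stochastic selection: sample an item with probability given by its
>>>>         # activation (cue similarity weighted by its current strength).
>>>>         self.strength *= decay                            # gradual forgetting
>>>>         act = (self.keys @ self._sparse(cue)) * self.strength
>>>>         probs = np.exp(beta * (act - act.max()))
>>>>         probs /= probs.sum()
>>>>         i = rng.choice(len(probs), p=probs)
>>>>         self.strength[i] += 0.5                           # boost the retrieved item
>>>>         return self.values[i]
>>>> 
>>>> # Usage: store once, then retrieve with a degraded cue.
>>>> dim = 32
>>>> mem = ToyAssociativeMemory(dim)
>>>> cue, value = rng.standard_normal(dim), rng.standard_normal(dim)
>>>> mem.store(cue, value)
>>>> recalled = mem.retrieve(cue + 0.2 * rng.standard_normal(dim))
>>>> print(np.allclose(recalled, value))
>>>> 
>>>> The strength vector gives a crude short- vs long-term distinction: unretrieved items fade while retrieved items are boosted, and the sparse cue codes keep dissimilar items from colliding in the activations.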
>>>> 
>>>> What other papers should we be looking at?
>>>> 
>>>> Dave Raggett <dsr@w3.org>
>>>> 
>>>> 
>>>> 
>> 
>> Dave Raggett <dsr@w3.org>
>> 
>> 
>> 

Dave Raggett <dsr@w3.org>

Received on Monday, 16 September 2024 09:45:56 UTC