Re: Beyond Transformers ...

Thanks, I need to find the time to study these. I will try to bend my
thinking that way.

On Mon, Sep 16, 2024 at 11:45 AM Dave Raggett <dsr@w3.org> wrote:

> I never said that there is a need to invent more biological intelligence
> than is already available in nature. Not sure where you got that from.
> Continual learning would be a major advance on today’s Generative AI. A
> better understanding of the brain can help us achieve that. Take a look at
> the papers I cited for more details.
>
> On 16 Sep 2024, at 10:29, Paola Di Maio <paoladimaio10@gmail.com> wrote:
>
> Thank you Dave. Okay -
> I suppose I can start by challenging the assumptions for conversation's
> sake.
>
>
> 1. We already have, on this planet, plenty of biological intelligence,
> everywhere. Why do you think there is a need to invent
> more biological intelligence than is already available in nature?
> 2. In the bullet points, which summarise machine learning techniques, it
> is assumed that these techniques can help to achieve Sentient AI, but
> perhaps you could point to the evidence that this is so, to start with?
> 3. For those unfamiliar with such concepts, it could help if you could
> introduce everyone to these terms
> (people may not have time to study the subject in depth before attempting
> to answer your questions :-)
> Maybe a mini lecture?
>
> Uh?
>
>    - Retrieval with degraded or noisy cues
>    - Stochastic selection: if there are multiple memories with very
>    similar cues, the probability of retrieving any one of them depends on
>    its level of activation
>    - Single-shot storage rather than requiring repeated presentations of
>    each cue/value pair
>    - Short and long term memory with a model for boosting and decay
>    - Minimisation of interference to ensure effective use of memory
>    capacity, e.g. using sparse coding
>
>
> On Mon, Sep 16, 2024 at 11:27 AM Dave Raggett <dsr@w3.org> wrote:
>
>> Hi Paola,
>>
>> The email was pretty clear: which scientific papers can help shed light
>> on biologically plausible computational models of associative memory? The
>> motivation is to replace Transformers with solutions that enable
>> continual learning, something that is key to realising Sentient AI.
>>
>> On 16 Sep 2024, at 10:11, Paola Di Maio <paola.dimaio@gmail.com> wrote:
>>
>> Dave, thanks for sharing what seems to be an important topic.
>>
>> I try to read posts to see how they relate to other things I may be
>> working on (*betweenness*).
>> I am afraid I cannot offer much by way of commentary without studying
>> the subject in more depth.
>> I can see that you are asking a question; I suspect it would be easier to
>> help answer it if you could illustrate each point a bit more. Where are
>> you coming from? Where does the motivation for each point lie? What
>> problems are you trying to solve? What questions are still not answered?
>>
>> Maybe there is something in the papers that we are reading that could be
>> pertinent to answering your questions, but as they stand, isolated, I
>> cannot connect them immediately to what I have in hand.
>> P
>>
>>
>>> I am looking for the means to enable associative memory to support:
>>
>> What other papers should we be looking at?
>>
>>
>>    - Retrieval with degraded or noisy cues
>>    - Stochastic selection: if there are multiple memories with very
>>    similar cues, the probability of retrieving any one of them depends on
>>    its level of activation
>>    - Single-shot storage rather than requiring repeated presentations of
>>    each cue/value pair
>>    - Short and long term memory with a model for boosting and decay
>>    - Minimisation of interference to ensure effective use of memory
>>    capacity, e.g. using sparse coding
>>
>>
>>
>> On Mon, Sep 16, 2024 at 11:04 AM Dave Raggett <dsr@w3.org> wrote:
>>
>>> Generative AI assumes that AI models need to be trained upon a
>>> representative dataset before being deployed. The AI models are based upon
>>> a frozen moment in time. By contrast, humans and other animals learn
>>> continually, and this is thought to be based upon continual prediction. With
>>> respect to language, this amounts to predicting the next word based upon
>>> the preceding words.
>>>
>>> Transformer based language models use an explicit context that contains
>>> many thousands of preceding words. A promising alternative is to instead
>>> hold the context in an associative memory that maps cues to data. My hunch
>>> is that each layer in the abstraction stack can use its own associative
>>> memory for attention along with local learning rules based upon continual
>>> prediction in each layer, avoiding the need for biologically implausible
>>> back propagation across the layers.
>>>
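>>> To make this concrete, here is a minimal sketch in Python of what I have
>>> in mind: a per-layer key-value associative memory with an outer-product
>>> storage rule, driven by a local prediction error rather than back
>>> propagation across the layers. It is only an illustration of the idea,
>>> not a worked proposal, and the names and dimensions are made up.
>>>
>>> import numpy as np
>>>
>>> class AssociativeMemory:
>>>     def __init__(self, dim):
>>>         # Cue -> value mapping held in a single weight matrix.
>>>         self.W = np.zeros((dim, dim))
>>>
>>>     def store(self, cue, value):
>>>         # Single-shot, outer-product update using only locally
>>>         # available quantities (no cross-layer error signal).
>>>         cue = cue / (np.linalg.norm(cue) + 1e-9)
>>>         self.W += np.outer(value, cue)
>>>
>>>     def retrieve(self, cue):
>>>         # Linear read-out; degrades gracefully with noisy cues.
>>>         cue = cue / (np.linalg.norm(cue) + 1e-9)
>>>         return self.W @ cue
>>>
>>> # One layer learns by continual prediction: it predicts its next input
>>> # from the current one and stores the prediction error - a local rule.
>>> dim = 64
>>> rng = np.random.default_rng(1)
>>> layer = AssociativeMemory(dim)
>>> prev = rng.standard_normal(dim)
>>> for _ in range(100):
>>>     current = rng.standard_normal(dim)  # stand-in for the real input
>>>     prediction = layer.retrieve(prev)   # continual prediction
>>>     error = current - prediction
>>>     layer.store(prev, error)            # learn from the error, locally
>>>     prev = current
>>>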
>>> Associative memory is ubiquitous in the brain, yet we still don't have a
>>> full understanding of how it is implemented. In principle, this could use
>>> one or more layers to map the cues to probability distributions for the
>>> associated data vectors, enabling the use of *argmax* to determine the
>>> index into a table of data vectors. That suffers from the reduction to a
>>> one-hot encoding, i.e. each data vector is selected by a single neuron,
>>> which sounds error prone and very unlikely from a biological perspective.
>>>
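>>> As a toy illustration of that scheme, and of why it collapses to a
>>> one-hot selection, consider something like the following (the shapes and
>>> data are arbitrary):
>>>
>>> import numpy as np
>>>
>>> rng = np.random.default_rng(0)
>>> num_slots, cue_dim, value_dim = 8, 16, 32
>>>
>>> keys = rng.standard_normal((num_slots, cue_dim))     # cue prototypes
>>> table = rng.standard_normal((num_slots, value_dim))  # stored data vectors
>>>
>>> def retrieve(cue):
>>>     scores = keys @ cue                 # similarity of the cue to each slot
>>>     probs = np.exp(scores - scores.max())
>>>     probs /= probs.sum()                # softmax over the slots
>>>     winner = int(np.argmax(probs))      # one-hot: a single "neuron" wins
>>>     return table[winner], probs
>>>
>>> noisy_cue = keys[3] + 0.1 * rng.standard_normal(cue_dim)
>>> value, probs = retrieve(noisy_cue)
>>> print(probs.round(2))  # a full distribution, yet only one slot is read out
>>>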
>>> Some interesting papers on this are:
>>>
>>> *Biological constraints on neural network models of cognitive function*
>>> (2021), Pulvermüller et al.
>>> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7612527/pdf/EMS142176.pdf
>>>
>>> *Recurrent predictive coding models for associative memory employing
>>> covariance learning* (2023), Tang et al.
>>>
>>> https://journals.plos.org/ploscompbiol/article?id=10.1371%2Fjournal.pcbi.1010719
>>>
>>> I am looking for the means to enable associative memory to support:
>>>
>>>
>>>    - Retrieval with degraded or noisy cues
>>>    - Stochastic selection: if there are multiple memories with very
>>>    similar cues, the probability of retrieving any one of them depends on its
>>>    level of activation
>>>    - Single-shot storage rather than requiring repeated presentations
>>>    of each cue/value pair
>>>    - Short and long term memory with a model for boosting and decay
>>>    - Minimisation of interference to ensure effective use of memory
>>>    capacity, e.g. using sparse coding
>>>
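>>> As a rough sketch of how these requirements might fit together (again
>>> just an illustration with arbitrary parameters, not a proposal):
>>>
>>> import numpy as np
>>>
>>> rng = np.random.default_rng(0)
>>>
>>> class EpisodicStore:
>>>     def __init__(self, sparsity=0.1):
>>>         self.cues, self.values, self.activation = [], [], []
>>>         self.sparsity = sparsity
>>>
>>>     def _sparsify(self, cue):
>>>         # Keep only the strongest components (crude sparse coding to
>>>         # limit interference between memories).
>>>         k = max(1, int(self.sparsity * cue.size))
>>>         out = np.zeros_like(cue)
>>>         idx = np.argsort(np.abs(cue))[-k:]
>>>         out[idx] = cue[idx]
>>>         return out
>>>
>>>     def store(self, cue, value):
>>>         # Single-shot storage: one presentation is enough.
>>>         self.cues.append(self._sparsify(cue))
>>>         self.values.append(value)
>>>         self.activation.append(1.0)
>>>
>>>     def retrieve(self, cue):
>>>         # Similarity over all stored cues tolerates degraded/noisy cues.
>>>         cue = self._sparsify(cue)
>>>         sims = np.array([c @ cue for c in self.cues])
>>>         # Stochastic selection: probability proportional to
>>>         # similarity times current activation.
>>>         weights = np.maximum(sims, 0) * np.array(self.activation)
>>>         if weights.sum() == 0:
>>>             return None
>>>         i = rng.choice(len(self.cues), p=weights / weights.sum())
>>>         # Boosting and decay: the retrieved memory is boosted,
>>>         # everything else slowly decays.
>>>         self.activation = [a * 0.99 for a in self.activation]
>>>         self.activation[i] += 0.5
>>>         return self.values[i]
>>>
>>> mem = EpisodicStore()
>>> cue = rng.standard_normal(32)
>>> mem.store(cue, "episode-1")
>>> print(mem.retrieve(cue + 0.1 * rng.standard_normal(32)))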
>>>
>>> What other papers should we be looking at?
>>>
>>> Dave Raggett <dsr@w3.org>
>>>
>>>
>>>
>>>
>> Dave Raggett <dsr@w3.org>
>>
>>
>>
>>
> Dave Raggett <dsr@w3.org>
>
>
>
>

Received on Monday, 16 September 2024 10:04:30 UTC