Concerted effort to more closely emulate the brain ...

It is encouraging to see greater focus by AI researchers on emulating the functioning of the brain, and in particular, efforts to look beyond Transformers and explore memory as a means of overcoming the limitations of context windows.  Here is a link to a recent paper on this from Google Research:

   Titans: Learning to Memorize at Test Time, December 2024, https://arxiv.org/abs/2501.00663

The authors discuss their approach to combining attention with short- and long-term memory.  They use what they call a deep memory module, as they consider a single vector or matrix to be inadequate.  Their long-term memory module incorporates the notions of surprise and decay. Surprise measures the extent to which an input violates expectations, making it more memorable, whilst decay gradually forgets memories that prove less useful, so that recall focuses on those that remain relevant.
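As a rough illustration of the surprise-and-decay idea — a toy sketch, not the authors' actual formulation — one can treat "surprise" as the gradient of a recall error with respect to the memory parameters, so that inputs which violate the memory's expectations produce larger updates, while a decay factor gradually forgets old content. The single linear map, parameter names, and constants below are all illustrative assumptions; the paper uses a deep (multi-layer) memory trained at test time.

```python
import numpy as np

# Toy sketch of a surprise-and-decay memory update (illustrative only).
# A single linear associative map stands in for the paper's deep memory
# module, purely to show the mechanics.

rng = np.random.default_rng(0)

d = 8                       # embedding dimension (assumption)
M = np.zeros((d, d))        # memory: a linear associative map
momentum = np.zeros_like(M) # accumulated past surprise

eta = 0.9      # momentum on past surprise (assumption)
theta = 0.1    # weight on momentary surprise (assumption)
alpha = 0.05   # decay (forgetting) rate (assumption)

def update_memory(M, momentum, k, v):
    """One test-time update storing the association k -> v.

    'Surprise' is the gradient of the recall error 0.5*||M k - v||^2
    with respect to M: a large error (violated expectation) yields a
    large update, i.e. the input is more memorable.
    """
    err = M @ k - v                            # prediction error
    grad = np.outer(err, k)                    # gradient of the recall loss
    momentum = eta * momentum - theta * grad   # past + momentary surprise
    M = (1.0 - alpha) * M + momentum           # decay old content, add new
    return M, momentum

k = rng.normal(size=d)
v = rng.normal(size=d)
for _ in range(200):
    M, momentum = update_memory(M, momentum, k, v)

recall_error = np.linalg.norm(M @ k - v)
```

After repeated exposure the recall error shrinks well below the initial error, while the decay term keeps the stored association from being remembered perfectly forever.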

A distinction is made between long-term and persistent memory, where the latter holds task knowledge that is essentially input-independent.  These two kinds of memory are implemented as different forms of feedforward networks.

The authors evaluate three different ways to integrate memory into the language model. The first breaks the input sequence into fixed-size chunks and treats retrieved memory as additional context for the current chunk. The second replaces chunking with a sliding window. The third uses memory to compress the past and current context before applying attention.
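The first of these variants can be pictured with a small sketch — again purely illustrative, using a toy key-value store where the paper uses a learned neural memory, and with all function names being my own assumptions: the sequence is split into fixed-size chunks, and information recalled from memory is prepended as extra context before attending over the current chunk.

```python
# Toy sketch of "memory as context": chunk the input and let attention
# over each chunk also see summaries recalled from memory. The memory
# here is simply a list of past chunks standing in for learned summaries.

def chunks(tokens, size):
    """Break a token sequence into fixed-size chunks."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]

def process_sequence(tokens, chunk_size=4, memory_slots=2):
    memory = []   # summaries of past chunks (toy stand-in)
    outputs = []
    for chunk in chunks(tokens, chunk_size):
        # Memory acts as context: attention would see [memory ; chunk].
        context = memory[-memory_slots:] + [chunk]
        outputs.append(context)
        # After processing, summarize the chunk into memory
        # (here the chunk itself serves as its own summary).
        memory.append(chunk)
    return outputs

out = process_sequence(list(range(10)), chunk_size=4, memory_slots=2)
```

Each step thus sees a bounded window — the most recent memory slots plus the current chunk — rather than the full history, which is how memory substitutes for an ever-growing context window.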

Whilst the authors have done a good job of comparing the performance of their approach with other work on long context windows, there is little attempt to relate it to studies of human memory and task performance. My hunch is that there is plenty to be gained from interdisciplinary perspectives spanning AI and the Cognitive Sciences.

One challenge is how to support privacy and confidentiality for memory-enhanced language models. In essence, there needs to be a distinction between private memories held on behalf of a given user and public memories, e.g. based on today's news and other public sources.  This would enable personal agents that act on behalf of their users and embody their users’ preferences.  Your personal agent wouldn’t need to ask you for information it already knows about you, resulting in a streamlined user experience.  An obvious refinement would add a middle ground for knowledge that you pay to subscribe to. We thus need AI architectures that can combine these three types of knowledge effectively and securely.  A further consideration is how to enable transparent open markets of services that support such personal agents.
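One way to picture such a separation — a hypothetical design of my own, not anything from the paper — is a layered memory store where each tier carries an access policy: private memories are readable only by the owning user's agent, subscription memories require an entitlement, and public memories are open to all. Every class and field name below is an illustrative assumption.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a three-tier memory separation:
# private (per-user), subscription (paid sources), and public.
# A recall request is answered only from tiers the requester
# is entitled to read.

@dataclass
class TieredMemory:
    private: dict = field(default_factory=dict)       # per-user memories
    subscription: dict = field(default_factory=dict)  # paid sources
    public: dict = field(default_factory=dict)        # public knowledge

    def recall(self, key, user_id=None, subscriptions=()):
        # Private memories: only the owning user's agent may read them.
        if user_id is not None and key in self.private.get(user_id, {}):
            return self.private[user_id][key]
        # Subscription memories: require an entitlement to the source.
        for source in subscriptions:
            if key in self.subscription.get(source, {}):
                return self.subscription[source][key]
        # Public memories: open to everyone.
        return self.public.get(key)

mem = TieredMemory()
mem.private["alice"] = {"diet": "vegetarian"}
mem.subscription["finance-feed"] = {"ftse": "8200"}
mem.public["capital_of_france"] = "Paris"
```

The point of the sketch is that the access policy sits in the memory architecture itself, rather than being left to the language model to enforce, which is what makes secure combination of the three tiers plausible.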

Dave Raggett <dsr@w3.org>

Received on Thursday, 23 January 2025 11:12:58 UTC