LLMs are evolving to mimic human cognitive science ...

Until recently, the main way to improve LLMs was to increase their training data and expand their context window (the number of tokens permitted in the prompt).

That is now changing with a transition to hierarchical architectures that separate thinking from knowing and take inspiration from the cognitive sciences. Some key recent advances include DeepSeek’s Engrams [1], Google Research’s Titans + MIRAS [2], Mosaic Research’s MemAlign [3], hierarchical memory such as CAMELoT [4], and Larimar, which mimics the hippocampus for single-shot learning [5].

RAG with vector indexes allows search by semantic similarity, enabling LLMs to draw on resources that weren’t in their training materials. We can go further by mimicking how humans use written records and catalogs to supplement fallible memory, enabling robust counting and aggregation, something that is tough for native LLMs. This calls for neurosymbolic systems, bridging the worlds of neural AI and the Semantic Web.
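
As a rough illustration of the retrieval step (not tied to any of the systems cited above), a vector index can be as simple as embedding passages and ranking them by cosine similarity against the query. The model name and the example passages below are placeholders:

    # Minimal RAG retrieval sketch: embed passages, rank by cosine similarity.
    # Assumes the sentence-transformers package; the model name is illustrative.
    from sentence_transformers import SentenceTransformer
    import numpy as np

    model = SentenceTransformer("all-MiniLM-L6-v2")

    passages = [
        "The W3C was founded in 1994 by Tim Berners-Lee.",
        "Vector indexes store embeddings for similarity search.",
        "LLMs struggle with exact counting over long documents.",
    ]
    # normalize_embeddings=True lets a dot product act as cosine similarity
    passage_vecs = model.encode(passages, normalize_embeddings=True)

    query = "Who founded the W3C and when?"
    query_vec = model.encode([query], normalize_embeddings=True)[0]

    scores = passage_vecs @ query_vec      # cosine similarities
    top_k = np.argsort(-scores)[:2]        # indices of the best matches
    context = "\n".join(passages[i] for i in top_k)

    # The retrieved context is then prepended to the prompt sent to the LLM.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
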

If we want personal agents that get to know us over many interactions, one approach is for the agent to maintain summary notes that describe you as an individual. When you interact with the agent, this information is injected into the prompt so that the agent appears to remember you. Personal agents can also be given privileges to access your email, social media and resources on your personal devices, and to perform certain operations on your behalf.
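
Here is a sketch of that injection step. The profile fields and prompt template are invented for the example, and the resulting string would be passed to whatever LLM API the agent actually uses:

    # Sketch of injecting personal summary notes into the prompt.
    # Profile fields and wording are illustrative, not a real agent's schema.
    user_profile = {
        "name": "Alice",
        "preferences": "prefers concise answers; interested in web standards",
        "recent_topics": "planning a trip to Lisbon; learning Portuguese",
    }

    def build_prompt(profile: dict, user_message: str) -> str:
        notes = "\n".join(f"- {key}: {value}" for key, value in profile.items())
        return (
            "You are a personal assistant. What you know about this user:\n"
            f"{notes}\n\n"
            f"User: {user_message}\nAssistant:"
        )

    prompt = build_prompt(user_profile, "Any podcast suggestions for my commute?")
    # 'prompt' is then sent to the model, so the agent appears to remember Alice.
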

Injecting information into the prompt is constrained by the size of the context window. This is where newer approaches to memory can make a big difference. One challenge is how to manage long-term personalised semantic and episodic memories, with plenty of implications for privacy, security and trust. The LLM run-time combines your personalised memories with shared knowledge common to all users.
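
One way to picture the split between episodic and semantic memories, and how they feed the run-time context, is the purely illustrative data structure below; a real system would rank by relevance as well as recency, and the shared knowledge stays in the model weights:

    # Illustrative split between episodic and semantic long-term memories.
    from dataclasses import dataclass, field
    from datetime import datetime

    @dataclass
    class Memory:
        text: str
        kind: str                     # "episodic" or "semantic"
        timestamp: datetime = field(default_factory=datetime.now)

    class PersonalMemoryStore:
        def __init__(self):
            self.items: list[Memory] = []

        def remember(self, text: str, kind: str):
            self.items.append(Memory(text, kind))

        def recall(self, kind: str, limit: int = 3) -> list[str]:
            # Most recent first; relevance ranking is omitted for brevity.
            matching = [m for m in self.items if m.kind == kind]
            matching.sort(key=lambda m: m.timestamp, reverse=True)
            return [m.text for m in matching[:limit]]

    store = PersonalMemoryStore()
    store.remember("User asked about Lisbon flights on Monday", "episodic")
    store.remember("User is vegetarian", "semantic")

    # At run-time the prompt combines these private memories with the shared
    # knowledge already baked into the model.
    context = "\n".join(store.recall("semantic") + store.recall("episodic"))
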

My hunch is that much smaller models will be sufficient for many purposes, and have the advantage of running locally on your personal devices, thereby avoiding the need to transfer personal information to the cloud. Local agents could chat with more powerful cloud-based agents when appropriate, e.g. to access ecosystems of services, and to tap knowledge beyond the local agent’s capabilities.
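
A toy escalation policy along those lines might look as follows; the two agent stubs, the confidence score and the threshold are all invented for the sake of the example:

    # Toy routing: answer locally when confident, otherwise defer to the cloud.
    # Both agents are stand-ins; only the question is sent off-device,
    # not the user's private memory store.
    def local_agent(question: str) -> tuple[str, float]:
        # A small on-device model returning an answer plus a confidence estimate.
        return "Locally generated answer", 0.4

    def cloud_agent(question: str) -> str:
        # A larger remote model, invoked only when needed.
        return "Cloud generated answer"

    def answer(question: str, threshold: float = 0.7) -> str:
        reply, confidence = local_agent(question)
        if confidence >= threshold:
            return reply
        return cloud_agent(question)

    print(answer("Summarise today's standards mailing list traffic"))
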

The challenge is to ensure that such local agents are based upon open standards and models, rather than being highly proprietary, locking each of us in a particular company's embrace. That sounds like a laudable goal for the Cognitive AI Community Group to work on!

[1] https://deepseek.ai/blog/deepseek-engram-v4-architecture
[2] https://research.google/blog/titans-miras-helping-ai-have-long-term-memory/ 
[3] https://www.databricks.com/blog/memalign-building-better-llm-judges-human-feedback-scalable-memory
[4] https://arxiv.org/abs/2402.13449
[5] https://arxiv.org/html/2403.11901v1


Dave Raggett <dsr@w3.org>

Received on Thursday, 5 February 2026 11:32:39 UTC