- From: Adam Sobieski <adamsobieski@hotmail.com>
- Date: Wed, 12 Jul 2023 05:26:45 +0000
- To: "public-webagents@w3.org" <public-webagents@w3.org>
- Message-ID: <PH8P223MB067505CA60BF2F4DE3E4CD04C536A@PH8P223MB0675.NAMP223.PROD.OUTLOOK.COM>
Autonomous Agents on the Web Community Group,

Hello. A recent news article [1] discusses a scientific publication which broaches some interesting "multiagent" systems topics. The publication discusses modular agents: agents composed of multiple subagents, each a reinforcement learning agent dedicated to one need of its overarching system.

Having Multiple Selves Helps Learning Agents Explore and Adapt in Complex Changing Worlds (HTML <https://www.pnas.org/doi/abs/10.1073/pnas.2221180120>, closed access)
Zack Dulberg, Rachit Dubey, Isabel M. Berwian, Jonathan D. Cohen

Significance

Adaptive agents must continually satisfy a range of distinct and possibly conflicting needs. In most models of learning, a monolithic agent tries to maximize one value that measures how well it balances its needs. However, this task is difficult when the world is changing and needs are many. Here, we considered an agent as a collection of modules, each dedicated to a particular need and competing for control of action. Compared to the standard monolithic approach, modular agents were much better at maintaining homeostasis of a set of internal variables in simulated environments, both static and changing. These results suggest that having "multiple selves" may represent an evolved solution to the universal problem of balancing multiple needs in changing environments.

Abstract

Satisfying a variety of conflicting needs in a changing environment is a fundamental challenge for any adaptive agent. Here, we show that designing an agent in a modular fashion as a collection of subagents, each dedicated to a separate need, powerfully enhanced the agent’s capacity to satisfy its overall needs. We used the formalism of deep reinforcement learning to investigate a biologically relevant multiobjective task: continually maintaining homeostasis of a set of physiologic variables. We then conducted simulations in a variety of environments and compared how modular agents performed relative to standard monolithic agents (i.e., agents that aimed to satisfy all needs in an integrated manner using a single aggregate measure of success). Simulations revealed that modular agents a) exhibited a form of exploration that was intrinsic and emergent rather than extrinsically imposed; b) were robust to changes in nonstationary environments; and c) scaled gracefully in their ability to maintain homeostasis as the number of conflicting objectives increased. Supporting analysis suggested that the robustness to changing environments and increasing numbers of needs were due to intrinsic exploration and efficiency of representation afforded by the modular architecture. These results suggest that the normative principles by which agents have adapted to complex changing environments may also explain why humans have long been described as consisting of "multiple selves."

Also, here is a previous publication by the same team:

Modularity Benefits Reinforcement Learning Agents with Competing Homeostatic Drives (PDF <https://arxiv.org/pdf/2204.06608.pdf>, open access, arXiv)
Zack Dulberg, Rachit Dubey, Isabel M. Berwian, Jonathan D. Cohen

Abstract

The problem of balancing conflicting needs is fundamental to intelligence. Standard reinforcement learning algorithms maximize a scalar reward, which requires combining different objective-specific rewards into a single number. Alternatively, different objectives could also be combined at the level of action value, such that specialist modules responsible for different objectives submit different action suggestions to a decision process, each based on rewards that are independent of one another. In this work, we explore the potential benefits of this alternative strategy. We investigate a biologically relevant multi-objective problem, the continual homeostasis of a set of variables, and compare a monolithic deep Q-network to a modular network with a dedicated Q-learner for each variable. We find that the modular agent: a) requires minimal exogenously determined exploration; b) has improved sample efficiency; and c) is more robust to out-of-domain perturbation.

Best regards,

Adam Sobieski

[1] https://singularityhub.com/2023/07/11/ai-agents-with-multiple-selves-can-rapidly-adapt-to-a-changing-world/
[2] https://en.wikipedia.org/wiki/Multi-agent_reinforcement_learning
[3] https://en.wikipedia.org/wiki/Multi-objective_reinforcement_learning
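P.S. For readers curious how the second paper's "dedicated Q-learner for each variable" idea might look in code, here is a minimal, self-contained sketch in Python. It is my own illustration, not the authors' implementation: tabular Q-learners stand in for the deep Q-networks used in the papers, the toy two-need homeostasis environment and all names (QModule, ModularAgent, and so on) are invented for this example, and the decision process simply sums the modules' action values, which is only one way the submitted action suggestions could be combined.

# Illustrative sketch only, not the authors' code: tabular Q-learning stands in
# for the deep Q-networks used in the papers, and the environment is a toy
# two-need homeostasis task invented for this example.
import random
from collections import defaultdict

class QModule:
    """One subagent: a Q-learner dedicated to a single homeostatic variable."""
    def __init__(self, n_actions, alpha=0.1, gamma=0.9):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.alpha, self.gamma = alpha, gamma

    def values(self, state):
        return self.q[state]

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])

class ModularAgent:
    """Each module scores every action using only its own reward signal; the
    decision process here sums the modules' action values and acts greedily
    (with epsilon-exploration). A monolithic baseline would instead train a
    single Q-learner on the summed reward."""
    def __init__(self, n_needs, n_actions, epsilon=0.1):
        self.modules = [QModule(n_actions) for _ in range(n_needs)]
        self.n_actions = n_actions
        self.epsilon = epsilon

    def act(self, state):
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        totals = [sum(m.values(state)[a] for m in self.modules)
                  for a in range(self.n_actions)]
        return max(range(self.n_actions), key=totals.__getitem__)

    def learn(self, state, action, rewards, next_state):
        # 'rewards' holds one independent, need-specific reward per module.
        for module, r in zip(self.modules, rewards):
            module.update(state, action, r, next_state)

# Toy environment: two internal variables decay each step; each action
# replenishes one of them; each module's reward is the negative deviation
# of its own variable from a setpoint of 5.
levels = [5.0, 5.0]
agent = ModularAgent(n_needs=2, n_actions=2)
state = tuple(round(l) for l in levels)
for step in range(5000):
    action = agent.act(state)
    levels = [max(0.0, l - 0.5) for l in levels]        # all needs decay
    levels[action] = min(10.0, levels[action] + 1.0)    # chosen need replenished
    next_state = tuple(round(l) for l in levels)
    rewards = [-abs(l - 5.0) for l in levels]           # one reward per need
    agent.learn(state, action, rewards, next_state)
    state = next_state
print("final internal levels:", [round(l, 2) for l in levels])

The point of the sketch is the structure rather than the numbers: each module learns from a reward that is independent of the others, and the modules interact only at decision time, where their action values are combined. The papers describe this combination more generally as modules submitting action suggestions to a decision process, and compare it against a monolithic deep Q-network trained on an aggregate reward.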
Received on Wednesday, 12 July 2023 05:26:53 UTC