Workshop: Agent Learning in Open-Endedness Workshop

From Centralized to Self-Supervised: Pursuing Realistic Multi-Agent Reinforcement Learning

Violet Xiang · Logan Cross · Jan-Philipp Fraenken · Nick Haber

Keywords: [ multi-agent reinforcement learning ] [ intrinsic motivation ] [ self-supervised learning ]


In real-world environments, autonomous agents rely on their egocentric observations. They must learn adaptive strategies to interact with others who possess mixed motivations, discernible only through visible cues. Several Multi-Agent Reinforcement Learning (MARL) methods adopt centralized approaches that involve either centralized training or reward-sharing, which often violates the way living organisms, such as animals or humans, actually process information and interact. MARL strategies that deploy decentralized training with intrinsic motivation offer a self-supervised approach, enabling agents to develop flexible social strategies through the interaction of autonomous agents. However, by contrasting the self-supervised and centralized methods, we reveal that populations trained with reward-sharing methods surpass those using self-supervised methods in a mixed-motive environment. We link this superiority to specialized role emergence and an agent's expertise in its role. Interestingly, this gap shrinks in pure-motive settings, emphasizing the need for evaluations in more complex, realistic (mixed-motive) environments. Our preliminary results suggest a gap in population performance that can be closed by improving self-supervised methods, thereby pushing MARL closer to real-world readiness.
