Shield Decentralization for Safe Multi-Agent Reinforcement Learning
Daniel Melcer · Christopher Amato · Stavros Tripakis

Wed Nov 30 02:00 PM -- 04:00 PM (PST) @ Hall J #613

Learning safe solutions is an important but challenging problem in multi-agent reinforcement learning (MARL). Shielded reinforcement learning is one approach for preventing agents from choosing unsafe actions. Current shielded reinforcement learning methods for MARL make strong assumptions about communication and full observability. In this work, we extend the formalization of the shielded reinforcement learning problem to a decentralized multi-agent setting. We then present an algorithm for decomposition of a centralized shield, allowing shields to be used in such decentralized, communication-free environments. Our results show that agents equipped with decentralized shields perform comparably to agents with centralized shields in several tasks, allowing shielding to be used in environments with decentralized training and execution for the first time.
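To illustrate the general idea of shielding described above, here is a minimal sketch in Python. It is not the decomposition algorithm from the paper. It only shows what it means for each agent to carry its own shield that overrides unsafe actions using local information alone, with no communication. The corridor environment and the names `make_shield`, `corridor_safe_actions`, and `shielded_joint_action` are hypothetical and introduced purely for illustration.

```python
from typing import Callable, Hashable, List, Sequence

Action = Hashable
Observation = Hashable


def make_shield(safe_actions: Callable[[Observation], Sequence[Action]]):
    """Build a shield from a function listing the safe actions for an observation.

    The shield passes a proposed action through when it is safe and
    otherwise substitutes the first safe action as a fallback.
    """

    def shield(obs: Observation, proposed: Action) -> Action:
        allowed = safe_actions(obs)
        if not allowed:
            raise ValueError(f"no safe action available for observation {obs!r}")
        return proposed if proposed in allowed else allowed[0]

    return shield


# Hypothetical toy environment: each agent walks a corridor with cells 0..4,
# and stepping off either end is considered unsafe.
def corridor_safe_actions(pos: int) -> List[str]:
    actions = ["stay"]
    if pos > 0:
        actions.append("left")
    if pos < 4:
        actions.append("right")
    return actions


# One shield per agent. Each shield sees only its own agent's observation.
shields = [make_shield(corridor_safe_actions) for _ in range(2)]


def shielded_joint_action(observations, proposals):
    """Apply every agent's shield independently, with no communication."""
    return [s(o, a) for s, o, a in zip(shields, observations, proposals)]


if __name__ == "__main__":
    # Agent 0 is at the left edge and proposes "left" (unsafe, so overridden).
    # Agent 1 is at the right edge and proposes "left" (safe, so kept).
    print(shielded_joint_action([0, 4], ["left", "left"]))
```

In this toy case the safety property happens to depend only on each agent's own position, so per-agent shields are trivially sufficient. The harder setting the abstract refers to is one where safety depends on the joint behavior of several agents, which is where decomposing a centralized shield becomes necessary.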

Author Information

Daniel Melcer (Northeastern University)
Christopher Amato (Northeastern University)
Stavros Tripakis (Northeastern University)
