
Workshop: Deep Reinforcement Learning Workshop

Model and Method: Training-Time Attack for Cooperative Multi-Agent Reinforcement Learning

Siyang Wu · Tonghan Wang · Xiaoran Wu · Jingfeng ZHANG · Yujing Hu · Changjie Fan · Chongjie Zhang


The limited robustness of deep cooperative multi-agent reinforcement learning (MARL) is a major concern and restricts its application to real-world, risk-sensitive tasks. Adversarial attacks are a promising way to study and improve the robustness of MARL, but they remain largely under-studied. Previous work focuses on deploy-time attacks, which may exaggerate attack performance because the MARL learner does not even anticipate the attacker. In this paper, we propose training-time attacks in which the learner is allowed to observe and adapt to poisoned experience. To keep the attacks stealthy, we contaminate action sampling and restrict the attack budget so that non-adversarial agents cannot distinguish attacks from exploration noise. We derive two specific attack methods by modeling the influence of action sampling on experience replay and, in turn, on team performance. Experiments show that our methods significantly undermine MARL algorithms by subtly disturbing the exploration-exploitation balance during learning.
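To make the idea of budget-limited contamination of action sampling concrete, here is a minimal sketch. It is not one of the two attack methods derived in the paper, which the abstract does not specify. It assumes an epsilon-greedy learner and a hypothetical attacker that, only on steps where the agent would have explored anyway, swaps the uniform random draw for the lowest-valued action until a fixed budget runs out. The function names, the choice of "worst action" as the attacker's target, and the integer budget are all illustrative assumptions.

```python
import random


def greedy_action(q_values):
    """Index of the highest estimated value (ties go to the lowest index)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def worst_action(q_values):
    """Index of the lowest estimated value (ties go to the lowest index)."""
    return min(range(len(q_values)), key=lambda a: q_values[a])


def poisoned_epsilon_greedy(q_values, epsilon, budget, rng):
    """Sample one action and return (action, remaining_budget).

    Greedy steps are never touched, so the overall exploration rate stays
    at epsilon. Only exploration steps are contaminated, and only while
    the (hypothetical) attack budget is positive.
    """
    if rng.random() >= epsilon:
        # Exploitation step: the attacker does not interfere.
        return greedy_action(q_values), budget
    if budget > 0:
        # Exploration step hijacked by the attacker (assumed target:
        # the action the agent currently values least).
        return worst_action(q_values), budget - 1
    # Budget exhausted: ordinary uniform exploration.
    return rng.randrange(len(q_values)), budget
```

Because the attacker acts only on steps that would have been exploratory, the frequency of non-greedy actions is unchanged, which is one simple way to read the stealthiness constraint described above. What changes is which non-greedy actions end up in the replay buffer.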
