Workshop: Workshop on Machine Learning Safety

c-MBA: Adversarial Attack for Cooperative MARL Using Learned Dynamics Model

Nhan H Pham · Lam Nguyen · Jie Chen · Thanh Lam Hoang · Subhro Das · Lily Weng


In recent years, a proliferation of methods has been developed for cooperative multi-agent reinforcement learning (c-MARL). However, the robustness of c-MARL agents against adversarial attacks has rarely been explored. In this paper, we propose to evaluate the robustness of c-MARL agents via a model-based approach, named c-MBA. Our proposed attack crafts much stronger adversarial state perturbations against c-MARL agents, lowering total team rewards more than existing model-free approaches. Numerical experiments on two representative MARL benchmarks show that our model-based attack consistently outperforms the other baselines in all tested environments.

