This work studies concept-based interpretability in the context of multi-agent learning. Unlike supervised learning, where there have been sustained efforts to understand a model's decisions, multi-agent interpretability remains under-investigated. This is in part due to the increased complexity of the multi-agent setting---interpreting the decisions of multiple agents over time is combinatorially more complex than understanding individual, static decisions---but it also reflects the limited availability of tools for understanding multi-agent behavior. Interactions between agents, and coordination generally, remain difficult to gauge in MARL. In this work, we propose Concept Bottleneck Policies (CBPs) as a method for learning intrinsically interpretable, concept-based policies with MARL. We demonstrate that, by conditioning each agent's action on a set of human-understandable concepts, our method enables post-hoc behavioral analysis via concept intervention that is infeasible with standard policy architectures. Experiments show that concept interventions over CBPs reliably detect when agents have learned to coordinate in environments that do not demand coordination, and identify the environments in which coordination is required. Moreover, we find evidence that CBPs can detect coordination failures (such as lazy agents) and expose the low-level inter-agent information that underpins emergent coordination. Finally, we demonstrate that our approach matches the performance of standard, non-concept-based policies, thereby achieving interpretability without sacrificing performance.
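At a high level, a concept bottleneck policy places a concept layer between each agent's observation encoder and its action head, so the action depends on the observation only through predicted concept values; a concept intervention then amounts to overwriting those values at inference time and observing how behavior shifts. Below is a minimal sketch of this idea in PyTorch; the class and attribute names (ConceptBottleneckPolicy, concept_net, action_net) and the discrete-action setup are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class ConceptBottleneckPolicy(nn.Module):
    """Sketch of a concept bottleneck policy (names and architecture are assumptions).

    The action head sees only the predicted concepts, so overwriting a concept
    at inference time (an "intervention") directly changes the action distribution.
    """

    def __init__(self, obs_dim: int, n_concepts: int, n_actions: int, hidden: int = 64):
        super().__init__()
        # Predicts human-understandable concept values from the raw observation.
        self.concept_net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_concepts)
        )
        # Maps concepts (and nothing else) to action logits: the bottleneck.
        self.action_net = nn.Sequential(
            nn.Linear(n_concepts, hidden), nn.ReLU(), nn.Linear(hidden, n_actions)
        )

    def forward(self, obs: torch.Tensor, intervention: dict | None = None):
        concepts = self.concept_net(obs)
        if intervention is not None:
            # Post-hoc analysis: clamp selected concepts to chosen values and
            # observe how the agent's action distribution shifts.
            concepts = concepts.clone()
            for idx, value in intervention.items():
                concepts[..., idx] = value
        dist = torch.distributions.Categorical(logits=self.action_net(concepts))
        return dist, concepts

# Example: force concept 2 to zero and compare the resulting action distributions.
policy = ConceptBottleneckPolicy(obs_dim=10, n_concepts=4, n_actions=5)
obs = torch.randn(1, 10)
dist_free, _ = policy(obs)
dist_fixed, _ = policy(obs, intervention={2: 0.0})
```

Comparing dist_free and dist_fixed across many observations is one way such interventions could surface whether a given concept actually drives an agent's behavior.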
Author Information
Niko Grupen (Cornell University)
Shayegan Omidshafiei (Google)
Natasha Jaques (Google Brain, UC Berkeley)
Natasha Jaques holds a joint position as a Research Scientist at Google Brain and Postdoctoral Fellow at UC Berkeley. Her research focuses on Social Reinforcement Learning in multi-agent and human-AI interactions. Natasha completed her PhD at MIT, where her thesis received the Outstanding PhD Dissertation Award from the Association for the Advancement of Affective Computing. Her work has also received Best Demo at NeurIPS, an honourable mention for Best Paper at ICML, Best of Collection in the IEEE Transactions on Affective Computing, and Best Paper at the NeurIPS workshops on ML for Healthcare and Cooperative AI. She has interned at DeepMind and Google Brain and was an OpenAI Scholars mentor. Her work has been featured in Science Magazine, Quartz, MIT Technology Review, Boston Magazine, and on CBC radio. Natasha earned her master's degree from the University of British Columbia, and undergraduate degrees in Computer Science and Psychology from the University of Regina.
Been Kim (Google Brain)
More from the Same Authors
- 2021 : Advanced Methods for Connectome-Based Predictive Modeling of Human Intelligence: A Novel Approach Based on Individual Differences in Cortical Topography
  Evan Anderson · Anuj Nayak · Pablo Robles-Granda · Lav Varshney · Been Kim · Aron K Barbey
- 2022 : Multi-Agent Reinforcement Learning for Microprocessor Design Space Exploration
  Srivatsan Krishnan · Natasha Jaques · Shayegan Omidshafiei · Dan Zhang · Izzeddin Gur · Vijay Janapa Reddi · Aleksandra Faust
- 2022 : In the ZONE: Measuring difficulty and progression in curriculum generation
  Rose Wang · Jesse Mu · Dilip Arumugam · Natasha Jaques · Noah Goodman
- 2022 : Panel: Explainability/Predictability Robotics (Q&A 4)
  Katherine Driggs-Campbell · Been Kim · Leila Takayama
- 2022 : Panel Discussion
  Kamalika Chaudhuri · Been Kim · Dorsa Sadigh · Huan Zhang · Linyi Li
- 2022 : Natasha Jaques
  Natasha Jaques
- 2022 : Invited Talk: Been Kim
  Been Kim
- 2022 Poster: Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis
  Shayegan Omidshafiei · Andrei Kapishnikov · Yannick Assogba · Lucas Dixon · Been Kim
- 2021 : Cooperative Multi-Agent Fairness and Equivariant Policies
  Niko Grupen · Bart Selman · Daniel Lee
- 2020 Poster: Real World Games Look Like Spinning Tops
  Wojciech Czarnecki · Gauthier Gidel · Brendan Tracey · Karl Tuyls · Shayegan Omidshafiei · David Balduzzi · Max Jaderberg
- 2019 : Invited talk #5
  Been Kim
- 2019 : Responsibilities
  Been Kim · Liz O'Sullivan · Friederike Schuur · Andrew Smart · Jacob Metcalf
- 2019 Poster: Multiagent Evaluation under Incomplete Information
  Mark Rowland · Shayegan Omidshafiei · Karl Tuyls · Julien Perolat · Michal Valko · Georgios Piliouras · Remi Munos
- 2019 Spotlight: Multiagent Evaluation under Incomplete Information
  Mark Rowland · Shayegan Omidshafiei · Karl Tuyls · Julien Perolat · Michal Valko · Georgios Piliouras · Remi Munos
- 2017 : Invited Talk 1
  Been Kim
- 2016 Workshop: Interpretable Machine Learning for Complex Systems
  Andrew Wilson · Been Kim · William Herlands
- 2016 Oral: Examples are not enough, learn to criticize! Criticism for Interpretability
  Been Kim · Sanmi Koyejo · Rajiv Khanna
- 2016 Poster: Examples are not enough, learn to criticize! Criticism for Interpretability
  Been Kim · Sanmi Koyejo · Rajiv Khanna
- 2015 Poster: Mind the Gap: A Generative Approach to Interpretable Feature Selection and Extraction
  Been Kim · Julie A Shah · Finale Doshi-Velez
- 2014 Poster: The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification
  Been Kim · Cynthia Rudin · Julie A Shah