Individual rationality, which involves maximizing expected individual return, does not always lead to optimal individual or group outcomes in multi-agent problems. For instance, in social dilemma situations, Reinforcement Learning (RL) agents trained to maximize individual rewards converge to mutual defection, which is both individually and socially sub-optimal. In contrast, humans evolve individually and socially optimal strategies in such social dilemmas. Inspired by ideas from human psychology that attribute this behavior to the status-quo bias, we present a status-quo loss (SQLoss) and a corresponding policy gradient algorithm that incorporates this bias into an RL agent. We demonstrate that agents trained with SQLoss learn individually as well as socially optimal behavior in several social dilemma matrix games. To apply SQLoss to games where cooperation and defection are determined by a sequence of non-trivial actions, we present GameDistill, an algorithm that reduces a multi-step game with visual input to a matrix game. We empirically show that agents trained with SQLoss on the GameDistill-reduced versions of Coin Game and StagHunt evolve optimal policies. Finally, we show that SQLoss extends to a 4-agent setting by demonstrating the emergence of cooperative behavior in the popular Braess' paradox.
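To make the social dilemma concrete, the following minimal sketch encodes a two-player Prisoner's Dilemma as a matrix game. The payoff values, the `PAYOFF` array, and the helper functions `best_response` and `joint_return` are illustrative assumptions, not taken from the paper, and the sketch does not implement SQLoss or GameDistill. It only shows why a purely self-interested agent is drawn to defection even though mutual cooperation pays more to each player and to the pair.

```python
import numpy as np

# Assumed, illustrative payoffs (not from the paper).
# Rows index the agent's own action, columns the opponent's action.
# Actions: 0 = cooperate, 1 = defect.
PAYOFF = np.array([[-1.0, -3.0],
                   [0.0, -2.0]])


def best_response(opponent_action: int) -> int:
    """Action that maximizes the agent's own payoff against a fixed opponent action."""
    return int(np.argmax(PAYOFF[:, opponent_action]))


def joint_return(a: int, b: int) -> float:
    """Sum of both players' payoffs when they play actions a and b."""
    return float(PAYOFF[a, b] + PAYOFF[b, a])


# Defection is the best response to either opponent action ...
print(best_response(0), best_response(1))
# ... yet mutual cooperation yields a higher joint return than mutual defection.
print(joint_return(0, 0), joint_return(1, 1))
```

Under these assumed payoffs, each player also earns more individually under mutual cooperation (`PAYOFF[0, 0]`) than under mutual defection (`PAYOFF[1, 1]`), which is the sense in which mutual defection is both individually and socially sub-optimal.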
Author Information
Pinkesh Badjatiya (Adobe)
Mausoom Sarkar (Adobe)
Nikaash Puri (Adobe)
Jayakumar Subramanian (Adobe Systems)
Abhishek Sinha (Stanford University)
Siddharth Singh (University of Maryland, College Park)
Balaji Krishnamurthy (Adobe Inc)
More from the Same Authors
- 2021 : What Ails One-Shot Image Segmentation: A Data Perspective
  Mayur Hemani · Abhinav Patel · Tejas Shimpi · Anirudha Ramesh · Balaji Krishnamurthy
- 2021 : Interpretable & Hierarchical Topic Models using Hyperbolic Geometry
  Simra Shahid · Tanay Anand · Nikaash Puri · Balaji Krishnamurthy
- 2022 : Trajectory-based Explainability Framework for Offline RL
  Shripad Deshmukh · Arpan Dasgupta · Chirag Agarwal · Nan Jiang · Balaji Krishnamurthy · Georgios Theocharous · Jayakumar Subramanian
- 2023 : AgentTorch: Agent-based Modeling with Automatic Differentiation
  Ayush Chopra · Jayakumar Subramanian · Balaji Krishnamurthy · Ramesh Raskar
- 2023 : Contextual Alchemy: A Framework for Enhanced Readability through Cross-Domain Entity Alignment
  Simra Shahid · Nikitha Srikanth · Surgan Jandial · Balaji Krishnamurthy
- 2021 Poster: D2C: Diffusion-Decoding Models for Few-Shot Conditional Generation
  Abhishek Sinha · Jiaming Song · Chenlin Meng · Stefano Ermon
- 2021 Poster: Medical Dead-ends and Learning to Identify High-Risk States and Treatments
  Mehdi Fatemi · Taylor Killian · Jayakumar Subramanian · Marzyeh Ghassemi