Timezone: »
Groups of humans are often able to find ways to cooperate with one another in complex, temporally extended social dilemmas. Models based on behavioral economics are only able to explain this phenomenon for unrealistic stateless matrix games. Recently, multi-agent reinforcement learning has been applied to generalize social dilemma problems to temporally and spatially extended Markov games. However, this has not yet generated an agent that learns to cooperate in social dilemmas as humans do. A key insight is that many, but not all, human individuals have inequity averse social preferences. This promotes a particular resolution of the matrix game social dilemma wherein inequity-averse individuals are personally pro-social and punish defectors. Here we extend this idea to Markov games and show that it promotes cooperation in several types of sequential social dilemma, via a profitable interaction with policy learnability. In particular, we find that inequity aversion improves temporal credit assignment for the important class of intertemporal social dilemmas. These results help explain how large-scale cooperation may emerge and persist.
Author Information
Edward Hughes (DeepMind)
Joel Leibo (DeepMind)
Matthew Phillips (DeepMind)
Karl Tuyls (DeepMind)
Edgar Dueñez-Guzman (DeepMind)
Antonio García Castañeda (DeepMind)
Iain Dunning (DeepMind)
Tina Zhu (DeepMind)
Kevin McKee (DeepMind)
Raphael Koster (DeepMind)
Heather Roff (DeepMind)
Thore Graepel (DeepMind)
More from the Same Authors
-
2021 Spotlight: Collaborating with Humans without Human Data »
DJ Strouse · Kevin McKee · Matt Botvinick · Edward Hughes · Richard Everett -
2021 : Hidden Agenda: a Social Deduction Game with Diverse Learned Equilibria »
Kavya Kopparapu · Edgar Dueñez-Guzman · Jayd Matyas · Alexander Vezhnevets · John Agapiou · Kevin McKee · Richard Everett · Janusz Marecki · Joel Leibo · Thore Graepel -
2022 : Human-AI Interaction in Selective Prediction Systems »
Elizabeth Bondi-Kelly · Raphael Koster · Hannah Sheahan · Martin Chadwick · Yoram Bachrach · Taylan Cemgil · Ulrich Paquet · Krishnamurthy Dvijotham -
2023 Competition: Melting Pot Contest »
Rakshit Trivedi · Akbir Khan · Jesse Clifton · Lewis Hammond · John Agapiou · Edgar Dueñez-Guzman · Jayd Matyas · Dylan Hadfield-Menell · Joel Leibo -
2022 : Advancing the participatory approach to AI in Mental Health »
Wilson Lee · Munmun De Choudhury · Morgan Scheuerman · Julia Hamer-Hunt · Dan Joyce · Nenad Tomasev · Kevin McKee · Shakir Mohamed · Danielle Belgrave · Christopher Burr -
2022 Workshop: Empowering Communities: A Participatory Approach to AI for Mental Health »
Andrey Kormilitzin · Dan Joyce · Nenad Tomasev · Kevin McKee -
2022 : Opening remarks and welcome »
Andrey Kormilitzin · Dan Joyce · Nenad Tomasev · Kevin McKee -
2022 Poster: Turbocharging Solution Concepts: Solving NEs, CEs and CCEs with Neural Equilibrium Solvers »
Luke Marris · Ian Gemp · Thomas Anthony · Andrea Tacchetti · Siqi Liu · Karl Tuyls -
2021 Workshop: Cooperative AI »
Natasha Jaques · Edward Hughes · Jakob Foerster · Noam Brown · Kalesha Bullard · Charlotte Smith -
2021 : Welcome and Opening Remarks »
Edward Hughes · Natasha Jaques -
2021 Poster: Collaborating with Humans without Human Data »
DJ Strouse · Kevin McKee · Matt Botvinick · Edward Hughes · Richard Everett -
2020 : Q&A: Open Problems in Cooperative AI with Thore Graepel (DeepMind), Allan Dafoe (University of Oxford), Yoram Bachrach (DeepMind), and Natasha Jaques (Google) [moderator] »
Thore Graepel · Yoram Bachrach · Allan Dafoe · Natasha Jaques -
2020 : Open Problems in Cooperative AI: Thore Graepel (DeepMind) and Allan Dafoe (University of Oxford) »
Thore Graepel · Allan Dafoe -
2020 Workshop: Cooperative AI »
Thore Graepel · Dario Amodei · Vincent Conitzer · Allan Dafoe · Gillian Hadfield · Eric Horvitz · Sarit Kraus · Kate Larson · Yoram Bachrach -
2020 Poster: Learning to Incentivize Other Learning Agents »
Jiachen Yang · Ang Li · Mehrdad Farajtabar · Peter Sunehag · Edward Hughes · Hongyuan Zha -
2020 Poster: Learning to Play No-Press Diplomacy with Best Response Policy Iteration »
Thomas Anthony · Tom Eccles · Andrea Tacchetti · János Kramár · Ian Gemp · Thomas Hudson · Nicolas Porcel · Marc Lanctot · Julien Perolat · Richard Everett · Satinder Singh · Thore Graepel · Yoram Bachrach -
2020 Spotlight: Learning to Play No-Press Diplomacy with Best Response Policy Iteration »
Thomas Anthony · Tom Eccles · Andrea Tacchetti · János Kramár · Ian Gemp · Thomas Hudson · Nicolas Porcel · Marc Lanctot · Julien Perolat · Richard Everett · Satinder Singh · Thore Graepel · Yoram Bachrach -
2020 Poster: Real World Games Look Like Spinning Tops »
Wojciech Czarnecki · Gauthier Gidel · Brendan Tracey · Karl Tuyls · Shayegan Omidshafiei · David Balduzzi · Max Jaderberg -
2019 Poster: Generalization of Reinforcement Learners with Working and Episodic Memory »
Meire Fortunato · Melissa Tan · Ryan Faulkner · Steven Hansen · Adrià Puigdomènech Badia · Gavin Buttimore · Charles Deck · Joel Leibo · Charles Blundell -
2019 Poster: Biases for Emergent Communication in Multi-agent Reinforcement Learning »
Tom Eccles · Yoram Bachrach · Guy Lever · Angeliki Lazaridou · Thore Graepel -
2019 Poster: Multiagent Evaluation under Incomplete Information »
Mark Rowland · Shayegan Omidshafiei · Karl Tuyls · Julien Perolat · Michal Valko · Georgios Piliouras · Remi Munos -
2019 Spotlight: Multiagent Evaluation under Incomplete Information »
Mark Rowland · Shayegan Omidshafiei · Karl Tuyls · Julien Perolat · Michal Valko · Georgios Piliouras · Remi Munos -
2019 Poster: Interval timing in deep reinforcement learning agents »
Ben Deverett · Ryan Faulkner · Meire Fortunato · Gregory Wayne · Joel Leibo -
2018 Poster: Actor-Critic Policy Optimization in Partially Observable Multiagent Environments »
Sriram Srinivasan · Marc Lanctot · Vinicius Zambaldi · Julien Perolat · Karl Tuyls · Remi Munos · Michael Bowling -
2018 Poster: Re-evaluating evaluation »
David Balduzzi · Karl Tuyls · Julien Perolat · Thore Graepel -
2017 Poster: A multi-agent reinforcement learning model of common-pool resource appropriation »
Julien Pérolat · Joel Leibo · Vinicius Zambaldi · Charles Beattie · Karl Tuyls · Thore Graepel -
2017 Poster: A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning »
Marc Lanctot · Vinicius Zambaldi · Audrunas Gruslys · Angeliki Lazaridou · Karl Tuyls · Julien Perolat · David Silver · Thore Graepel -
2016 : Concluding Remarks »
Thore Graepel · Frans Oliehoek · Karl Tuyls -
2016 : Introduction »
Thore Graepel · Karl Tuyls · Frans Oliehoek -
2016 Workshop: Learning, Inference and Control of Multi-Agent Systems »
Thore Graepel · Marc Lanctot · Joel Leibo · Guy Lever · Janusz Marecki · Frans Oliehoek · Karl Tuyls · Vicky Holgate -
2016 Poster: Using Fast Weights to Attend to the Recent Past »
Jimmy Ba · Geoffrey E Hinton · Volodymyr Mnih · Joel Leibo · Catalin Ionescu -
2016 Oral: Using Fast Weights to Attend to the Recent Past »
Jimmy Ba · Geoffrey E Hinton · Volodymyr Mnih · Joel Leibo · Catalin Ionescu