Multi-agent AI research promises a path to developing human-like and human-compatible intelligent technologies, complementing the solipsistic view of other approaches, which mostly do not consider interactions between agents. We propose a Cooperative AI contest based on the Melting Pot framework. At its core, Melting Pot provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios. Several benchmarks, challenges, and contests already aim to spur research on cooperation in multi-agent learning. Melting Pot expands and generalizes these previous efforts in several ways: (1) it focuses on mixed-motive games (as opposed to purely cooperative or competitive games); (2) it enables testing the generalizability of agent cooperation to previously unseen coplayers; (3) it consists of a suite of multiple environments rather than a single one; and (4) it includes games with larger numbers of players (> 7). These properties make it an accessible yet challenging framework for multi-agent AI research. For this contest, we invite multi-agent reinforcement learning solutions that drive cooperation between interacting agents in the Melting Pot environments and generalize to new situations beyond those seen in training. A scoring mechanism based on metrics representative of cooperative intelligence will be used to measure the success of submitted solutions. We believe that Melting Pot can serve as a clear benchmark to drive progress on Cooperative AI, as it focuses specifically on evaluating the social intelligence of both groups and individuals. As an overarching goal, we are excited to assess the implications of current definitions of cooperative intelligence for the resulting solution approaches, and to study the emergent behaviors of proposed solutions to inform future research directions in Cooperative AI.
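To make the evaluation protocol concrete, the sketch below illustrates the shape of the scoring loop: a submitted "focal" population is dropped into test scenarios alongside background coplayers it has never seen, and its focal per-capita return is averaged over episodes and scenarios. This is a minimal illustration only; the environment, policy, player counts, and episode length are hypothetical stubs, not the contest's actual API.

```python
# Minimal, self-contained sketch of Melting Pot-style scoring. The real
# contest evaluates submissions on Melting Pot substrates and scenarios;
# StubScenario and focal_policy below are illustrative stand-ins.
import random
from statistics import mean


class StubScenario:
    """Toy stand-in for a test scenario: 8 players, of which the first 5
    are 'focal' (controlled by the submission) and the rest are
    'background' coplayers unseen during training."""

    NUM_PLAYERS = 8
    FOCAL = range(5)
    EPISODE_LEN = 100

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self._t = 0

    def step(self, focal_actions):
        """Advance one timestep; return (per-player rewards, done)."""
        self._t += 1
        rewards = [self._rng.random() for _ in range(self.NUM_PLAYERS)]
        return rewards, self._t >= self.EPISODE_LEN


def focal_policy(num_focal=len(StubScenario.FOCAL)):
    """Random-action stand-in for a trained focal population."""
    return [random.randrange(8) for _ in range(num_focal)]


def evaluate(scenario_seeds, episodes_per_scenario=5):
    """Mean focal per-capita return, averaged over scenarios and episodes.
    Only rewards earned by focal players count toward the score."""
    per_scenario = []
    for seed in scenario_seeds:
        episode_returns = []
        for ep in range(episodes_per_scenario):
            env = StubScenario(seed * 1000 + ep)
            total, done = 0.0, False
            while not done:
                rewards, done = env.step(focal_policy())
                total += sum(rewards[i] for i in StubScenario.FOCAL)
            # Per-capita: divide by the number of focal players.
            episode_returns.append(total / len(StubScenario.FOCAL))
        per_scenario.append(mean(episode_returns))
    return mean(per_scenario)


if __name__ == "__main__":
    print(f"mean focal per-capita return: {evaluate([1, 2, 3]):.3f}")
```

Scoring per-capita rather than total return keeps results comparable across scenarios with different numbers of focal players.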
Schedule

Fri 6:35 a.m. - 6:45 a.m. | Welcome (Introduction)
Fri 9:45 a.m. - 10:00 a.m. | Team Social
Author Information
Rakshit Trivedi (Massachusetts Institute of Technology)
Akbir Khan (University College London)
Jesse Clifton (Center on Long-Term Risk)
Lewis Hammond (University of Oxford / Cooperative AI Foundation)
Acting Executive Director at the Cooperative AI Foundation and DPhil candidate at the University of Oxford. Interested broadly in safety in multi-agent systems, especially cooperation problems.
John Agapiou (Google DeepMind)
Edgar Dueñez-Guzman (Google DeepMind)
Jayd Matyas (DeepMind)
Dylan Hadfield-Menell (MIT)
Joel Leibo (DeepMind)
More from the Same Authors
- 2021: Hidden Agenda: a Social Deduction Game with Diverse Learned Equilibria
  Kavya Kopparapu · Edgar Dueñez-Guzman · Jayd Matyas · Alexander Vezhnevets · John Agapiou · Kevin McKee · Richard Everett · Janusz Marecki · Joel Leibo · Thore Graepel
- 2022: How to talk so AI will learn: instructions, descriptions, and pragmatics
  Theodore Sumers · Robert Hawkins · Mark Ho · Tom Griffiths · Dylan Hadfield-Menell
- 2022: MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning
  Mikayel Samvelyan · Akbir Khan · Michael Dennis · Minqi Jiang · Jack Parker-Holder · Jakob Foerster · Roberta Raileanu · Tim Rocktäschel
- 2022: All’s Well That Ends Well: Avoiding Side Effects with Distance-Impact Penalties
  Charlie Griffin · Joar Skalse · Lewis Hammond · Alessandro Abate
- 2022: Diagnostics for Deep Neural Networks with Automated Copy/Paste Attacks
  Stephen Casper · Kaivalya Hariharan · Dylan Hadfield-Menell
- 2023: Defining and Mitigating Collusion in Multi-Agent Systems
  Jack Foxabbott · Sam Deverett · Kaspar Senft · Samuel Dower · Lewis Hammond
- 2023: Leading the Pack: N-player Opponent Shaping
  Alexandra Souly · Timon Willi · Akbir Khan · Robert Kirk · Chris Lu · Edward Grefenstette · Tim Rocktäschel
- 2023: Doing the right thing for the right reason: Evaluating artificial moral cognition by probing cost insensitivity
  Yiran Mao · Madeline G. Reinecke · Markus Kunesch · Edgar Duéñez-Guzmán · Ramona Comanescu · Julia Haas · Joel Leibo
- 2023: The Long-Term Effects of Personalization: Evidence from Youtube
  Andreas Haupt · Mihaela Curmei · François-Marie de Jouvencel · Marc Faddoul · Benjamin Recht · Dylan Hadfield-Menell
- 2023: Welfare Diplomacy: Benchmarking Language Model Cooperation
  Gabe Mukobi · Hannah Erlebach · Niklas Lauffer · Lewis Hammond · Alan Chan · Jesse Clifton
- 2023: Understanding Hidden Context in Preference Learning: Consequences for RLHF
  Anand Siththaranajn · Cassidy Laidlaw · Dylan Hadfield-Menell
- 2023: JaxMARL: Multi-Agent RL Environments in JAX
  Alexander Rutherford · Benjamin Ellis · Matteo Gallici · Jonathan Cook · Andrei Lupu · Garðar Ingvarsson · Timon Willi · Akbir Khan · Christian Schroeder de Witt · Alexandra Souly · Saptarashmi Bandyopadhyay · Mikayel Samvelyan · Minqi Jiang · Robert Lange · Shimon Whiteson · Bruno Lacerda · Nick Hawes · Tim Rocktäschel · Chris Lu · Jakob Foerster
- 2023: TBA
  Lewis Hammond
- 2023: Mitigating Generative Agent Social Dilemmas
  Julian Yocum · Phillip Christoffersen · Mehul Damani · Justin Svegliato · Dylan Hadfield-Menell · Stuart J Russell
- 2023 Poster: Red Teaming Deep Neural Networks with Feature Synthesis Tools
  Stephen Casper · Tong Bu · Yuxiao Li · Jiawei Li · Kevin Zhang · Kaivalya Hariharan · Dylan Hadfield-Menell
- 2023 Poster: The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs
  Laura Ruis · Akbir Khan · Stella Biderman · Sara Hooker · Tim Rocktäschel · Edward Grefenstette
- 2022 Poster: Robust Feature-Level Adversaries are Interpretability Tools
  Stephen Casper · Max Nadeau · Dylan Hadfield-Menell · Gabriel Kreiman
- 2022 Poster: How to talk so AI will learn: Instructions, descriptions, and autonomy
  Theodore Sumers · Robert Hawkins · Mark Ho · Tom Griffiths · Dylan Hadfield-Menell
- 2019 Poster: Generalization of Reinforcement Learners with Working and Episodic Memory
  Meire Fortunato · Melissa Tan · Ryan Faulkner · Steven Hansen · Adrià Puigdomènech Badia · Gavin Buttimore · Charles Deck · Joel Leibo · Charles Blundell
- 2019 Poster: Interval timing in deep reinforcement learning agents
  Ben Deverett · Ryan Faulkner · Meire Fortunato · Gregory Wayne · Joel Leibo
- 2018 Poster: Inequity aversion improves cooperation in intertemporal social dilemmas
  Edward Hughes · Joel Leibo · Matthew Phillips · Karl Tuyls · Edgar Dueñez-Guzman · Antonio García Castañeda · Iain Dunning · Tina Zhu · Kevin McKee · Raphael Koster · Heather Roff · Thore Graepel
- 2017 Poster: A multi-agent reinforcement learning model of common-pool resource appropriation
  Julien Pérolat · Joel Leibo · Vinicius Zambaldi · Charles Beattie · Karl Tuyls · Thore Graepel
- 2016 Poster: Using Fast Weights to Attend to the Recent Past
  Jimmy Ba · Geoffrey E Hinton · Volodymyr Mnih · Joel Leibo · Catalin Ionescu
- 2016 Oral: Using Fast Weights to Attend to the Recent Past
  Jimmy Ba · Geoffrey E Hinton · Volodymyr Mnih · Joel Leibo · Catalin Ionescu
- 2016 Poster: Strategic Attentive Writer for Learning Macro-Actions
  Alexander (Sasha) Vezhnevets · Volodymyr Mnih · Simon Osindero · Alex Graves · Oriol Vinyals · John Agapiou · koray kavukcuoglu