Timezone: »
We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility. In these environments, agents must learn communication protocols in order to share information that is needed to solve the tasks. By embracing deep neural networks, we are able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability. We propose two approaches for learning in these domains: Reinforced Inter-Agent Learning (RIAL) and Differentiable Inter-Agent Learning (DIAL). The former uses deep Q-learning, while the latter exploits the fact that, during learning, agents can backpropagate error derivatives through (noisy) communication channels. Hence, this approach uses centralised learning but decentralised execution. Our experiments introduce new environments for studying the learning of communication protocols and present a set of engineering innovations that are essential for success in these domains.
Author Information
Jakob Foerster (University of Oxford)
Jakob Foerster received a CIFAR AI chair in 2019 and is starting as an Assistant Professor at the University of Toronto and the Vector Institute in the academic year 20/21. During his PhD at the University of Oxford, he helped bring deep multi-agent reinforcement learning to the forefront of AI research and interned at Google Brain, OpenAI, and DeepMind. He has since been working as a research scientist at Facebook AI Research in California, where he will continue advancing the field up to his move to Toronto. He was the lead organizer of the first Emergent Communication (EmeCom) workshop at NeurIPS in 2017, which he has helped organize ever since.
Yannis Assael (University of Oxford)
Nando de Freitas (University of Oxford)
Shimon Whiteson (University of Oxford)
More from the Same Authors
-
2021 Spotlight: Bayesian Bellman Operators »
Mattie Fellows · Kristian Hartikainen · Shimon Whiteson -
2021 : Grounding Aleatoric Uncertainty in Unsupervised Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Andrei Lupu · Heinrich Kuttler · Edward Grefenstette · Tim Rocktäschel · Jakob Foerster -
2021 : No DICE: An Investigation of the Bias-Variance Tradeoff in Meta-Gradients »
Risto Vuorio · Jacob Beck · Greg Farquhar · Jakob Foerster · Shimon Whiteson -
2021 : That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities »
Jack Parker-Holder · Minqi Jiang · Michael Dennis · Mikayel Samvelyan · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2021 : On the Practical Consistency of Meta-Reinforcement Learning Algorithms »
Zheng Xiong · Luisa Zintgraf · Jacob Beck · Risto Vuorio · Shimon Whiteson -
2021 : Model based multi-agent reinforcement learning with tensor decompositions »
Pascal van der Vaart · Anuj Mahajan · Shimon Whiteson -
2021 : Reinforcement Learning in Factored Action Spaces using Tensor Decompositions »
Anuj Mahajan · Mikayel Samvelyan · Lei Mao · Viktor Makoviichuk · Animesh Garg · Jean Kossaifi · Shimon Whiteson · Yuke Zhu · Anima Anandkumar -
2021 : A Fine-Tuning Approach to Belief State Modeling »
Samuel Sokota · Hengyuan Hu · David Wu · Jakob Foerster · Noam Brown -
2021 : Generalized Belief Learning in Multi-Agent Settings »
Darius Muglich · Luisa Zintgraf · Christian Schroeder de Witt · Shimon Whiteson · Jakob Foerster -
2022 : Adversarial Cheap Talk »
Chris Lu · Timon Willi · Alistair Letcher · Jakob Foerster -
2022 : Human-AI Coordination via Human-Regularized Search and Learning »
Hengyuan Hu · David Wu · Adam Lerer · Jakob Foerster · Noam Brown -
2022 : Adversarial Cheap Talk »
Chris Lu · Timon Willi · Alistair Letcher · Jakob Foerster -
2022 : MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning »
Mikayel Samvelyan · Akbir Khan · Michael Dennis · Minqi Jiang · Jack Parker-Holder · Jakob Foerster · Roberta Raileanu · Tim Rocktäschel -
2023 Poster: Recurrent Hypernetworks are Surprisingly SOTA in Meta-RL »
Jacob Beck · Risto Vuorio · Zheng Xiong · Shimon Whiteson -
2023 Poster: Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design »
Matthew T Jackson · Minqi Jiang · Jack Parker-Holder · Risto Vuorio · Chris Lu · Greg Farquhar · Shimon Whiteson · Jakob Foerster -
2023 Poster: Similarity-based cooperative equilibrium »
Caspar Oesterheld · Johannes Treutlein · Roger Grosse · Vincent Conitzer · Jakob Foerster -
2023 Poster: Structured State Space Models for In-Context Reinforcement Learning »
Chris Lu · Yannick Schroecker · Albert Gu · Emilio Parisotto · Jakob Foerster · Satinder Singh · Feryal Behbahani -
2023 Poster: SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning »
Benjamin Ellis · Jonathan Cook · Skander Moalla · Mikayel Samvelyan · Mingfei Sun · Anuj Mahajan · Jakob Foerster · Shimon Whiteson -
2023 Poster: The Waymo Open Sim Agents Challenge »
Nico Montali · John Lambert · Paul Mougin · Alex Kuefler · Nicholas Rhinehart · Michelle Li · Cole Gulino · Tristan Emrich · Zoey Yang · Shimon Whiteson · Brandyn White · Dragomir Anguelov -
2023 Workshop: Socially Responsible Language Modelling Research (SoLaR) »
Usman Anwar · David Krueger · Samuel Bowman · Jakob Foerster · Su Lin Blodgett · Roberta Raileanu · Alan Chan · Katherine Lee · Laura Ruis · Robert Kirk · Yawen Duan · Xin Chen · Kawin Ethayarajh -
2022 : Jakob Foerster »
Jakob Foerster -
2022 Poster: Proximal Learning With Opponent-Learning Awareness »
Stephen Zhao · Chris Lu · Roger Grosse · Jakob Foerster -
2022 Poster: Nocturne: a scalable driving benchmark for bringing multi-agent learning one step closer to the real world »
Eugene Vinitsky · Nathan Lichtlé · Xiaomeng Yang · Brandon Amos · Jakob Foerster -
2022 Poster: In Defense of the Unitary Scalarization for Deep Multi-Task Learning »
Vitaly Kurin · Alessandro De Palma · Ilya Kostrikov · Shimon Whiteson · Pawan K Mudigonda -
2022 Poster: Grounding Aleatoric Uncertainty for Unsupervised Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Andrei Lupu · Heinrich Küttler · Edward Grefenstette · Tim Rocktäschel · Jakob Foerster -
2022 Poster: Off-Team Learning »
Brandon Cui · Hengyuan Hu · Andrei Lupu · Samuel Sokota · Jakob Foerster -
2022 Poster: Self-Explaining Deviations for Coordination »
Hengyuan Hu · Samuel Sokota · David Wu · Anton Bakhtin · Andrei Lupu · Brandon Cui · Jakob Foerster -
2022 Poster: Truncated Emphatic Temporal Difference Methods for Prediction and Control »
Shangtong Zhang · Shimon Whiteson -
2022 Poster: Discovered Policy Optimisation »
Chris Lu · Jakub Kuba · Alistair Letcher · Luke Metz · Christian Schroeder de Witt · Jakob Foerster -
2022 Poster: Influencing Long-Term Behavior in Multiagent Reinforcement Learning »
Dong-Ki Kim · Matthew Riemer · Miao Liu · Jakob Foerster · Michael Everett · Chuangchuang Sun · Gerald Tesauro · Jonathan How -
2022 Poster: Equivariant Networks for Zero-Shot Coordination »
Darius Muglich · Christian Schroeder de Witt · Elise van der Pol · Shimon Whiteson · Jakob Foerster -
2021 : Reinforcement Learning in Factored Action Spaces using Tensor Decompositions »
Anuj Mahajan · Mikayel Samvelyan · Lei Mao · Viktor Makoviichuk · Animesh Garg · Jean Kossaifi · Shimon Whiteson · Yuke Zhu · Anima Anandkumar -
2021 : Model based multi-agent reinforcement learning with tensor decompositions »
Pascal van der Vaart · Anuj Mahajan · Shimon Whiteson -
2021 Workshop: Cooperative AI »
Natasha Jaques · Edward Hughes · Jakob Foerster · Noam Brown · Kalesha Bullard · Charlotte Smith -
2021 Poster: FACMAC: Factored Multi-Agent Centralised Policy Gradients »
Bei Peng · Tabish Rashid · Christian Schroeder de Witt · Pierre-Alexandre Kamienny · Philip Torr · Wendelin Boehmer · Shimon Whiteson -
2021 Poster: Replay-Guided Adversarial Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2021 Poster: Bayesian Bellman Operators »
Mattie Fellows · Kristian Hartikainen · Shimon Whiteson -
2021 Poster: Regularized Softmax Deep Multi-Agent Q-Learning »
Ling Pan · Tabish Rashid · Bei Peng · Longbo Huang · Shimon Whiteson -
2021 Poster: K-level Reasoning for Zero-Shot Coordination in Hanabi »
Brandon Cui · Hengyuan Hu · Luis Pineda · Jakob Foerster -
2021 Poster: Neural Pseudo-Label Optimism for the Bank Loan Problem »
Aldo Pacchiano · Shaun Singh · Edward Chou · Alex Berg · Jakob Foerster -
2021 Poster: Snowflake: Scaling GNNs to high-dimensional continuous control via parameter freezing »
Charles Blake · Vitaly Kurin · Maximilian Igl · Shimon Whiteson -
2020 Workshop: Talking to Strangers: Zero-Shot Emergent Communication »
Marie Ossenkopf · Angelos Filos · Abhinav Gupta · Michael Noukhovitch · Angeliki Lazaridou · Jakob Foerster · Kalesha Bullard · Rahma Chaabouni · Eugene Kharitonov · Roberto Dessì -
2020 Poster: Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian »
Jack Parker-Holder · Luke Metz · Cinjon Resnick · Hengyuan Hu · Adam Lerer · Alistair Letcher · Alexander Peysakhovich · Aldo Pacchiano · Jakob Foerster -
2020 Poster: Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning »
Tabish Rashid · Gregory Farquhar · Bei Peng · Shimon Whiteson -
2020 Poster: Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? »
Vitaly Kurin · Saad Godil · Shimon Whiteson · Bryan Catanzaro -
2020 Poster: Learning Retrospective Knowledge with Reverse Reinforcement Learning »
Shangtong Zhang · Vivek Veeriah · Shimon Whiteson -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Presentations »
Rahul Mehta · Andrew Lampinen · Binghong Chen · Sergio Pascual-Diaz · Jordi Grau-Moya · Aldo Faisal · Jonathan Tompson · Yiren Lu · Khimya Khetarpal · Martin Klissarov · Pierre-Luc Bacon · Doina Precup · Thanard Kurutach · Aviv Tamar · Pieter Abbeel · Jinke He · Maximilian Igl · Shimon Whiteson · Wendelin Boehmer · Raphaël Marinier · Olivier Pietquin · Karol Hausman · Sergey Levine · Chelsea Finn · Tianhe Yu · Lisa Lee · Benjamin Eysenbach · Emilio Parisotto · Eric Xing · Ruslan Salakhutdinov · Hongyu Ren · Anima Anandkumar · Deepak Pathak · Christopher Lu · Trevor Darrell · Alexei Efros · Phillip Isola · Feng Liu · Bo Han · Gang Niu · Masashi Sugiyama · Saurabh Kumar · Janith Petangoda · Johan Ferret · James McClelland · Kara Liu · Animesh Garg · Robert Lange -
2019 : Bayes-Adaptive Deep Reinforcement Learning via Meta-Learning - Invited Talk »
Shimon Whiteson -
2019 Workshop: Emergent Communication: Towards Natural Language »
Abhinav Gupta · Michael Noukhovitch · Cinjon Resnick · Natasha Jaques · Angelos Filos · Marie Ossenkopf · Angeliki Lazaridou · Jakob Foerster · Ryan Lowe · Douwe Kiela · Kyunghyun Cho -
2019 Poster: MAVEN: Multi-Agent Variational Exploration »
Anuj Mahajan · Tabish Rashid · Mikayel Samvelyan · Shimon Whiteson -
2019 Poster: Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning »
Gregory Farquhar · Shimon Whiteson · Jakob Foerster -
2019 Poster: Multi-Agent Common Knowledge Reinforcement Learning »
Christian Schroeder de Witt · Jakob Foerster · Gregory Farquhar · Philip Torr · Wendelin Boehmer · Shimon Whiteson -
2019 Poster: DAC: The Double Actor-Critic Architecture for Learning Options »
Shangtong Zhang · Shimon Whiteson -
2019 Poster: Fast Efficient Hyperparameter Tuning for Policy Gradient Methods »
Supratik Paul · Vitaly Kurin · Shimon Whiteson -
2019 Poster: VIREL: A Variational Inference Framework for Reinforcement Learning »
Mattie Fellows · Anuj Mahajan · Tim G. J. Rudner · Shimon Whiteson -
2019 Spotlight: VIREL: A Variational Inference Framework for Reinforcement Learning »
Mattie Fellows · Anuj Mahajan · Tim G. J. Rudner · Shimon Whiteson -
2019 Poster: Generalized Off-Policy Actor-Critic »
Shangtong Zhang · Wendelin Boehmer · Shimon Whiteson -
2018 Workshop: Emergent Communication Workshop »
Jakob Foerster · Angeliki Lazaridou · Ryan Lowe · Igor Mordatch · Douwe Kiela · Kyunghyun Cho -
2017 Workshop: Emergent Communication Workshop »
Jakob Foerster · Igor Mordatch · Angeliki Lazaridou · Kyunghyun Cho · Douwe Kiela · Pieter Abbeel -
2017 Poster: Dynamic-Depth Context Tree Weighting »
Joao V Messias · Shimon Whiteson -
2017 Poster: Cortical microcircuits as gated-recurrent neural networks »
Rui Costa · Yannis Assael · Brendan Shillingford · Nando de Freitas · TIm Vogels -
2016 : Learning to Communicate with Deep Multi−Agent Reinforcement Learning »
Shimon Whiteson -
2015 Poster: Copeland Dueling Bandits »
Masrour Zoghi · Zohar Karnin · Shimon Whiteson · Maarten de Rijke -
2014 Poster: Distributed Parameter Estimation in Probabilistic Graphical Models »
Yariv D Mizrahi · Misha Denil · Nando de Freitas -
2013 Workshop: Bayesian Optimization in Theory and Practice »
Matthew Hoffman · Jasper Snoek · Nando de Freitas · Michael A Osborne · Ryan Adams · Sebastien Bubeck · Philipp Hennig · Remi Munos · Andreas Krause -
2013 Workshop: Deep Learning »
Yoshua Bengio · Hugo Larochelle · Russ Salakhutdinov · Tomas Mikolov · Matthew D Zeiler · David Mcallester · Nando de Freitas · Josh Tenenbaum · Jian Zhou · Volodymyr Mnih -
2011 Workshop: Bayesian optimization, experimental design and bandits: Theory and applications »
Nando de Freitas · Roman Garnett · Frank R Hutter · Michael A Osborne -
2010 Session: Spotlights Session 10 »
Nando de Freitas -
2010 Session: Oral Session 12 »
Nando de Freitas -
2009 Workshop: Adaptive Sensing, Active Learning, and Experimental Design »
Rui M Castro · Nando de Freitas · Ruben Martinez-Cantin -
2009 Tutorial: Sequential Monte-Carlo Methods »
Arnaud Doucet · Nando de Freitas -
2008 Poster: An interior-point stochastic approximation method and an L1-regularized delta rule »
Peter Carbonetto · Mark Schmidt · Nando de Freitas -
2008 Oral: An interior-point stochastic approximation method and an L1-regularized delta rule »
Peter Carbonetto · Mark Schmidt · Nando de Freitas -
2008 Demonstration: Worio: A Web-Scale Machine Learning System »
Nando de Freitas · Ali Davar -
2007 Spotlight: Bayesian Policy Learning with Trans-Dimensional MCMC »
Matthew Hoffman · Arnaud Doucet · Nando de Freitas · Ajay Jasra -
2007 Poster: Bayesian Policy Learning with Trans-Dimensional MCMC »
Matthew Hoffman · Arnaud Doucet · Nando de Freitas · Ajay Jasra -
2007 Poster: Active Preference Learning with Discrete Choice Data »
Eric Brochu · Nando de Freitas · Abhijeet Ghosh -
2006 Poster: Conditional mean field »
Peter Carbonetto · Nando de Freitas