
Learning, Inference and Control of Multi-Agent Systems
Thore Graepel · Marc Lanctot · Joel Leibo · Guy Lever · Janusz Marecki · Frans Oliehoek · Karl Tuyls · Vicky Holgate

Thu Dec 08 11:00 PM -- 09:30 AM (PST) @ Room 133 + 134
Event URL: https://sites.google.com/site/malicnips2016/

We live in a multi-agent world, and to succeed in it, agents, artificially intelligent agents in particular, will need to learn to take the agency of others into account. They will need to compete in marketplaces, cooperate in teams, communicate with others, coordinate their plans, and negotiate outcomes. Examples include self-driving cars interacting in traffic, personal assistants acting on behalf of humans and negotiating with other agents, swarms of unmanned aerial vehicles, financial trading systems, robotic teams, and household robots.

Furthermore, the evolution of human intelligence itself presumably depended on interaction among human agents, possibly starting out with confrontational scavenging [1] and culminating in the evolution of culture, societies, and language. Learning from other agents is a key feature of human intelligence and an important field of research in machine learning [2]. It is therefore conceivable that exposing learning AI agents to multi-agent situations is necessary for their development towards intelligence.

We can also think of multi-agent systems as a design philosophy for complex systems. We can analyse complex systems in terms of agents at multiple scales. For example, we can view the system of world politics as an interaction of nation-state agents, nation states as an interaction of organizations, and so on down to departments, people, etc. Conversely, when designing systems we can think of agents as building blocks or modules interacting to produce the behaviour of the system, e.g. [3].

Multi-agent systems can have desirable properties such as robustness and scalability, but their design requires careful consideration of incentive structures, learning, and communication. In the most extreme case, agents with individual views of the world, individual actuators, and individual incentive structures need to coordinate to achieve a common goal. To succeed they may need a Theory of Mind that allows them to reason about other agents’ intentions, beliefs, and behaviours [4]. When multiple learning agents are interacting, the learning problem from each agent’s perspective may become non-stationary, non-Markovian, and only partially observable. Studying the dynamics of learning algorithms could lead to better insight about the evolution and stability of such systems [5].
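The non-stationarity of interacting learners can be made concrete with the evolutionary dynamics surveyed in [5]. The following sketch (illustrative only; the step size and starting points are our own choices, not taken from any cited paper) integrates the two-population replicator dynamics for matching pennies, a game whose only Nash equilibrium is mixed, so learning trajectories cycle around it instead of settling:

```python
import numpy as np

# Matching pennies: the row player wants to match, the column player to mismatch.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # row player's payoff matrix
B = -A                                      # zero-sum: column player's payoffs

def replicator_step(x, y, dt=0.01):
    """One Euler step of the two-population replicator dynamics."""
    fx = A @ y                  # fitness of each row strategy vs. the mix y
    fy = B.T @ x                # fitness of each column strategy vs. the mix x
    x = x + dt * x * (fx - x @ fx)   # grow strategies that beat the average
    y = y + dt * y * (fy - y @ fy)
    return x, y

x = np.array([0.9, 0.1])        # start far from the mixed equilibrium
y = np.array([0.2, 0.8])
traj = []
for _ in range(5000):
    x, y = replicator_step(x, y)
    traj.append(x[0])           # probability the row player plays "heads"
```

Plotting traj shows persistent oscillation around 0.5: each player's best strategy keeps shifting because the opponent is also adapting, which is exactly the non-stationarity such an analysis is meant to expose.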

Problems involving competing or cooperating agents feature in recent AI breakthroughs in competitive games [6,7], current ambitions of AI such as robotic football teams [8], and new research into emergent language and agent communication in reinforcement learning [9,10].

In summary, multi-agent learning will be of crucial importance to the future of computational intelligence and poses difficult and fascinating problems that need to be addressed across disciplines. The paradigm shift from single-agent to multi-agent systems will be pervasive and will require efforts across different fields including machine learning, cognitive science, robotics, natural computing, and (evolutionary) game theory. In this workshop we aim to bring together researchers from these fields to discuss the current state of the art, future avenues, and visions for work on the theory and practice of multi-agent learning, inference, and decision-making.

Topics we consider for inclusion in the workshop include multi-agent reinforcement learning; deep multi-agent learning; theory of mind; multi-agent communication; POMDPs, Dec-POMDPs, and partially observable stochastic games; multi-agent robotics, human-robot collaboration, and swarm robotics; game theory, mechanism design, and algorithms for computing Nash equilibria and other solution concepts; bio-inspired approaches, swarm intelligence, and collective intelligence; co-evolution, evolutionary dynamics, and culture; and ad hoc teamwork.
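As a small illustration of the game-theoretic learning topics above, the sketch below runs fictitious play, in which each player best-responds to the opponent's empirical action frequencies, on rock-paper-scissors. In two-player zero-sum games the empirical frequencies are known to converge to a Nash equilibrium, here the uniform mixed strategy; the initialization and iteration count are our own illustrative choices:

```python
import numpy as np

# Rock-paper-scissors payoffs from the row player's view.
# Indices: 0 = rock, 1 = paper, 2 = scissors.
A = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]], dtype=float)

def fictitious_play(steps=50000):
    """Both players simultaneously best-respond to the opponent's
    empirical action counts, then record their chosen actions."""
    counts = [np.ones(3), np.ones(3)]       # smoothed initial action counts
    for _ in range(steps):
        # Best response against the opponent's empirical mixed strategy.
        a0 = np.argmax(A @ (counts[1] / counts[1].sum()))
        a1 = np.argmax((-A).T @ (counts[0] / counts[0].sum()))
        counts[0][a0] += 1
        counts[1][a1] += 1
    return counts[0] / counts[0].sum(), counts[1] / counts[1].sum()

freq0, freq1 = fictitious_play()
```

After enough iterations, freq0 and freq1 both approach (1/3, 1/3, 1/3), the unique equilibrium of the game, even though each individual round is a pure best response.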

[1] 'Confrontational scavenging as a possible source for language and cooperation', Derek Bickerton and Eörs Szathmáry, BMC Evolutionary Biology 2011
[2] 'Apprenticeship Learning via Inverse Reinforcement Learning', Pieter Abbeel and Andrew Y. Ng, ICML 2004
[3] 'The Society of Mind', Marvin Minsky, 1986
[4] 'Building Machines That Learn and Think Like People', Brenden M. Lake et al., CBMM Memo 2016
[5] 'Evolutionary Dynamics of Multi-Agent Learning: A Survey', Daan Bloembergen et al., JAIR 2015
[6] 'Mastering the game of Go with deep neural networks and tree search', David Silver et al., Nature 2016
[7] 'Heads-up limit hold'em poker is solved', Michael Bowling et al., Science 2015
[8] RoboCup, http://www.robocup.org/
[9] 'Learning to Communicate with Deep Multi-Agent Reinforcement Learning', Jakob N. Foerster et al., arXiv 2016
[10] 'Learning Multiagent Communication with Backpropagation', Sainbayar Sukhbaatar et al., arXiv 2016

Thu 11:30 p.m. - 11:50 p.m.
Introduction (Talk)
Thore Graepel, Karl Tuyls, Frans Oliehoek
Thu 11:50 p.m. - 12:40 a.m.
Learning to Communicate with Deep Multi-Agent Reinforcement Learning (Talk)
Shimon Whiteson
Fri 12:40 a.m. - 1:30 a.m.
Computer Curling: AI in Sports Analytics (Talk)
Michael Bowling
Fri 2:00 a.m. - 2:50 a.m.
Reverse engineering human cooperation (or, How to build machines that treat people like people) (Talk)
Josh Tenenbaum, Max Kleiman-Weiner
Fri 2:50 a.m. - 3:40 a.m.
Spotlight Session (Spotlight)
Fri 3:40 a.m. - 4:30 a.m.
Lunch (Break)
Fri 4:30 a.m. - 5:10 a.m.
Poster Session
Fri 5:10 a.m. - 6:00 a.m.
Multi-Agent and Multi-Robot Coordination with Uncertainty and Limited Communication (Talk)
Fri 6:00 a.m. - 6:30 a.m.
Coffee Break (Break)
Fri 6:30 a.m. - 6:50 a.m.
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving (Talk)
Fri 6:50 a.m. - 7:10 a.m.
A Study of Value Iteration with Non-Stationary Strategies in General Sum Markov Games (Talk)
Julien Pérolat
Fri 7:10 a.m. - 7:30 a.m.
Learning to Assemble Objects with Robot Swarms (Talk)
Gerhard Neumann
Fri 7:30 a.m. - 7:50 a.m.
Fri 7:50 a.m. - 8:40 a.m.

While a single, small robot is limited in the complexity of tasks it can perform, large groups, or "swarms", of such robots have much greater potential. Physically, they can collaborate to move heavier objects, cross gaps larger than a single robot's body length, or explore unknown areas much more quickly. Mentally, they can take in and process far more information than a single robot could, even if communication is extremely limited. The NIPS 2016 workshop on multi-agent systems suggests that true artificial intelligence can only be reached by having agents interact with each other, and it is well known that groups of robots potentially have a much larger collective learning capacity than individual animals or humans.

So why are we not yet seeing many such robotic swarms in the real world, or even in academia? In my talk I will discuss the challenges of making an autonomous swarm of tiny drones explore an unknown building. These drones weigh less than 50 grams and have to fly around, avoid obstacles, navigate, and work together for the most efficient exploration. I will highlight how complex these various challenges are and report on a specific study in which we have drones use their Bluetooth modules to avoid each other should they find themselves in the same small indoor space. This case study will illustrate what are, in my eyes, the major challenges on the way to the promised autonomous robotic swarms.

Guido de Croon
Fri 8:40 a.m. - 9:20 a.m.
Discussion Panel
Fri 9:20 a.m. - 9:30 a.m.
Concluding Remarks (Talk)
Thore Graepel, Frans Oliehoek, Karl Tuyls

Author Information

Thore Graepel (DeepMind)
Marc Lanctot (DeepMind)
Joel Leibo (Google DeepMind)
Guy Lever (UCL)
Janusz Marecki (DeepMind)
Frans Oliehoek (Delft University of Technology)
Karl Tuyls (University of Liverpool)
Vicky Holgate (DeepMind)
