As machine learning is applied to real-world problems such as robotics, autonomous-vehicle control, drones, and recommendation systems, it becomes essential to consider agency: multiple agents with local observations affect one another and interact to achieve their goals. Multi-agent reinforcement learning (MARL) is concerned with developing learning algorithms that discover effective policies in such multi-agent environments. In this work, we develop algorithms that address two critical challenges in MARL: non-stationarity and robustness. We show that naive independent reinforcement learning does not preserve the strategic, game-theoretic interaction between the agents, and we present a way to realize classical infinite-order recursive reasoning in a reinforcement learning setting. We refer to this framework as Interactive Policy Optimization (IPO) and derive four MARL algorithms, based on centralized training with decentralized execution, that generalize the widely used single-agent policy gradient methods to multi-agent settings. Finally, we provide a maximum-likelihood method to estimate an opponent's parameters in adversarial settings, and we integrate IPO with an adversarial learning framework to train agents that are robust to destabilizing disturbances from the environment or adversaries and that transfer better (sim2real) from simulated multi-agent environments to the real world.
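The opponent-parameter estimation mentioned above can be illustrated with a generic maximum-likelihood sketch. This is not the paper's actual IPO implementation; it assumes, purely for illustration, that the opponent follows a tabular softmax policy, and fits that policy's parameters to observed state-action pairs by gradient ascent on the log-likelihood. All names and settings below are illustrative assumptions.

```python
import numpy as np

# Illustrative setup (not from the paper): a small discrete environment
# where the opponent plays a softmax policy over n_actions in each state.
rng = np.random.default_rng(0)
n_states, n_actions = 4, 3

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Ground-truth opponent parameters (unknown to the learner).
theta_true = rng.normal(size=(n_states, n_actions))

# Observed interaction data: states visited and opponent actions taken.
states = rng.integers(0, n_states, size=2000)
probs_true = softmax(theta_true[states])
actions = np.array([rng.choice(n_actions, p=p) for p in probs_true])

# Maximum likelihood: gradient ascent on the log-likelihood of the
# observed opponent actions under the softmax model.
theta_hat = np.zeros((n_states, n_actions))
lr = 2.0
for _ in range(500):
    probs = softmax(theta_hat[states])
    # d log pi(a|s) / d theta_s = onehot(a) - pi(.|s); accumulate per state.
    onehot = np.eye(n_actions)[actions]
    grad = np.zeros_like(theta_hat)
    np.add.at(grad, states, onehot - probs)
    theta_hat += lr * grad / len(states)

# The recovered policy should approach the true opponent policy,
# up to sampling noise in the observed data.
err = np.abs(softmax(theta_hat) - softmax(theta_true)).max()
print(f"max policy error: {err:.3f}")
```

With enough observed state-action pairs, the fitted softmax policy converges to the empirical action frequencies in each state, so the estimate's accuracy is limited mainly by sampling noise.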
Author Information
Videh Nema (National Institute of Technology Karnataka, Surathkal)
Balaraman Ravindran (Indian Institute of Technology Madras)
More from the Same Authors
- 2021 : Deep RePReL--Combining Planning and Deep RL for acting in relational domains »
  Harsha Kokel · Arjun Manoharan · Sriraam Natarajan · Balaraman Ravindran · Prasad Tadepalli
- 2021 : Interactive Robust Policy Optimization for Multi-Agent Reinforcement Learning »
  Videh Nema · Balaraman Ravindran
- 2022 : Guiding Offline Reinforcement Learning Using a Safety Expert »
  Richa Verma · Kartik Bharadwaj · Harshad Khadilkar · Balaraman Ravindran
- 2022 : Lagrangian Model Based Reinforcement Learning »
  Adithya Ramesh · Balaraman Ravindran
- 2021 : Matching options to tasks using Option-Indexed Hierarchical Reinforcement Learning »
  Kushal Chauhan · Soumya Chatterjee · Pradeep Shenoy · Balaraman Ravindran
- 2019 : Coffee Break & Poster Session 2 »
  Juho Lee · Yoonho Lee · Yee Whye Teh · Raymond A. Yeh · Yuan-Ting Hu · Alex Schwing · Sara Ahmadian · Alessandro Epasto · Marina Knittel · Ravi Kumar · Mohammad Mahdian · Christian Bueno · Aditya Sanghi · Pradeep Kumar Jayaraman · Ignacio Arroyo-Fernández · Andrew Hryniowski · Vinayak Mathur · Sanjay Singh · Shahrzad Haddadan · Vasco Portilheiro · Luna Zhang · Mert Yuksekgonul · Jhosimar Arias Figueroa · Deepak Maurya · Balaraman Ravindran · Frank NIELSEN · Philip Pham · Justin Payan · Andrew McCallum · Jinesh Mehta · Ke SUN
- 2018 : Spotlights 2 »
  Mausam · Ankit Anand · Parag Singla · Tarik Koc · Tim Klinger · Habibeh Naderi · Sungwon Lyu · Saeed Amizadeh · Kshitij Dwivedi · Songpeng Zu · Wei Feng · Balaraman Ravindran · Edouard Pineau · Abdulkadir Celikkanat · Deepak Venugopal
- 2014 Poster: An Autoencoder Approach to Learning Bilingual Word Representations »
  Sarath Chandar · Stanislas Lauly · Hugo Larochelle · Mitesh Khapra · Balaraman Ravindran · Vikas C Raykar · Amrita Saha