The options framework in Hierarchical Reinforcement Learning (HRL) decomposes an overall goal into a set of options, i.e., simpler subtasks with associated policies, providing abstraction in the action space. Ideally, these options can be reused across different higher-level goals; indeed, many previous approaches have proposed limited forms of transfer of prelearned options to new task settings. We propose a novel option indexing approach to hierarchical learning (OI-HRL), in which we learn an affinity function between options and the functionalities (or affordances) supported by the environment. This allows us to effectively reuse a large library of pretrained options, enabling zero-shot generalization at test time, by restricting goal-directed learning to only those options relevant to the task at hand. We develop a meta-training loop that learns representations of options and environment affordances over a series of HRL problems by incorporating feedback about the relevance of retrieved options to the higher-level goal. In addition to a substantial decrease in sample complexity compared to learning HRL policies from scratch, we also show significant gains over baselines that have the entire option pool available for learning the hierarchical policy.
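The sketch below illustrates the retrieval idea described in the abstract: score each pretrained option against the affordances exposed by a new environment, and hand only the high-affinity subset to the hierarchical learner. It is a minimal toy in NumPy; the cosine affinity, the retrieval threshold, and the relevance-feedback update are illustrative assumptions, not the paper's actual implementation.

```python
# Toy sketch of option indexing via an option-affordance affinity function.
# All quantities (embedding size, threshold, feedback update) are assumptions.
import numpy as np

rng = np.random.default_rng(0)
embed_dim, n_options = 16, 50

# Learned representations: one embedding per pretrained option in the library.
option_embeddings = rng.normal(size=(n_options, embed_dim))

def affinity(option_emb, affordance_emb):
    """Cosine similarity between an option and an environment affordance."""
    dot = option_emb @ affordance_emb
    return dot / (np.linalg.norm(option_emb) * np.linalg.norm(affordance_emb) + 1e-8)

def retrieve_options(affordance_embs, threshold=0.2):
    """Keep only options whose affinity to at least one affordance of the
    current environment exceeds the threshold."""
    scores = np.array([[affinity(o, a) for a in affordance_embs]
                       for o in option_embeddings])
    return np.where(scores.max(axis=1) > threshold)[0]

def meta_update(option_idx, affordance_emb, relevant, lr=0.1):
    """Toy relevance-feedback step: pull embeddings of options that helped the
    higher-level goal toward the affordance, push irrelevant ones away."""
    direction = 1.0 if relevant else -1.0
    option_embeddings[option_idx] += direction * lr * affordance_emb

# At test time on a new task, the hierarchical policy is learned over the
# retrieved subset rather than the full option pool.
test_affordances = rng.normal(size=(3, embed_dim))
active = retrieve_options(test_affordances)
print(f"{len(active)} of {n_options} options retained for the new task")
```

In this toy version, restricting goal-directed learning to the retrieved subset is what reduces sample complexity relative to searching over the entire option pool.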
Author Information
Kushal Chauhan (Google Research)
Soumya Chatterjee (Google Research)
Pradeep Shenoy (Google)
Balaraman Ravindran (Indian Institute of Technology Madras)
More from the Same Authors
- 2021: Deep RePReL - Combining Planning and Deep RL for acting in relational domains
  Harsha Kokel · Arjun Manoharan · Sriraam Natarajan · Balaraman Ravindran · Prasad Tadepalli
- 2021: Interactive Robust Policy Optimization for Multi-Agent Reinforcement Learning
  Videh Nema · Balaraman Ravindran
- 2021: Robust outlier detection by de-biasing VAE likelihoods
  Kushal Chauhan · Pradeep Shenoy · Manish Gupta · Devarajan Sridharan
- 2022: Guiding Offline Reinforcement Learning Using a Safety Expert
  Richa Verma · Kartik Bharadwaj · Harshad Khadilkar · Balaraman Ravindran
- 2022: Lagrangian Model Based Reinforcement Learning
  Adithya Ramesh · Balaraman Ravindran
- 2022: Interactive Concept Bottleneck Models
  Kushal Chauhan · Rishabh Tiwari · Jan Freyberg · Pradeep Shenoy · Krishnamurthy Dvijotham
- 2019: Coffee Break & Poster Session 2
  Juho Lee · Yoonho Lee · Yee Whye Teh · Raymond A. Yeh · Yuan-Ting Hu · Alex Schwing · Sara Ahmadian · Alessandro Epasto · Marina Knittel · Ravi Kumar · Mohammad Mahdian · Christian Bueno · Aditya Sanghi · Pradeep Kumar Jayaraman · Ignacio Arroyo-Fernández · Andrew Hryniowski · Vinayak Mathur · Sanjay Singh · Shahrzad Haddadan · Vasco Portilheiro · Luna Zhang · Mert Yuksekgonul · Jhosimar Arias Figueroa · Deepak Maurya · Balaraman Ravindran · Frank NIELSEN · Philip Pham · Justin Payan · Andrew McCallum · Jinesh Mehta · Ke SUN
- 2018: Spotlights 2
  Mausam · Ankit Anand · Parag Singla · Tarik Koc · Tim Klinger · Habibeh Naderi · Sungwon Lyu · Saeed Amizadeh · Kshitij Dwivedi · Songpeng Zu · Wei Feng · Balaraman Ravindran · Edouard Pineau · Abdulkadir Celikkanat · Deepak Venugopal
- 2014 Poster: An Autoencoder Approach to Learning Bilingual Word Representations
  Sarath Chandar · Stanislas Lauly · Hugo Larochelle · Mitesh Khapra · Balaraman Ravindran · Vikas C Raykar · Amrita Saha