Building systems that autonomously create temporal abstractions from data is a key challenge in scaling learning and planning in reinforcement learning. One popular approach for addressing this challenge is the options framework (Sutton et al., 1999). However, only recently did Bacon et al. (2017) derive a policy gradient theorem for online, end-to-end learning of general-purpose options. In this work, we extend previous work on this topic, which focuses only on learning a two-level hierarchy of options and primitive actions, to enable learning simultaneously at multiple resolutions in time. We achieve this by considering an arbitrarily deep hierarchy of options in which high-level, temporally extended options are composed of lower-level options with finer resolutions in time. We extend the results of Bacon et al. (2017) and derive policy gradient theorems for a deep hierarchy of options. Our proposed hierarchical option-critic architecture is capable of learning internal policies, termination conditions, and hierarchical compositions over options without the need for any intrinsic rewards or subgoals. Our empirical results in both discrete and continuous environments demonstrate the efficiency of our framework.
Author Information
Matthew Riemer (IBM Research AI)
Miao Liu (IBM)
Gerald Tesauro (IBM TJ Watson Research Center)
More from the Same Authors
- 2022: Learning in Factored Domains with Information-Constrained Visual Representations
  Tyler Malloy · Chris Sims · Tim Klinger · Matthew Riemer · Miao Liu · Gerald Tesauro
- 2023 Poster: On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\epsilon$-Greedy Exploration
  Shuai Zhang · Meng Wang · Hongkang Li · Miao Liu · Pin-Yu Chen · Songtao Lu · Sijia Liu · Keerthiram Murugesan · Subhajit Chaudhury
- 2022 Poster: Continual Learning In Environments With Polynomial Mixing Times
  Matthew Riemer · Sharath Chandra Raparthy · Ignacio Cases · Gopeshh Subbaraj · Maximilian Puelma Touzel · Irina Rish
- 2022 Poster: Influencing Long-Term Behavior in Multiagent Reinforcement Learning
  Dong-Ki Kim · Matthew Riemer · Miao Liu · Jakob Foerster · Michael Everett · Chuangchuang Sun · Gerald Tesauro · Jonathan How
- 2021: Continual Learning In Environments With Polynomial Mixing Times
  Matthew Riemer · Sharath Chandra Raparthy · Ignacio Cases · Gopeshh Subbaraj · Maximilian Puelma Touzel · Irina Rish
- 2020 Poster: Decentralized TD Tracking with Linear Function Approximation and its Finite-Time Analysis
  Gang Wang · Songtao Lu · Georgios Giannakis · Gerald Tesauro · Jian Sun
- 2018 Poster: Dialog-based Interactive Image Retrieval
  Xiaoxiao Guo · Hui Wu · Yu Cheng · Steven Rennie · Gerald Tesauro · Rogerio Feris
- 2017: Poster Session
  David Abel · Nicholas Denis · Maria Eckstein · Ronan Fruit · Karan Goel · Joshua Gruenstein · Anna Harutyunyan · Martin Klissarov · Xiangyu Kong · Aviral Kumar · Saurabh Kumar · Miao Liu · Daniel McNamee · Shayegan Omidshafiei · Silviu Pitis · Paulo Rauber · Melrose Roderick · Tianmin Shu · Yizhou Wang · Shangtong Zhang
- 2017: Spotlights & Poster Session
  David Abel · Nicholas Denis · Maria Eckstein · Ronan Fruit · Karan Goel · Joshua Gruenstein · Anna Harutyunyan · Martin Klissarov · Xiangyu Kong · Aviral Kumar · Saurabh Kumar · Miao Liu · Daniel McNamee · Shayegan Omidshafiei · Silviu Pitis · Paulo Rauber · Melrose Roderick · Tianmin Shu · Yizhou Wang · Shangtong Zhang
- 2017 Workshop: Conversational AI - today's practice and tomorrow's potential
  Alborz Geramifard · Jason Williams · Larry Heck · Jim Glass · Antoine Bordes · Steve Young · Gerald Tesauro
- 2015: Deep RL in Games Research
  Gerald Tesauro
- 2007 Spotlight: Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning
  Gerald Tesauro · Rajarshi Das · Hoi Chan · Jeffrey O Kephart · David Levine · Freeman Rawson · Charles Lefurgy
- 2007 Poster: Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning
  Gerald Tesauro · Rajarshi Das · Hoi Chan · Jeffrey O Kephart · David Levine · Freeman Rawson · Charles Lefurgy