Off-Policy Actor-Critic (OffP-AC) methods have proven successful in a variety of continuous control tasks. Normally, the critic's action-value function is updated using temporal-difference learning, and the critic in turn provides a loss that trains the actor to take actions with higher expected return. In this paper, we introduce a flexible and augmented meta-critic that observes the learning process and meta-learns an additional loss for the actor that accelerates and improves actor-critic learning. Compared to existing meta-learning algorithms, the meta-critic is learned rapidly online for a single task, rather than slowly over a family of tasks. Crucially, our meta-critic is designed for off-policy learners, which currently provide state-of-the-art reinforcement learning sample efficiency. We demonstrate that online meta-critic learning benefits a variety of continuous control tasks when combined with the contemporary OffP-AC methods DDPG, TD3 and SAC.
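As a rough illustration of the loop the abstract describes, the sketch below writes out a one-step temporal-difference target for the critic, the usual DDPG-style actor loss, and an actor loss augmented with an extra term. It is a minimal sketch under stated assumptions, not the paper's implementation: the scalar linear critic and actor, and the names `q_value`, `policy`, `td_target`, `critic_loss`, `actor_loss`, `total_actor_loss` and `aux_loss_fn`, are all hypothetical. In particular, `aux_loss_fn` is only a placeholder for the meta-learned loss, whose actual form and training procedure are not given in the abstract.

```python
# Illustrative sketch only: scalar states and actions, linear function approximators.
# All names here are hypothetical and not taken from the paper.

def q_value(w, s, a):
    """Hypothetical linear critic: Q(s, a) = w[0]*s + w[1]*a + w[2]."""
    return w[0] * s + w[1] * a + w[2]


def policy(theta, s):
    """Hypothetical deterministic linear actor: a = theta * s."""
    return theta * s


def td_target(w, theta, r, s_next, gamma=0.99, done=False):
    """One-step temporal-difference target: y = r + gamma * Q(s', pi(s'))."""
    if done:
        return r
    return r + gamma * q_value(w, s_next, policy(theta, s_next))


def critic_loss(w, theta, s, a, r, s_next, gamma=0.99, done=False):
    """Squared TD error for a single off-policy transition (s, a, r, s')."""
    y = td_target(w, theta, r, s_next, gamma, done)
    return (q_value(w, s, a) - y) ** 2


def actor_loss(w, theta, s):
    """Standard critic-provided actor loss: minimise -Q(s, pi(s))."""
    return -q_value(w, s, policy(theta, s))


def total_actor_loss(w, theta, s, aux_loss_fn):
    """Critic-provided loss plus an additional loss term.

    aux_loss_fn(s, a) is an assumed placeholder standing in for the
    meta-learned loss; here it is just any callable returning a float.
    """
    a = policy(theta, s)
    return actor_loss(w, theta, s) + aux_loss_fn(s, a)
```

For example, `total_actor_loss(w, theta, s, lambda s, a: 0.1 * a * a)` adds a hand-written action penalty to the critic-provided loss; in the setting the abstract describes, that extra term would be learned online rather than fixed by hand.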
Author Information
Wei Zhou (National University of Defense Technology)
Yiying Li (National University of Defense Technology)
Yongxin Yang (University of Edinburgh)
Huaimin Wang (National University of Defense Technology)
Timothy Hospedales (University of Edinburgh)
More from the Same Authors
-
2021 : A Channel Coding Benchmark for Meta-Learning »
Rui Li · Ondrej Bohdal · Rajesh K Mishra · Hyeji Kim · Da Li · Nicholas Lane · Timothy Hospedales -
2022 : Enhanced Index Tracking via Differentiable Assets Sorting »
Yuanyuan Liu · Yongxin Yang -
2021 : Vision-based system identification and 3D keypoint discovery using dynamics constraints »
Miguel Jaques · Martin Asenov · Michael Burke · Timothy Hospedales -
2021 Poster: EvoGrad: Efficient Gradient-Based Meta-Learning and Hyperparameter Optimization »
Ondrej Bohdal · Yongxin Yang · Timothy Hospedales -
2020 : Q/A for invited talk #3 »
Timothy Hospedales -
2020 : Meta-Learning: Representations and Objectives »
Timothy Hospedales -
2019 Poster: What the Vec? Towards Probabilistically Grounded Embeddings »
Carl Allen · Ivana Balazevic · Timothy Hospedales -
2019 Poster: Multi-relational Poincaré Graph Embeddings »
Ivana Balazevic · Carl Allen · Timothy Hospedales