Trajectory optimization using a learned model of the environment is one of the core elements of model-based reinforcement learning. This procedure often suffers from exploiting inaccuracies of the learned model. We propose to regularize trajectory optimization by means of a denoising autoencoder that is trained on the same trajectories as the model of the environment. We show that the proposed regularization leads to improved planning with both gradient-based and gradient-free optimizers. We also demonstrate that using regularized trajectory optimization leads to rapid initial learning in a set of popular motor control tasks, which suggests that the proposed approach can be a useful tool for improving sample efficiency.
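The idea of regularized planning can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and none of it comes from the paper: the toy `dynamics` and `reward` functions stand in for a learned model, `denoise` is a closed-form shrinkage denoiser standing in for a trained denoising autoencoder, the squared denoising error is one plausible form of the penalty, and random shooting plays the role of a gradient-free optimizer. The weight `alpha` and all function names are hypothetical.

```python
import random


def dynamics(state, action):
    # Hypothetical stand-in for a learned model: a 1-D point that moves by `action`.
    return state + action


def reward(state):
    # Toy reward that grows without bound, so an unregularized planner
    # is pulled far away from the region covered by the (imagined) data.
    return state


def denoise(x, data_std=1.0, noise_std=0.5):
    # Stand-in for a trained denoising autoencoder: the optimal denoiser for
    # zero-mean Gaussian data under Gaussian corruption shrinks inputs toward 0.
    shrink = data_std ** 2 / (data_std ** 2 + noise_std ** 2)
    return [shrink * v for v in x]


def rollout(actions, s0):
    """Return (total reward, total denoising penalty) for an action sequence."""
    state, total_reward, penalty = s0, 0.0, 0.0
    for a in actions:
        pair = [state, a]
        # Penalty: squared denoising error of the visited state-action pair.
        penalty += sum((g - v) ** 2 for g, v in zip(denoise(pair), pair))
        state = dynamics(state, a)
        total_reward += reward(state)
    return total_reward, penalty


def plan(s0, alpha, horizon=5, n_candidates=500, seed=0):
    """Random shooting: sample action sequences, keep the best regularized score."""
    rng = random.Random(seed)
    candidates = [
        [rng.uniform(-3.0, 3.0) for _ in range(horizon)]
        for _ in range(n_candidates)
    ]

    def score(actions):
        total_reward, penalty = rollout(actions, s0)
        return total_reward - alpha * penalty

    return max(candidates, key=score)


plain_plan = plan(0.0, alpha=0.0)        # no regularization
regularized_plan = plan(0.0, alpha=1.0)  # penalize off-distribution pairs
```

Because both calls score the same candidate set, the plan chosen with `alpha=1.0` can never have a larger denoising penalty than the one chosen with `alpha=0.0`. In this toy setting that means it stays closer to the region the denoiser treats as familiar, at the cost of some predicted reward.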
Author Information
Rinu Boney (Aalto University)
Norman Di Palo (-)
Mathias Berglund (Curious AI)
Alexander Ilin (Aalto University)
Juho Kannala (Aalto University)
Antti Rasmus (The Curious AI Company)
Harri Valpola (Curious AI)
More from the Same Authors
- 2021 : Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning
  Yi Zhao · Rinu Boney · Alexander Ilin · Juho Kannala · Joni Pajarinen
- 2022 : Meta-learning from demonstrations improves compositional generalization
  Sam Spilsbury · Alexander Ilin
- 2022 : Learning Explicit Object-Centric Representations with Vision Transformers
  Oscar Vikström · Alexander Ilin
- 2020 Poster: Deep Automodulators
  Ari Heljakka · Yuxin Hou · Juho Kannala · Arno Solin
- 2018 : TBC 5
  Harri Valpola
- 2017 Poster: Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results
  Antti Tarvainen · Harri Valpola
- 2017 Poster: Recurrent Ladder Networks
  Isabeau Prémont-Schwarz · Alexander Ilin · Tele Hao · Antti Rasmus · Rinu Boney · Harri Valpola
- 2017 Spotlight: Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results
  Antti Tarvainen · Harri Valpola
- 2016 Poster: Tagger: Deep Unsupervised Perceptual Grouping
  Klaus Greff · Antti Rasmus · Mathias Berglund · Hotloo Xiranood · Harri Valpola · Jürgen Schmidhuber
- 2015 Poster: Semi-supervised Learning with Ladder Networks
  Antti Rasmus · Mathias Berglund · Mikko Honkala · Harri Valpola · Tapani Raiko