Dealing with uncertainty is essential for efficient reinforcement learning. There is a growing literature on uncertainty estimation for deep learning from fixed datasets, but many of the most popular approaches are poorly suited to sequential decision problems. Other methods, such as bootstrap sampling, have no mechanism for uncertainty that does not come from the observed data. We highlight why this can be a crucial shortcoming and propose a simple remedy through addition of a randomized untrainable 'prior' network to each ensemble member. We prove that this approach is efficient with linear representations, provide simple illustrations of its efficacy with nonlinear representations and show that this approach scales to large-scale problems far better than previous attempts.
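The remedy described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it swaps the neural networks for a tabular (one value per bucket) model, and the names `fit_member`, `ensemble_predictions`, `spread`, the prior scale `beta`, and the `ridge` penalty are all assumptions made for illustration. Each ensemble member adds a fixed random prior to a trainable part that is fit on a bootstrap resample, so members agree where data exists and disagree where it does not.

```python
import random
import statistics


def fit_member(data, num_buckets, beta, ridge, rng):
    """Fit one ensemble member: trainable table plus a fixed random prior table."""
    # Fixed, untrainable prior: one random value per bucket, never updated.
    prior = [rng.gauss(0.0, 1.0) for _ in range(num_buckets)]
    # Bootstrap resample of the observed (bucket, target) pairs.
    sample = [rng.choice(data) for _ in data]
    # Trainable part: ridge fit to the residual y - beta * prior(bucket).
    sums = [0.0] * num_buckets
    counts = [0] * num_buckets
    for bucket, y in sample:
        sums[bucket] += y - beta * prior[bucket]
        counts[bucket] += 1
    trained = [s / (n + ridge) for s, n in zip(sums, counts)]
    # Member prediction = trainable part + scaled prior.
    return [t + beta * p for t, p in zip(trained, prior)]


def ensemble_predictions(data, num_buckets, num_members=20,
                         beta=1.0, ridge=0.1, seed=0):
    """Return one prediction table per ensemble member."""
    rng = random.Random(seed)
    return [fit_member(data, num_buckets, beta, ridge, rng)
            for _ in range(num_members)]


def spread(predictions, bucket):
    """Disagreement across members at one bucket (population std. dev.)."""
    return statistics.pstdev([member[bucket] for member in predictions])


if __name__ == "__main__":
    # Bucket 0 is observed 50 times with target 1.0; bucket 1 is never observed.
    observed = [(0, 1.0)] * 50
    preds = ensemble_predictions(observed, num_buckets=2)
    print("spread at observed bucket:  ", spread(preds, 0))
    print("spread at unobserved bucket:", spread(preds, 1))
```

In this sketch a plain bootstrap ensemble (`beta=0.0`) would predict exactly zero for every member at the unobserved bucket, so it would report no uncertainty there. With the prior term, the unobserved bucket keeps the disagreement injected by the random priors.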
Author Information
Ian Osband (Google DeepMind)
John Aslanides (DeepMind)
Albin Cassirer (DeepMind)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Poster: Randomized Prior Functions for Deep Reinforcement Learning
  Wed. Dec 5th through Thu. Dec 6th, Room 517 AB #154
More from the Same Authors
- 2022 Poster: Fine-tuning language models to find agreement among humans with diverse preferences
  Michiel Bakker · Martin Chadwick · Hannah Sheahan · Michael Tessler · Lucy Campbell-Gillingham · Jan Balaguer · Nat McAleese · Amelia Glaese · John Aslanides · Matt Botvinick · Christopher Summerfield
- 2019 Poster: When to use parametric models in reinforcement learning?
  Hado van Hasselt · Matteo Hessel · John Aslanides
- 2018 Poster: Scalable Coordinated Exploration in Concurrent Reinforcement Learning
  Maria Dimakopoulou · Ian Osband · Benjamin Van Roy