Timezone: »
Poster
Fast deep reinforcement learning using online adjustments from the past
Steven Hansen · Alexander Pritzel · Pablo Sprechmann · Andre Barreto · Charles Blundell
We propose Ephemeral Value Adjusments (EVA): a means of allowing deep reinforcement learning agents to rapidly adapt to experience in their replay buffer. EVA shifts the value predicted by a neural network with an estimate of the value function found by prioritised sweeping over experience tuples from the replay buffer near the current state. EVA combines a number of recent ideas around combining episodic memory-like structures into reinforcement learning agents: slot-based storage, content-based retrieval, and memory-based planning. We show that EVA is performant on a demonstration task and Atari games.
Author Information
Steven Hansen (DeepMind)
Alexander Pritzel (Deepmind)
Pablo Sprechmann (DeepMind)
Andre Barreto (DeepMind)
Charles Blundell (DeepMind)
More from the Same Authors
-
2021 Spotlight: Proper Value Equivalence »
Christopher Grimm · Andre Barreto · Greg Farquhar · David Silver · Satinder Singh -
2021 : Wasserstein Distance Maximizing Intrinsic Control »
Ishan Durugkar · Steven Hansen · Stephen Spencer · Volodymyr Mnih · Ishan Durugkar -
2022 : In-context Reinforcement Learning with Algorithm Distillation »
Michael Laskin · Luyu Wang · Junhyuk Oh · Emilio Parisotto · Stephen Spencer · Richie Steigerwald · DJ Strouse · Steven Hansen · Angelos Filos · Ethan Brooks · Maxime Gazeau · Himanshu Sahni · Satinder Singh · Volodymyr Mnih -
2022 : In-context Reinforcement Learning with Algorithm Distillation »
Michael Laskin · Luyu Wang · Junhyuk Oh · Emilio Parisotto · Stephen Spencer · Richie Steigerwald · DJ Strouse · Steven Hansen · Angelos Filos · Ethan Brooks · Maxime Gazeau · Himanshu Sahni · Satinder Singh · Volodymyr Mnih -
2022 Poster: Approximate Value Equivalence »
Christopher Grimm · Andre Barreto · Satinder Singh -
2022 Poster: The Phenomenon of Policy Churn »
Tom Schaul · Andre Barreto · John Quan · Georg Ostrovski -
2021 Poster: Entropic Desired Dynamics for Intrinsic Control »
Steven Hansen · Guillaume Desjardins · Kate Baumli · David Warde-Farley · Nicolas Heess · Simon Osindero · Volodymyr Mnih -
2021 Poster: Risk-Aware Transfer in Reinforcement Learning using Successor Features »
Michael Gimelfarb · Andre Barreto · Scott Sanner · Chi-Guhn Lee -
2021 Poster: Proper Value Equivalence »
Christopher Grimm · Andre Barreto · Greg Farquhar · David Silver · Satinder Singh -
2021 Poster: Neural Production Systems »
Anirudh Goyal · Aniket Didolkar · Nan Rosemary Ke · Charles Blundell · Philippe Beaudoin · Nicolas Heess · Michael Mozer · Yoshua Bengio -
2020 Poster: Pointer Graph Networks »
Petar Veličković · Lars Buesing · Matthew Overlan · Razvan Pascanu · Oriol Vinyals · Charles Blundell -
2020 Spotlight: Pointer Graph Networks »
Petar Veličković · Lars Buesing · Matthew Overlan · Razvan Pascanu · Oriol Vinyals · Charles Blundell -
2020 Poster: On Efficiency in Hierarchical Reinforcement Learning »
Zheng Wen · Doina Precup · Morteza Ibrahimi · Andre Barreto · Benjamin Van Roy · Satinder Singh -
2020 Poster: The Value Equivalence Principle for Model-Based Reinforcement Learning »
Christopher Grimm · Andre Barreto · Satinder Singh · David Silver -
2020 Spotlight: On Efficiency in Hierarchical Reinforcement Learning »
Zheng Wen · Doina Precup · Morteza Ibrahimi · Andre Barreto · Benjamin Van Roy · Satinder Singh -
2019 Poster: Generalization of Reinforcement Learners with Working and Episodic Memory »
Meire Fortunato · Melissa Tan · Ryan Faulkner · Steven Hansen · Adrià Puigdomènech Badia · Gavin Buttimore · Charles Deck · Joel Leibo · Charles Blundell -
2019 Poster: Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates »
Carlos Riquelme · Hugo Penedones · Damien Vincent · Hartmut Maennel · Sylvain Gelly · Timothy A Mann · Andre Barreto · Gergely Neu -
2019 Demonstration: The Option Keyboard: Combining Skills in Reinforcement Learning »
Daniel Toyama · Shaobo Hou · Gheorghe Comanici · Andre Barreto · Doina Precup · Shibl Mourad · Eser Aygün · Philippe Hamel -
2019 Poster: The Option Keyboard: Combining Skills in Reinforcement Learning »
Andre Barreto · Diana Borsa · Shaobo Hou · Gheorghe Comanici · Eser Aygün · Philippe Hamel · Daniel Toyama · jonathan j hunt · Shibl Mourad · David Silver · Doina Precup -
2017 Poster: Natural Value Approximators: Learning when to Trust Past Estimates »
Zhongwen Xu · Joseph Modayil · Hado van Hasselt · Andre Barreto · David Silver · Tom Schaul -
2017 Poster: Successor Features for Transfer in Reinforcement Learning »
Andre Barreto · Will Dabney · Remi Munos · Jonathan Hunt · Tom Schaul · David Silver · Hado van Hasselt -
2017 Poster: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles »
Balaji Lakshminarayanan · Alexander Pritzel · Charles Blundell -
2017 Spotlight: Successor Features for Transfer in Reinforcement Learning »
Andre Barreto · Will Dabney · Remi Munos · Jonathan Hunt · Tom Schaul · David Silver · Hado van Hasselt -
2017 Spotlight: Natural Value Approximators: Learning when to Trust Past Estimates »
Zhongwen Xu · Joseph Modayil · Hado van Hasselt · Andre Barreto · David Silver · Tom Schaul -
2017 Spotlight: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles »
Balaji Lakshminarayanan · Alexander Pritzel · Charles Blundell -
2016 Poster: Matching Networks for One Shot Learning »
Oriol Vinyals · Charles Blundell · Timothy Lillicrap · koray kavukcuoglu · Daan Wierstra -
2016 Poster: Deep Exploration via Bootstrapped DQN »
Ian Osband · Charles Blundell · Alexander Pritzel · Benjamin Van Roy -
2014 Workshop: Advances in Variational Inference »
David Blei · Shakir Mohamed · Michael Jordan · Charles Blundell · Tamara Broderick · Matthew D. Hoffman -
2013 Poster: Bayesian Hierarchical Community Discovery »
Charles Blundell · Yee Whye Teh -
2012 Poster: Modelling Reciprocating Relationships with Hawkes processes »
Charles Blundell · Katherine Heller · Jeff Beck -
2012 Poster: On-line Reinforcement Learning Using Incremental Kernel-Based Stochastic Factorization »
Andre S Barreto · Doina Precup · Joelle Pineau -
2012 Spotlight: Modelling Reciprocating Relationships with Hawkes processes »
Charles Blundell · Katherine Heller · Jeff Beck -
2011 Poster: Modelling Genetic Variations using Fragmentation-Coagulation Processes »
Yee Whye Teh · Charles Blundell · Lloyd T Elliott -
2011 Oral: Modelling Genetic Variations using Fragmentation-Coagulation Processes »
Yee Whye Teh · Charles Blundell · Lloyd T Elliott -
2011 Poster: Reinforcement Learning using Kernel-Based Stochastic Factorization »
Andre S Barreto · Doina Precup · Joelle Pineau