Exploration is essential in reinforcement learning, particularly in environments where external rewards are sparse. Here we focus on exploration with intrinsic rewards, where the agent transiently augments the external rewards with self-generated intrinsic rewards. Although the study of intrinsic rewards has a long history, existing methods focus on composing the intrinsic reward based on measures of future prospects of states, ignoring the information contained in the retrospective structure of transition sequences. Here we argue that the agent can utilise retrospective information to generate explorative behaviour with structure-awareness, facilitating efficient exploration based on global instead of local information. We propose Successor-Predecessor Intrinsic Exploration (SPIE), an exploration algorithm based on a novel intrinsic reward combining prospective and retrospective information. We show that SPIE yields more efficient and ethologically plausible exploratory behaviour in environments with sparse rewards and bottleneck states than competing methods. We also implement SPIE in deep reinforcement learning agents, and show that the resulting agent achieves stronger empirical performance than existing methods on sparse-reward Atari games.
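As a rough illustration of how prospective and retrospective information might be combined in a tabular setting, the sketch below learns a successor representation (SR) matrix `M` with temporal-difference updates and derives an intrinsic reward from it. The entry `M[s, s_next]` looks forward from `s`, while the column sum over `M[:, s_next]` summarises how strongly all states lead into `s_next`. The function names, the default learning rate and discount, and the exact form of the reward are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def sr_td_update(M, s, s_next, alpha=0.1, gamma=0.95):
    """Apply one TD(0) update to the tabular SR matrix M (modified in place).

    M[i, j] estimates the discounted expected number of future visits to
    state j when starting from state i.
    """
    n_states = M.shape[0]
    onehot = np.zeros(n_states)
    onehot[s] = 1.0
    # Bootstrapped target: occupancy of s now, plus discounted SR of s_next.
    target = onehot + gamma * M[s_next]
    M[s] += alpha * (target - M[s])
    return M


def intrinsic_reward(M, s, s_next):
    """Hypothetical intrinsic reward for the transition s -> s_next.

    Prospective term: M[s, s_next], how reachable s_next is from s.
    Retrospective term: the column sum M[:, s_next], how reachable s_next
    is from all states. This specific combination is an assumption.
    """
    return M[s, s_next] - M[:, s_next].sum()
```

Under this assumed form, a state that is already easy to reach from many predecessors receives a lower reward, which biases the agent towards states such as bottlenecks that few transitions lead into.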
Author Information
Changmin Yu (UCL)
Neil Burgess (University College London)
Maneesh Sahani (Gatsby Unit, UCL)
Samuel J Gershman (Harvard University)
More from the Same Authors
- 2021: CCNLab: A Benchmarking Framework for Computational Cognitive Neuroscience
  Nikhil Bhattasali · Momchil Tomov · Samuel J Gershman
- 2021 Spotlight: Probabilistic Tensor Decomposition of Neural Population Spiking Activity
  Hugo Soulat · Sepiedeh Keshavarzi · Troy Margrie · Maneesh Sahani
- 2022: Leveraging Episodic Memory to Improve World Models for Reinforcement Learning
  Julian Coda-Forno · Changmin Yu · Qinghai Guo · Zafeirios Fountas · Neil Burgess
- 2022: Constructing Memory: Consolidation as Teacher-Student Training of a Generative Model
  Eleanor Spens · Neil Burgess
- 2023: Stochastic linear dynamics in parameters to deal with Neural Networks plasticity loss
  Alexandre Galashov · Michalis Titsias · Razvan Pascanu · Yee Whye Teh · Maneesh Sahani
- 2023 Poster: A State Representation for Diminishing Rewards
  Ted Moskovitz · Samo Hromadka · Ahmed Touati · Diana Borsa · Maneesh Sahani
- 2022: Closing Remarks
  Samuel J Gershman
- 2022 Workshop: Information-Theoretic Principles in Cognitive Systems
  Noga Zaslavsky · Mycal Tucker · Sarah Marzen · Irina Higgins · Stephanie Palmer · Samuel J Gershman
- 2022: Panel Discussion: Opportunities and Challenges
  Kenneth Norman · Janice Chen · Samuel J Gershman · Albert Gu · Sepp Hochreiter · Ida Momennejad · Hava Siegelmann · Sainbayar Sukhbaatar
- 2022 Poster: Structured Recognition for Generative Models with Explaining Away
  Changmin Yu · Hugo Soulat · Neil Burgess · Maneesh Sahani
- 2021 Poster: Probabilistic Tensor Decomposition of Neural Population Spiking Activity
  Hugo Soulat · Sepiedeh Keshavarzi · Troy Margrie · Maneesh Sahani
- 2020 Poster: Non-reversible Gaussian processes for identifying latent dynamical structure in neural data
  Virginia Rutten · Alberto Bernacchia · Maneesh Sahani · Guillaume Hennequin
- 2020 Oral: Non-reversible Gaussian processes for identifying latent dynamical structure in neural data
  Virginia Rutten · Alberto Bernacchia · Maneesh Sahani · Guillaume Hennequin
- 2020 Poster: Organizing recurrent network dynamics by task-computation to enable continual learning
  Lea Duncker · Laura N Driscoll · Krishna V Shenoy · Maneesh Sahani · David Sussillo
- 2019 Poster: A neurally plausible model for online recognition and postdiction in a dynamical environment
  Kevin Li · Maneesh Sahani
- 2019 Poster: A neurally plausible model learns successor representations in partially observable environments
  Eszter Vértes · Maneesh Sahani
- 2019 Poster: Coordinated hippocampal-entorhinal replay as structural inference
  Talfan Evans · Neil Burgess
- 2019 Oral: A neurally plausible model learns successor representations in partially observable environments
  Eszter Vértes · Maneesh Sahani
- 2019 Poster: Kernel Instrumental Variable Regression
  Rahul Singh · Maneesh Sahani · Arthur Gretton
- 2019 Oral: Kernel Instrumental Variable Regression
  Rahul Singh · Maneesh Sahani · Arthur Gretton
- 2018 Poster: Flexible and accurate inference and learning for deep generative models
  Eszter Vértes · Maneesh Sahani
- 2018 Poster: Human-in-the-Loop Interpretability Prior
  Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez
- 2018 Spotlight: Human-in-the-Loop Interpretability Prior
  Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez
- 2018 Poster: Temporal alignment and latent Gaussian process factor inference in population spike trains
  Lea Duncker · Maneesh Sahani
- 2016 Poster: Probing the Compositionality of Intuitive Functions
  Eric Schulz · Josh Tenenbaum · David Duvenaud · Maarten Speekenbrink · Samuel J Gershman
- 2015 Workshop: Bounded Optimality and Rational Metareasoning
  Samuel J Gershman · Falk Lieder · Tom Griffiths · Noah Goodman
- 2015 Poster: Bayesian Manifold Learning: The Locally Linear Latent Variable Model (LL-LVM)
  Mijung Park · Wittawat Jitkrittum · Ahmad Qamar · Zoltan Szabo · Lars Buesing · Maneesh Sahani
- 2014 Poster: Design Principles of the Hippocampal Cognitive Map
  Kimberly Stachenfeld · Matthew Botvinick · Samuel J Gershman
- 2014 Spotlight: Design Principles of the Hippocampal Cognitive Map
  Kimberly Stachenfeld · Matthew Botvinick · Samuel J Gershman
- 2013 Workshop: Acquiring and Analyzing the Activity of Large Neural Ensembles
  Srinivas C Turaga · Lars Buesing · Maneesh Sahani · Jakob H Macke
- 2013 Poster: Extracting regions of interest from biological images with convolutional sparse block coding
  Marius Pachitariu · Adam M Packer · Noah Pettit · Henry Dalgleish · Michael Hausser · Maneesh Sahani
- 2013 Poster: Recurrent linear models of simultaneously-recorded neural populations
  Marius Pachitariu · Biljana Petreska · Maneesh Sahani
- 2013 Spotlight: Recurrent linear models of simultaneously-recorded neural populations
  Marius Pachitariu · Biljana Petreska · Maneesh Sahani
- 2012 Poster: Spectral learning of linear dynamics from generalised-linear observations with application to neural population data
  Lars Buesing · Jakob H Macke · Maneesh Sahani
- 2012 Oral: Spectral learning of linear dynamics from generalised-linear observations with application to neural population data
  Lars Buesing · Jakob H Macke · Maneesh Sahani
- 2012 Poster: Learning visual motion in recurrent neural networks
  Marius Pachitariu · Maneesh Sahani
- 2011 Oral: Empirical models of spiking in neural populations
  Jakob H Macke · Lars Buesing · John P Cunningham · Byron M Yu · Krishna V Shenoy · Maneesh Sahani
- 2011 Poster: Empirical models of spiking in neural populations
  Jakob H Macke · Lars Buesing · John P Cunningham · Byron M Yu · Krishna V Shenoy · Maneesh Sahani
- 2011 Poster: Dynamical segmentation of single trials from population neural data
  Biljana Petreska · Byron M Yu · John P Cunningham · Gopal Santhanam · Stephen I Ryu · Krishna V Shenoy · Maneesh Sahani
- 2011 Poster: Probabilistic amplitude and frequency demodulation
  Richard Turner · Maneesh Sahani
- 2011 Spotlight: Probabilistic amplitude and frequency demodulation
  Richard Turner · Maneesh Sahani
- 2010 Session: The Sam Roweis Symposium
  Maneesh Sahani
- 2010 Poster: The Neural Costs of Optimal Control
  Samuel J Gershman · Robert C Wilson
- 2009 Poster: Occlusive Components Analysis
  Jörg Lücke · Richard Turner · Maneesh Sahani · Marc Henniges
- 2009 Poster: Perceptual Multistability as Markov Chain Monte Carlo Inference
  Samuel J Gershman · Edward Vul · Josh Tenenbaum
- 2009 Spotlight: Perceptual Multistability as Markov Chain Monte Carlo Inference
  Samuel J Gershman · Edward Vul · Josh Tenenbaum
- 2009 Poster: A Bayesian Analysis of Dynamics in Free Recall
  Richard Socher · Samuel J Gershman · Adler Perotte · Per Sederberg · David Blei · Kenneth Norman
- 2008 Poster: Gaussian-process factor analysis for low-dimensional single-trial analysis of neural population activity
  Byron M Yu · John P Cunningham · Gopal Santhanam · Stephen I Ryu · Krishna V Shenoy · Maneesh Sahani
- 2007 Workshop: Beyond Simple Cells: Probabilistic Models for Visual Cortical Processing
  Richard Turner · Pietro Berkes · Maneesh Sahani
- 2007 Oral: Inferring Elapsed Time from Stochastic Neural Processes
  Misha B Ahrens · Maneesh Sahani
- 2007 Spotlight: Inferring Neural Firing Rates from Spike Trains Using Gaussian Processes
  John P Cunningham · Byron M Yu · Krishna V Shenoy · Maneesh Sahani
- 2007 Poster: Inferring Neural Firing Rates from Spike Trains Using Gaussian Processes
  John P Cunningham · Byron M Yu · Krishna V Shenoy · Maneesh Sahani
- 2007 Poster: Inferring Elapsed Time from Stochastic Neural Processes
  Misha B Ahrens · Maneesh Sahani
- 2007 Poster: Modeling Natural Sounds with Modulation Cascade Processes
  Richard Turner · Maneesh Sahani
- 2007 Poster: On Sparsity and Overcompleteness in Image Models
  Pietro Berkes · Richard Turner · Maneesh Sahani