Exploration in environments which differ across episodes has received increasing attention in recent years. Current methods use some combination of global novelty bonuses, computed using the agent's entire training experience, and episodic novelty bonuses, computed using only experience from the current episode. However, the use of these two types of bonuses has been ad hoc and poorly understood. In this work, we first shed light on the behavior of these two kinds of bonuses on hard exploration tasks through easily interpretable examples. We find that the two types of bonuses succeed in different settings, with episodic bonuses being most effective when there is little shared structure between environments and global bonuses being effective when more structure is shared. We also find that combining the two bonuses leads to more robust behavior across both of these settings. Motivated by these findings, we then investigate different algorithmic choices for defining and combining function approximation-based global and episodic bonuses. This results in a new algorithm which sets a new state of the art across 18 tasks from the MiniHack suite used in prior work.
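To make the distinction between the two bonus types concrete, here is a minimal sketch using simple visit counts over hashable states. It is not the algorithm from the paper, which uses function approximation; the class names `CountBonus` and `CombinedBonus`, the 1/sqrt(N) bonus form, and the multiplicative combination are all assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


class CountBonus:
    """Count-based novelty bonus (illustrative assumption): 1 / sqrt(N(s))."""

    def __init__(self):
        self.counts = Counter()

    def update(self, state):
        # Record the visit first, then score it, so the first visit scores 1.0.
        self.counts[state] += 1
        return 1.0 / sqrt(self.counts[state])

    def reset(self):
        self.counts.clear()


class CombinedBonus:
    """Hypothetical helper that keeps one global and one episodic counter."""

    def __init__(self):
        self.global_bonus = CountBonus()    # never reset: whole training experience
        self.episodic_bonus = CountBonus()  # reset at the start of every episode

    def start_episode(self):
        self.episodic_bonus.reset()

    def step(self, state):
        g = self.global_bonus.update(state)
        e = self.episodic_bonus.update(state)
        # Multiplying the two bonuses is an assumption made for this sketch.
        return g * e


bonus = CombinedBonus()
bonus.start_episode()
first_visit = bonus.step("room_a")    # new globally and within the episode
repeat_visit = bonus.step("room_a")   # repeated within the same episode
bonus.start_episode()
next_episode = bonus.step("room_a")   # new within the episode, seen globally
```

In this sketch a state revisited within an episode is penalized by both counters, while a state seen in earlier episodes but new to the current one is discounted only by the global counter.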
Author Information
Mikael Henaff (Facebook AI Research)
Minqi Jiang (UCL & FAIR)
Roberta Raileanu (FAIR)
More from the Same Authors
-
2021 : MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research »
Mikayel Samvelyan · Robert Kirk · Vitaly Kurin · Jack Parker-Holder · Minqi Jiang · Eric Hambro · Fabio Petroni · Heinrich Kuttler · Edward Grefenstette · Tim Rocktäschel -
2021 : Grounding Aleatoric Uncertainty in Unsupervised Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Andrei Lupu · Heinrich Kuttler · Edward Grefenstette · Tim Rocktäschel · Jakob Foerster -
2021 : Imitation Learning from Pixel Observations for Continuous Control »
Samuel Cohen · Brandon Amos · Marc Deisenroth · Mikael Henaff · Eugene Vinitsky · Denis Yarats -
2021 : That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities »
Jack Parker-Holder · Minqi Jiang · Michael Dennis · Mikayel Samvelyan · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2021 : Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay »
Iryna Korshunova · Minqi Jiang · Jack Parker-Holder · Tim Rocktäschel · Edward Grefenstette -
2022 : Building a Subspace of Policies for Scalable Continual Learning »
Jean-Baptiste Gaya · Thang Long Doan · Lucas Page-Caccia · Laure Soulier · Ludovic Denoyer · Roberta Raileanu -
2022 : Uncertainty-Driven Exploration for Generalization in Reinforcement Learning »
Yiding Jiang · J. Zico Kolter · Roberta Raileanu -
2022 : MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning »
Mikayel Samvelyan · Akbir Khan · Michael Dennis · Minqi Jiang · Jack Parker-Holder · Jakob Foerster · Roberta Raileanu · Tim Rocktäschel -
2022 Poster: Dungeons and Data: A Large-Scale NetHack Dataset »
Eric Hambro · Roberta Raileanu · Danielle Rothermel · Vegard Mella · Tim Rocktäschel · Heinrich Küttler · Naila Murray -
2022 Poster: Grounding Aleatoric Uncertainty for Unsupervised Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Andrei Lupu · Heinrich Küttler · Edward Grefenstette · Tim Rocktäschel · Jakob Foerster -
2022 Poster: Exploration via Elliptical Episodic Bonuses »
Mikael Henaff · Roberta Raileanu · Minqi Jiang · Tim Rocktäschel -
2022 Poster: GriddlyJS: A Web IDE for Reinforcement Learning »
Christopher Bamford · Minqi Jiang · Mikayel Samvelyan · Tim Rocktäschel -
2022 Poster: Improving Intrinsic Exploration with Language Abstractions »
Jesse Mu · Victor Zhong · Roberta Raileanu · Minqi Jiang · Noah Goodman · Tim Rocktäschel · Edward Grefenstette -
2021 : The NetHack Challenge + Q&A »
Eric Hambro · Sharada Mohanty · Dipam Chakrabroty · Edward Grefenstette · Minqi Jiang · Robert Kirk · Vitaly Kurin · Heinrich Kuttler · Vegard Mella · Nantas Nardelli · Jack Parker-Holder · Roberta Raileanu · Tim Rocktäschel · Danielle Rothermel · Mikayel Samvelyan -
2021 Poster: Replay-Guided Adversarial Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2019 Poster: Explicit Explore-Exploit Algorithms in Continuous State Spaces »
Mikael Henaff -
2018 : Poster Sessions and Lunch (Provided) »
Akira Utsumi · Alane Suhr · Ji Zhang · Ramon Sanabria · Kushal Kafle · Nicholas Chen · Seung Wook Kim · Aishwarya Agrawal · SRI HARSHA DUMPALA · Shikhar Murty · Pablo Azagra · Jean ROUAT · Alaaeldin Ali · · SUBBAREDDY OOTA · Angela Lin · Shruti Palaskar · Farley Lai · Amir Aly · Tingke Shen · Dianqi Li · Jianguo Zhang · Rita Kuznetsova · Jinwon An · Jean-Benoit Delbrouck · Tomasz Kornuta · Syed Ashar Javed · Christopher Davis · John Co-Reyes · Vasu Sharma · Sungwon Lyu · Ning Xie · Ankita Kalra · Huan Ling · Oleksandr Maksymets · Bhavana Mahendra Jain · Shun-Po Chuang · Sanyam Agarwal · Jerome Abdelnour · Yufei Feng · vincent albouy · Siddharth Karamcheti · Derek Doran · Roberta Raileanu · Jonathan Heek -
2017 : Contributed Talks 2 »
Roberta Raileanu · Satwik Kottur · Paul Grouchy