Timezone: »

Exploration via Elliptical Episodic Bonuses
Mikael Henaff · Roberta Raileanu · Minqi Jiang · Tim Rocktäschel

Wed Nov 30 09:30 AM -- 11:00 AM (PST) @ Hall J #105

In recent years, a number of reinforcement learning (RL) methods have been pro- posed to explore complex environments which differ across episodes. In this work, we show that the effectiveness of these methods critically relies on a count-based episodic term in their exploration bonus. As a result, despite their success in relatively simple, noise-free settings, these methods fall short in more realistic scenarios where the state space is vast and prone to noise. To address this limitation, we introduce Exploration via Elliptical Episodic Bonuses (E3B), a new method which extends count-based episodic bonuses to continuous state spaces and encourages an agent to explore states that are diverse under a learned embed- ding within each episode. The embedding is learned using an inverse dynamics model in order to capture controllable aspects of the environment. Our method sets a new state-of-the-art across 16 challenging tasks from the MiniHack suite, without requiring task-specific inductive biases. E3B also outperforms existing methods in reward-free exploration on Habitat, demonstrating that it can scale to high-dimensional pixel-based observations and realistic environments.

Author Information

Mikael Henaff (Facebook AI Research)
Roberta Raileanu (FAIR)
Minqi Jiang (UCL & FAIR)
Tim Rocktäschel (University College London, Facebook AI Research)

Tim is a Researcher at Facebook AI Research (FAIR) London, an Associate Professor at the Centre for Artificial Intelligence in the Department of Computer Science at University College London (UCL), and a Scholar of the European Laboratory for Learning and Intelligent Systems (ELLIS). Prior to that, he was a Postdoctoral Researcher in Reinforcement Learning at the University of Oxford, a Junior Research Fellow in Computer Science at Jesus College, and a Stipendiary Lecturer in Computer Science at Hertford College. Tim obtained his Ph.D. from UCL under the supervision of Sebastian Riedel, and he was awarded a Microsoft Research Ph.D. Scholarship in 2013 and a Google Ph.D. Fellowship in 2017. His work focuses on reinforcement learning in open-ended environments that require intrinsically motivated agents capable of transferring commonsense, world and domain knowledge in order to systematically generalize to novel situations.

More from the Same Authors