Skip to yearly menu bar Skip to main content

Workshop: Deep Reinforcement Learning Workshop

Temporary Goals for Exploration

Haoyang Xu · Jimmy Ba · Silviu Pitis · Harris Chan


Exploration has always been a crucial aspect of reinforcement learning. When facing long horizon sparse reward environments modern methods still struggle with effective exploration and generalize poorly. In the multi-goal reinforcement learning setting, out-of-distribution goals might appear similar to the achieved ones, but the agent can never accurately assess its ability to achieve them without attempting them. To enable faster exploration and improve generalization, we propose an exploration method that lets the agent temporarily pursue the most meaningful nearby goal. We demonstrate the performance of our method through experiments in four multi-goal continuous navigation environments including a 2D PointMaze, an AntMaze, and a discrete multi-goal foraging world.

Chat is not available.