Unsupervised Curricula for Visual Meta-Reinforcement Learning
Allan Jabri · Kyle Hsu · Abhishek Gupta · Benjamin Eysenbach · Sergey Levine · Chelsea Finn

Wed Dec 11 10:30 AM -- 10:35 AM (PST) @ West Exhibition Hall A

Meta-reinforcement learning algorithms leverage experience across many tasks to learn fast and effective reinforcement learning (RL) algorithms. However, current meta-RL methods depend critically on a manually-defined distribution of meta-training tasks, and hand-crafting these task distributions is challenging and time-consuming. We develop an unsupervised algorithm for inducing an adaptive meta-training task distribution, i.e. an automatic curriculum, by modeling unsupervised interaction in a visual environment. Crucially, the task distribution is scaffolded by the meta-learner's behavior, with density-based exploration driving the evolution of the task distribution. We formulate unsupervised meta-RL with an information-theoretic objective optimized via expectation-maximization over trajectory-level latent variables. Repeating this procedure leads to iterative reorganization of behavior, allowing the task distribution to adapt as the meta-learner becomes more competent. In our experiments on vision-based navigation and manipulation domains, we show that our algorithm allows for unsupervised meta-learning of skills that transfer to downstream tasks specified by human-provided reward functions, as well as pre-training for more efficient meta-learning on user-defined task distributions. To understand the nature of the curricula, we provide visualizations and analysis of the task distributions discovered throughout the learning process, finding that the emergent tasks span a range of environment-specific exploratory and exploitative behavior.
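As a rough illustration of the alternation described above, the following sketch fits a latent-variable density model to visited states with expectation-maximization and then turns each latent component into a task reward. It is a simplified stand-in under stated assumptions, not the authors' implementation: the isotropic Gaussian mixture, the reward form log q(s|z) - log q(s), the per-state (rather than trajectory-level) latent, and the names `fit_mixture` and `task_reward` are all hypothetical choices made for the example, and raw 2-D points stand in for visual observations.

```python
import numpy as np

def log_gaussian(x, mean, var):
    # Log density of K isotropic Gaussians at N points: returns shape (N, K).
    d = x.shape[1]
    sq = ((x[:, None, :] - mean[None, :, :]) ** 2).sum(-1)
    return -0.5 * (sq / var[None, :] + d * np.log(2 * np.pi * var[None, :]))

def logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def fit_mixture(x, k, iters=50):
    # Hypothetical helper: EM for an isotropic Gaussian mixture over states.
    d = x.shape[1]
    # Deterministic farthest-point initialization of the component means.
    mean = [x[0]]
    for _ in range(k - 1):
        dist = np.min([((x - m) ** 2).sum(-1) for m in mean], axis=0)
        mean.append(x[np.argmax(dist)])
    mean = np.array(mean)
    var = np.full(k, x.var())
    weight = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each state.
        log_joint = np.log(weight)[None, :] + log_gaussian(x, mean, var)
        resp = np.exp(log_joint - logsumexp(log_joint, axis=1)[:, None])
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(0) + 1e-8
        weight = nk / nk.sum()
        mean = (resp.T @ x) / nk[:, None]
        sq = ((x[:, None, :] - mean[None, :, :]) ** 2).sum(-1)
        var = (resp * sq).sum(0) / (d * nk) + 1e-6
    return weight, mean, var

def task_reward(s, z, weight, mean, var):
    # Assumed reward for task z: log q(s|z) - log q(s).
    # High where component z explains a state better than the overall mixture.
    log_cond = log_gaussian(s, mean, var)
    log_marg = logsumexp(np.log(weight)[None, :] + log_cond, axis=1)
    return log_cond[:, z] - log_marg

# Toy "visited states": two clusters standing in for two discovered behaviors.
rng = np.random.default_rng(0)
states = np.concatenate([
    rng.normal([-3.0, 0.0], 0.3, size=(100, 2)),
    rng.normal([3.0, 0.0], 0.3, size=(100, 2)),
])
weight, mean, var = fit_mixture(states, k=2)
rewards = task_reward(states, 0, weight, mean, var)
```

In a full loop one would presumably meta-train on the rewards induced by the current mixture, collect new data with the updated meta-learner, and refit the mixture, so that the set of tasks shifts as behavior changes. That outer loop is omitted here.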

Author Information

Allan Jabri (UC Berkeley)
Kyle Hsu (University of Toronto)
Abhishek Gupta (University of California, Berkeley)
Benjamin Eysenbach (Carnegie Mellon University)

I'm a 5th-year PhD student at CMU, focusing on RL algorithms. I am currently on the faculty job market.

Sergey Levine (UC Berkeley)
Chelsea Finn (Stanford University)
