MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations
Nicklas Hansen · Yixin Lin · Hao Su · Xiaolong Wang · Vikash Kumar · Aravind Rajeswaran
Poor sample efficiency continues to be the primary challenge for deployment of deep Reinforcement Learning (RL) algorithms in real-world applications, and in particular for visuo-motor control. Model-based RL has the potential to be highly sample efficient by concurrently learning a world model and using synthetic rollouts for planning and policy improvement. However, in practice, sample-efficient learning with model-based RL is bottlenecked by the exploration challenge. In this work, we find that leveraging just a handful of demonstrations can dramatically improve the sample efficiency of model-based RL. Simply appending demonstrations to the interaction dataset, however, does not suffice. We identify three key ingredients for leveraging demonstrations in model learning: policy pretraining, targeted exploration, and oversampling of demonstration data. These form the three phases of our model-based RL framework. We empirically study three complex visuo-motor control domains and find that our method is 160%-250% more successful in completing sparse reward tasks compared to prior approaches in the low data regime (100K interaction steps, 5 demonstrations).
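To make the third ingredient concrete, the following is a minimal sketch of oversampling demonstration data when drawing training batches. It is not the authors' implementation: the function name `sample_mixed_batch`, the `demo_fraction` parameter, and its default of 0.25 are assumptions chosen for illustration, and the abstract does not state the ratio actually used. The idea is that a fixed share of every batch comes from the few demonstrations, so they are not diluted as the interaction dataset grows.

```python
import random


def sample_mixed_batch(demo_data, interaction_data, batch_size,
                       demo_fraction=0.25, rng=None):
    """Draw a batch in which demonstrations are deliberately over-represented.

    demo_fraction is a hypothetical knob (not taken from the abstract): the
    share of each batch drawn from the demonstration set, regardless of how
    small that set is relative to the interaction data.
    """
    if not demo_data:
        raise ValueError("at least one demonstration transition is required")
    rng = rng or random.Random()

    # Before any interaction data exists, the whole batch comes from demos.
    if interaction_data:
        n_demo = round(batch_size * demo_fraction)
    else:
        n_demo = batch_size
    n_interaction = batch_size - n_demo

    # Sample with replacement so a handful of demos can fill their quota.
    batch = [rng.choice(demo_data) for _ in range(n_demo)]
    batch += [rng.choice(interaction_data) for _ in range(n_interaction)]
    rng.shuffle(batch)
    return batch
```

With 5 demonstration transitions and 1,000 interaction transitions, uniform sampling from the combined dataset would draw demonstrations about 0.5% of the time; the sketch above fixes that share at `demo_fraction` instead.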

Author Information

Nicklas Hansen (UC San Diego)
Yixin Lin (Facebook AI Research)
Hao Su (UCSD)
Xiaolong Wang (UC San Diego)
Vikash Kumar (FAIR, Meta-AI)

I am currently a research scientist at Facebook AI Research (FAIR). I have also spent time at Google Brain, OpenAI, and the Berkeley Artificial Intelligence Research (BAIR) Lab. I did my PhD in CSE at the University of Washington's Movement Control Lab, under the supervision of Prof. Emanuel Todorov and Prof. Sergey Levine. I am interested in robotics and embodied artificial intelligence. My general interest lies in developing artificial agents that are cheap, portable, and exhibit complex behaviors.

Aravind Rajeswaran (FAIR)

