Poster in Workshop: 6th Robot Learning Workshop: Pretraining, Fine-Tuning, and Generalization with Large Scale Models

Robotic Offline RL from Internet Videos via Value-Function Pre-Training

Chethan Bhateja · Derek Guo · Dibya Ghosh · Anikait Singh · Manan Tomar · Quan Vuong · Yevgen Chebotar · Sergey Levine · Aviral Kumar

Keywords: [ offline RL ] [ videos ] [ value functions ] [ robotic learning ]


Abstract:

Pre-training on Internet data has proven to be a key ingredient for broad generalization in many modern ML systems. For robotics applications, data remains limited, and video, the largest prior source of data available, offers observation-only experience without the action or reward annotations that robotic learning methods need, so it cannot easily be incorporated into them. In this paper, we develop a system for leveraging large-scale human video datasets in robotic offline RL, based entirely on learning value functions via temporal-difference learning. We show that value learning on video datasets learns representations that are more conducive to downstream robotic offline RL than other approaches for learning from video data. Our system, called V-PTR, combines the benefits of pre-training on video data with robotic offline RL approaches that train on diverse robot data, resulting in policies that perform better, act robustly, and generalize broadly. On several manipulation tasks on a real WidowX robot, our framework produces policies that greatly improve over prior methods. Videos can be found at https://sites.google.com/view/v-ptr.
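
As a rough illustration of what learning a value function via temporal-difference learning from observation-only data can look like, the sketch below runs tabular TD(0) on toy state sequences that carry no actions. It is not the V-PTR implementation: the function name `td0_values`, the integer states, the goal-reaching reward (1 when the next state is the goal, 0 otherwise), and the hyperparameters are all assumptions made for this example.

```python
def td0_values(trajectories, goal, num_states, gamma=0.9, alpha=0.5, epochs=200):
    """Tabular TD(0) on action-free state sequences (illustrative only).

    Assumed reward: 1.0 when the next state equals `goal`, else 0.0.
    Reaching the goal is treated as terminal, so no bootstrapping there.
    """
    values = [0.0] * num_states
    for _ in range(epochs):
        for traj in trajectories:
            # Consecutive pairs (s, s_next) stand in for adjacent video frames.
            for s, s_next in zip(traj, traj[1:]):
                reached = s_next == goal
                reward = 1.0 if reached else 0.0
                bootstrap = 0.0 if reached else values[s_next]
                td_target = reward + gamma * bootstrap
                values[s] += alpha * (td_target - values[s])
    return values


# Two toy "videos": sequences of observed states with no action labels.
trajectories = [[0, 1, 2, 3, 4], [2, 3, 4]]
values = td0_values(trajectories, goal=4, num_states=5)
print(values)
```

In this toy chain the learned values grow as the observed state gets closer to the goal, which is the kind of signal a value function can extract from sequences alone, without knowing which actions produced them.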