

Poster in Workshop: 6th Robot Learning Workshop: Pretraining, Fine-Tuning, and Generalization with Large Scale Models

TD-MPC2: Scalable, Robust World Models for Continuous Control

Nicklas Hansen · Hao Su · Xiaolong Wang

Keywords: [ World Models ] [ Reinforcement Learning ] [ Model-based Reinforcement Learning ]


Abstract:

TD-MPC is a model-based reinforcement learning (RL) algorithm that performs local trajectory optimization in the latent space of a learned implicit (decoder-free) world model. In this work, we present TD-MPC2: a series of improvements upon the TD-MPC algorithm. We demonstrate that TD-MPC2 improves significantly over baselines across 104 online RL tasks spanning 4 diverse task domains, achieving consistently strong results with a single set of hyperparameters. We further show that agent capabilities increase with model and data size, and successfully train a single 317M parameter agent to perform 80 tasks across multiple task domains, embodiments, and action spaces. Explore videos, models, data, code, and more at https://tdmpc2.com
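
To make "local trajectory optimization in the latent space" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy one-dimensional latent state and uses plain random shooting as the optimizer for brevity. The functions `encode`, `dynamics`, `reward`, and `value` are hypothetical hand-written stand-ins for the learned components of a decoder-free world model, and `horizon`, `num_samples`, and `gamma` are illustrative parameters that do not come from the paper.

```python
import random


def encode(obs):
    # Hypothetical encoder: maps an observation to a latent state (identity here).
    return obs


def dynamics(z, a):
    # Hypothetical latent dynamics: the action nudges the latent state.
    return z + 0.1 * a


def reward(z, a):
    # Hypothetical reward: stay near z = 0 and use small actions.
    return -(z ** 2) - 0.01 * (a ** 2)


def value(z):
    # Hypothetical terminal value estimate used to bootstrap past the horizon.
    return -(z ** 2)


def rollout_return(z0, actions, gamma=0.99):
    # Roll the action sequence forward purely in latent space (no decoding)
    # and add a discounted terminal value at the end of the horizon.
    z, total, discount = z0, 0.0, 1.0
    for a in actions:
        total += discount * reward(z, a)
        z = dynamics(z, a)
        discount *= gamma
    return total + discount * value(z)


def plan(obs, horizon=5, num_samples=256, gamma=0.99, seed=0):
    # Random shooting: sample action sequences, score each one with the model,
    # and return the first action of the best sequence (receding-horizon control).
    rng = random.Random(seed)
    z0 = encode(obs)
    best_return, best_first_action = float("-inf"), 0.0
    for _ in range(num_samples):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        ret = rollout_return(z0, actions, gamma)
        if ret > best_return:
            best_return, best_first_action = ret, actions[0]
    return best_first_action


if __name__ == "__main__":
    print(plan(1.0))
```

In a receding-horizon loop, only the returned first action would be executed before planning again from the next observation. The terminal `value` term is what lets a short horizon account for longer-term return.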
