

Poster in Workshop: 6th Robot Learning Workshop: Pretraining, Fine-Tuning, and Generalization with Large Scale Models

Reasoning with Latent Diffusion in Offline Reinforcement Learning

Siddarth Venkatraman · Shivesh Khaitan · Ravi Tej Akella · John Dolan · Jeff Schneider · Glen Berseth

Keywords: [ Diffusion ] [ Reinforcement Learning ]


Abstract:

Offline reinforcement learning (RL) holds promise as a means to learn high-reward policies from large static datasets, without the need for further environment interaction. This is especially critical for robotics, where online learning can be prohibitively expensive. However, a key challenge in offline RL lies in effectively stitching together portions of suboptimal trajectories from the static dataset while avoiding extrapolation errors that arise from a lack of support in the data. In this work, we propose a novel approach that leverages the expressiveness of latent diffusion to model in-support trajectory sequences as compressed latent skills. This facilitates learning a Q-function while avoiding extrapolation error via batch-constraining. The latent space is also expressive and gracefully copes with multi-modal data. We show that the learned temporally abstract latent space encodes richer task-specific information for offline RL tasks than raw state-actions. This improves credit assignment and facilitates faster reward propagation during Q-learning. Our method demonstrates state-of-the-art performance on the D4RL benchmarks, particularly excelling in long-horizon, sparse-reward tasks.
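To make the high-level pipeline in the abstract more concrete, below is a minimal, hypothetical sketch (not the authors' implementation): a sequence encoder compresses short state-action segments into latent skills, a latent denoiser stands in for the diffusion prior that keeps sampled skills in-support, and a Q-function is learned over (state, latent skill) pairs instead of raw state-actions. All module names, dimensions, and the simplified noising step are illustrative assumptions.

```python
# Illustrative sketch only: latent-skill encoder, latent denoiser, and Q(s, z).
# Dimensions and architectures are assumptions, not the paper's configuration.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, LATENT_DIM, HORIZON = 17, 6, 32, 8


class SkillEncoder(nn.Module):
    """Compress a length-H state-action segment into a latent skill z."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(STATE_DIM + ACTION_DIM, 128, batch_first=True)
        self.head = nn.Linear(128, LATENT_DIM)

    def forward(self, states, actions):
        h, _ = self.rnn(torch.cat([states, actions], dim=-1))
        return self.head(h[:, -1])  # use the final hidden step as the skill summary


class LatentDenoiser(nn.Module):
    """Predict the noise added to a latent skill, conditioned on the start state and timestep."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + STATE_DIM + 1, 256), nn.ReLU(),
            nn.Linear(256, LATENT_DIM),
        )

    def forward(self, noisy_z, state, t):
        return self.net(torch.cat([noisy_z, state, t], dim=-1))


class SkillQFunction(nn.Module):
    """Q(s, z): value of executing latent skill z from state s."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, z):
        return self.net(torch.cat([state, z], dim=-1))


# Toy usage on random data, just to show how the pieces fit together.
B = 4
states = torch.randn(B, HORIZON, STATE_DIM)
actions = torch.randn(B, HORIZON, ACTION_DIM)

z = SkillEncoder()(states, actions)                   # (B, LATENT_DIM) latent skill
t = torch.rand(B, 1)                                  # diffusion timestep in [0, 1]
noise = torch.randn_like(z)
noisy_z = z + t * noise                               # simplified forward noising process
eps_hat = LatentDenoiser()(noisy_z, states[:, 0], t)  # denoiser's noise prediction
q_value = SkillQFunction()(states[:, 0], z)           # (B, 1) value of skill z from s_0
print(eps_hat.shape, q_value.shape)
```

In this reading, batch-constraining comes from only evaluating and selecting latent skills produced by the (diffusion) prior fit to the dataset, so the Q-function is never queried on out-of-support behavior.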
