Poster in Workshop: Foundation Models for Decision Making
CLaP: Conditional Latent Planners for Offline Reinforcement Learning
Harry Shin · Rose Wang
Abstract:
Recent work has formulated offline reinforcement learning (RL) as a sequence modeling problem, benefiting from the simplicity and scalability of the Transformer architecture. However, sequence models struggle to model trajectories that are long-horizon or involve complicated environment dynamics. We propose CLaP (Conditional Latent Planners), which learns a simple goal-conditioned latent space from offline agent behavior and incrementally decodes good actions from a latent plan. We evaluate our method on continuous control domains from the D4RL benchmark. Compared to non-sequential models and return-conditioned sequential models, CLaP achieves competitive, if not better, performance across continuous control tasks, and does particularly well in environments with complex transition dynamics. Our results suggest that decision-making is easier with simplified latent dynamics that model behavior as goal-conditioned.
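The abstract describes the high-level idea (encode offline behavior into a goal-conditioned latent plan, then incrementally decode actions from it) but not the architecture or training objective. The sketch below is only illustrative: the module names (LatentPlanner, encode_plan, decode_action), network sizes, and the goal-conditioned behavior-cloning loss are assumptions, not CLaP's actual method.

```python
# Minimal sketch of a goal-conditioned latent planner in the spirit described
# by the abstract. All names, dimensions, and the loss are assumptions.
import torch
import torch.nn as nn


class LatentPlanner(nn.Module):
    def __init__(self, state_dim: int, goal_dim: int, action_dim: int, latent_dim: int = 32):
        super().__init__()
        # Encoder: map (state, goal) to a latent plan z.
        self.encoder = nn.Sequential(
            nn.Linear(state_dim + goal_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Decoder: incrementally decode an action from (current state, latent plan).
        self.decoder = nn.Sequential(
            nn.Linear(state_dim + latent_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def encode_plan(self, state: torch.Tensor, goal: torch.Tensor) -> torch.Tensor:
        return self.encoder(torch.cat([state, goal], dim=-1))

    def decode_action(self, state: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        return self.decoder(torch.cat([state, z], dim=-1))


def behavior_cloning_loss(model, states, goals, actions):
    """Hypothetical goal-conditioned behavior-cloning objective on offline data;
    goals could, for example, be relabeled as states reached later in the same
    trajectory (a common hindsight-relabeling trick, assumed here)."""
    z = model.encode_plan(states, goals)
    pred = model.decode_action(states, z)
    return ((pred - actions) ** 2).mean()


if __name__ == "__main__":
    # Toy dimensions loosely matching a D4RL locomotion task (assumed).
    model = LatentPlanner(state_dim=17, goal_dim=17, action_dim=6)
    states = torch.randn(8, 17)
    goals = torch.randn(8, 17)   # e.g., hindsight-relabeled future states
    actions = torch.randn(8, 6)
    loss = behavior_cloning_loss(model, states, goals, actions)
    loss.backward()
    print(f"loss: {loss.item():.4f}")
```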