Timezone: »
Model-based reinforcement learning (RL) methods are appealing in the offline setting because they allow an agent to reason about the consequences of actions without interacting with the environment. Prior methods learn a 1-step dynamics model, which predicts the next state given the current state and action. These models do not immediately tell the agent which actions to take, but must be integrated into a larger RL framework. Can we model the environment dynamics in a different way, such that the learned model does directly indicate the value of each action? In this paper, we propose Contrastive Value Learning (CVL), which learns an implicit, multi-step model of the environment dynamics. This model can be learned without access to reward functions, but nonetheless can be used to directly estimate the value of each action, without requiring any TD learning. Because this model represents the multi-step transitions implicitly, it avoids having to predict high-dimensional observations and thus scales to high-dimensional tasks. Our experiments demonstrate that CVL outperforms prior offline RL methods on complex continuous control benchmarks.
Author Information
Bogdan Mazoure (McGill University, Google Brain)
Ph.D. student at MILA / McGill University, supervised by Doina Precup and Devon Hjelm. Interested in reinforcement learning, representation learning, mathematical statistics and density estimation.
Benjamin Eysenbach (CMU)

Assistant professor at Princeton working on self-supervised reinforcement learning (scaling, algorithms, theory, and applications).
Ofir Nachum (Google Brain)
Jonathan Tompson (Google Brain)
More from the Same Authors
-
2021 Spotlight: Robust Predictable Control »
Ben Eysenbach · Russ Salakhutdinov · Sergey Levine -
2021 : C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks »
Tianjun Zhang · Ben Eysenbach · Russ Salakhutdinov · Sergey Levine · Joseph Gonzalez -
2021 : Implicit Behavioral Cloning »
Pete Florence · Corey Lynch · Andy Zeng · Oscar Ramirez · Ayzaan Wahid · Laura Downs · Adrian Wong · Igor Mordatch · Jonathan Tompson -
2021 : The Information Geometry of Unsupervised Reinforcement Learning »
Ben Eysenbach · Russ Salakhutdinov · Sergey Levine -
2021 : Mismatched No More: Joint Model-Policy Optimization for Model-Based RL »
Ben Eysenbach · Alexander Khazatsky · Sergey Levine · Russ Salakhutdinov -
2021 : Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions »
Bogdan Mazoure · Ilya Kostrikov · Ofir Nachum · Jonathan Tompson -
2021 : TRAIL: Near-Optimal Imitation Learning with Suboptimal Data »
Mengjiao (Sherry) Yang · Sergey Levine · Ofir Nachum -
2021 : Why so pessimistic? Estimating uncertainties for offline rl through ensembles, and why their independence matters »
Kamyar Ghasemipour · Shixiang (Shane) Gu · Ofir Nachum -
2022 : A Mixture-of-Expert Approach to RL-based Dialogue Management »
Yinlam Chow · Azamat Tulepbergenov · Ofir Nachum · Dhawal Gupta · Moonkyung Ryu · Mohammad Ghavamzadeh · Craig Boutilier -
2022 : Multi-Environment Pretraining Enables Transfer to Action Limited Datasets »
David Venuto · Mengjiao (Sherry) Yang · Pieter Abbeel · Doina Precup · Igor Mordatch · Ofir Nachum -
2022 : Skill Acquisition by Instruction Augmentation on Offline Datasets »
Ted Xiao · Harris Chan · Pierre Sermanet · Ayzaan Wahid · Anthony Brohan · Karol Hausman · Sergey Levine · Jonathan Tompson -
2022 : Interactive Language: Talking to Robots in Real Time »
Corey Lynch · Pete Florence · Jonathan Tompson · Ayzaan Wahid · Tianli Ding · James Betker · Robert Baruch · Travis Armstrong -
2022 : Robotic Skill Acquistion via Instruction Augmentation with Vision-Language Models »
Ted Xiao · Harris Chan · Pierre Sermanet · Ayzaan Wahid · Anthony Brohan · Karol Hausman · Sergey Levine · Jonathan Tompson -
2022 : Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts »
Amrith Setlur · Don Dennis · Benjamin Eysenbach · Aditi Raghunathan · Chelsea Finn · Virginia Smith · Sergey Levine -
2022 : Contrastive Example-Based Control »
Kyle Hatch · Sarthak J Shetty · Benjamin Eysenbach · Tianhe Yu · Rafael Rafailov · Russ Salakhutdinov · Sergey Levine · Chelsea Finn -
2022 : A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning »
Benjamin Eysenbach · Matthieu Geist · Sergey Levine · Russ Salakhutdinov -
2022 : Contrastive Example-Based Control »
Kyle Hatch · Sarthak J Shetty · Benjamin Eysenbach · Tianhe Yu · Rafael Rafailov · Russ Salakhutdinov · Sergey Levine · Chelsea Finn -
2022 : A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning »
Benjamin Eysenbach · Matthieu Geist · Russ Salakhutdinov · Sergey Levine -
2022 : Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective »
Raj Ghugare · Homanga Bharadhwaj · Benjamin Eysenbach · Sergey Levine · Ruslan Salakhutdinov -
2022 : Interactive Language: Talking to Robots in Real Time »
Corey Lynch · Pete Florence · Jonathan Tompson · Ayzaan Wahid · Tianli Ding · James Betker · Robert Baruch · Travis Armstrong -
2022 : Robotic Skill Acquistion via Instruction Augmentation with Vision-Language Models »
Ted Xiao · Harris Chan · Pierre Sermanet · Ayzaan Wahid · Anthony Brohan · Karol Hausman · Sergey Levine · Jonathan Tompson -
2023 Workshop: 6th Robot Learning Workshop: Pretraining, Fine-Tuning, and Generalization with Large Scale Models »
Dhruv Shah · Paula Wulkop · Claas Voelcker · Georgia Chalvatzaki · Alex Bewley · Hamidreza Kasaei · Ransalu Senanayake · Julien PEREZ · Jonathan Tompson -
2023 Workshop: The NeurIPS 2023 Workshop on Goal-Conditioned Reinforcement Learning »
Benjamin Eysenbach · Ishan Durugkar · Jason Yecheng Ma · Andi Peng · Tongzhou Wang · Amy Zhang -
2022 Workshop: 5th Robot Learning Workshop: Trustworthy Robotics »
Alex Bewley · Roberto Calandra · Anca Dragan · Igor Gilitschenski · Emily Hannigan · Masha Itkina · Hamidreza Kasaei · Jens Kober · Danica Kragic · Nathan Lambert · Julien PEREZ · Fabio Ramos · Ransalu Senanayake · Jonathan Tompson · Vincent Vanhoucke · Markus Wulfmeier -
2022 Workshop: Foundation Models for Decision Making »
Mengjiao (Sherry) Yang · Yilun Du · Jack Parker-Holder · Siddharth Karamcheti · Igor Mordatch · Shixiang (Shane) Gu · Ofir Nachum -
2022 Poster: Oracle Inequalities for Model Selection in Offline Reinforcement Learning »
Jonathan N Lee · George Tucker · Ofir Nachum · Bo Dai · Emma Brunskill -
2022 Poster: Learning Options via Compression »
Yiding Jiang · Evan Liu · Benjamin Eysenbach · J. Zico Kolter · Chelsea Finn -
2022 Poster: Chain of Thought Imitation with Procedure Cloning »
Mengjiao (Sherry) Yang · Dale Schuurmans · Pieter Abbeel · Ofir Nachum -
2022 Poster: Multi-Game Decision Transformers »
Kuang-Huei Lee · Ofir Nachum · Mengjiao (Sherry) Yang · Lisa Lee · Daniel Freeman · Sergio Guadarrama · Ian Fischer · Winnie Xu · Eric Jang · Henryk Michalewski · Igor Mordatch -
2022 Poster: Adversarial Unlearning: Reducing Confidence Along Adversarial Directions »
Amrith Setlur · Benjamin Eysenbach · Virginia Smith · Sergey Levine -
2022 Poster: Mismatched No More: Joint Model-Policy Optimization for Model-Based RL »
Benjamin Eysenbach · Alexander Khazatsky · Sergey Levine · Russ Salakhutdinov -
2022 Poster: Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters »
Kamyar Ghasemipour · Shixiang (Shane) Gu · Ofir Nachum -
2022 Poster: Contrastive Learning as Goal-Conditioned Reinforcement Learning »
Benjamin Eysenbach · Tianjun Zhang · Sergey Levine · Russ Salakhutdinov -
2022 Poster: Imitating Past Successes can be Very Suboptimal »
Benjamin Eysenbach · Soumith Udatha · Russ Salakhutdinov · Sergey Levine -
2022 Poster: Improving Zero-Shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions »
Bogdan Mazoure · Ilya Kostrikov · Ofir Nachum · Jonathan Tompson -
2021 : Implicit Behavioral Cloning Q&A »
Pete Florence · Corey Lynch · Andy Zeng · Oscar Ramirez · Ayzaan Wahid · Laura Downs · Adrian Wong · Igor Mordatch · Jonathan Tompson -
2021 : Implicit Behavioral Cloning »
Pete Florence · Corey Lynch · Andy Zeng · Oscar Ramirez · Ayzaan Wahid · Laura Downs · Adrian Wong · Igor Mordatch · Jonathan Tompson -
2021 Oral: Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification »
Ben Eysenbach · Sergey Levine · Russ Salakhutdinov -
2021 Poster: Robust Predictable Control »
Ben Eysenbach · Russ Salakhutdinov · Sergey Levine -
2021 Poster: Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification »
Ben Eysenbach · Sergey Levine · Russ Salakhutdinov -
2020 : Contributed Talk: MaxEnt RL and Robust Control »
Benjamin Eysenbach · Sergey Levine -
2020 Poster: Weakly-Supervised Reinforcement Learning for Controllable Behavior »
Lisa Lee · Benjamin Eysenbach · Russ Salakhutdinov · Shixiang (Shane) Gu · Chelsea Finn -
2020 Poster: Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement »
Benjamin Eysenbach · XINYANG GENG · Sergey Levine · Russ Salakhutdinov -
2020 Oral: Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement »
Benjamin Eysenbach · XINYANG GENG · Sergey Levine · Russ Salakhutdinov -
2020 Poster: Deep Reinforcement and InfoMax Learning »
Bogdan Mazoure · Remi Tachet des Combes · Thang Long Doan · Philip Bachman · R Devon Hjelm -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Presentations »
Rahul Mehta · Andrew Lampinen · Binghong Chen · Sergio Pascual-Diaz · Jordi Grau-Moya · Aldo Faisal · Jonathan Tompson · Yiren Lu · Khimya Khetarpal · Martin Klissarov · Pierre-Luc Bacon · Doina Precup · Thanard Kurutach · Aviv Tamar · Pieter Abbeel · Jinke He · Maximilian Igl · Shimon Whiteson · Wendelin Boehmer · Raphaël Marinier · Olivier Pietquin · Karol Hausman · Sergey Levine · Chelsea Finn · Tianhe Yu · Lisa Lee · Benjamin Eysenbach · Emilio Parisotto · Eric Xing · Ruslan Salakhutdinov · Hongyu Ren · Anima Anandkumar · Deepak Pathak · Christopher Lu · Trevor Darrell · Alexei Efros · Phillip Isola · Feng Liu · Bo Han · Gang Niu · Masashi Sugiyama · Saurabh Kumar · Janith Petangoda · Johan Ferret · James McClelland · Kara Liu · Animesh Garg · Robert Lange -
2019 : Poster Session »
Matthia Sabatelli · Adam Stooke · Amir Abdi · Paulo Rauber · Leonard Adolphs · Ian Osband · Hardik Meisheri · Karol Kurach · Johannes Ackermann · Matt Benatan · GUO ZHANG · Chen Tessler · Dinghan Shen · Mikayel Samvelyan · Riashat Islam · Murtaza Dalal · Luke Harries · Andrey Kurenkov · Konrad Żołna · Sudeep Dasari · Kristian Hartikainen · Ofir Nachum · Kimin Lee · Markus Holzleitner · Vu Nguyen · Francis Song · Christopher Grimm · Felipe Leno da Silva · Yuping Luo · Yifan Wu · Alex Lee · Thomas Paine · Wei-Yang Qu · Daniel Graves · Yannis Flet-Berliac · Yunhao Tang · Suraj Nair · Matthew Hausknecht · Akhil Bagaria · Simon Schmitt · Bowen Baker · Paavo Parmas · Benjamin Eysenbach · Lisa Lee · Siyu Lin · Daniel Seita · Abhishek Gupta · Riley Simmons-Edler · Yijie Guo · Kevin Corder · Vikash Kumar · Scott Fujimoto · Adam Lerer · Ignasi Clavera Gilaberte · Nicholas Rhinehart · Ashvin Nair · Ge Yang · Lingxiao Wang · Sungryull Sohn · J. Fernando Hernandez-Garcia · Xian Yeow Lee · Rupesh Srivastava · Khimya Khetarpal · Chenjun Xiao · Luckeciano Carvalho Melo · Rishabh Agarwal · Tianhe Yu · Glen Berseth · Devendra Singh Chaplot · Jie Tang · Anirudh Srinivasan · Tharun Kumar Reddy Medini · Aaron Havens · Misha Laskin · Asier Mujika · Rohan Saphal · Joseph Marino · Alex Ray · Joshua Achiam · Ajay Mandlekar · Zhuang Liu · Danijar Hafner · Zhiwen Tang · Ted Xiao · Michael Walton · Jeff Druce · Ferran Alet · Zhang-Wei Hong · Stephanie Chan · Anusha Nagabandi · Hao Liu · Hao Sun · Ge Liu · Dinesh Jayaraman · John Co-Reyes · Sophia Sanborn -
2019 : Poster Spotlight 2 »
Aaron Sidford · Mengdi Wang · Lin Yang · Yinyu Ye · Zuyue Fu · Zhuoran Yang · Yongxin Chen · Zhaoran Wang · Ofir Nachum · Bo Dai · Ilya Kostrikov · Dale Schuurmans · Ziyang Tang · Yihao Feng · Lihong Li · Denny Zhou · Qiang Liu · Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Simon Du · Sham Kakade · Ruosong Wang · Minshuo Chen · Tianyi Liu · Xingguo Li · Zhaoran Wang · Tuo Zhao · Philip Amortila · Doina Precup · Prakash Panangaden · Marc Bellemare -
2019 : Contributed Talks »
Kevin Lu · Matthew Hausknecht · Ofir Nachum -
2019 : Coffee Break & Poster Session »
Samia Mohinta · Andrea Agostinelli · Alexandra Moringen · Jee Hang Lee · Yat Long Lo · Wolfgang Maass · Blue Sheffer · Colin Bredenberg · Benjamin Eysenbach · Liyu Xia · Efstratios Markou · Jan Lichtenberg · Pierre Richemond · Tony Zhang · JB Lanier · Baihan Lin · William Fedus · Glen Berseth · Marta Sarrico · Matthew Crosby · Stephen McAleer · Sina Ghiassian · Franz Scherr · Guillaume Bellec · Darjan Salaj · Arinbjörn Kolbeinsson · Matthew Rosenberg · Jaehoon Shin · Sang Wan Lee · Guillermo Cecchi · Irina Rish · Elias Hajek -
2019 Poster: Search on the Replay Buffer: Bridging Planning and Reinforcement Learning »
Benjamin Eysenbach · Russ Salakhutdinov · Sergey Levine -
2019 Poster: Unsupervised Curricula for Visual Meta-Reinforcement Learning »
Allan Jabri · Kyle Hsu · Abhishek Gupta · Benjamin Eysenbach · Sergey Levine · Chelsea Finn -
2019 Spotlight: Unsupervised Curricula for Visual Meta-Reinforcement Learning »
Allan Jabri · Kyle Hsu · Abhishek Gupta · Benjamin Eysenbach · Sergey Levine · Chelsea Finn -
2018 Poster: Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning »
Supasorn Suwajanakorn · Noah Snavely · Jonathan Tompson · Mohammad Norouzi -
2018 Oral: Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning »
Supasorn Suwajanakorn · Noah Snavely · Jonathan Tompson · Mohammad Norouzi -
2017 Poster: Bridging the Gap Between Value and Policy Based Reinforcement Learning »
Ofir Nachum · Mohammad Norouzi · Kelvin Xu · Dale Schuurmans