Timezone: »
Offline reinforcement learning (RL) algorithms have shown promising results in domains where abundant pre-collected data is available. However, prior methods focus on solving individual problems from scratch with an offline dataset without considering how an offline RL agent can acquire multiple skills. We argue that a natural use case of offline RL is in settings where we can pool large amounts of data collected in various scenarios for solving different tasks, and utilize all of this data to learn behaviors for all the tasks more effectively rather than training each one in isolation. However, sharing data across all tasks in multi-task offline RL performs surprisingly poorly in practice. Thorough empirical analysis, we find that sharing data can actually exacerbate the distributional shift between the learned policy and the dataset, which in turn can lead to divergence of the learned policy and poor performance. To address this challenge, we develop a simple technique for data- sharing in multi-task offline RL that routes data based on the improvement over the task-specific data. We call this approach conservative data sharing (CDS), and it can be applied with multiple single-task offline RL methods. On a range of challenging multi-task locomotion, navigation, and vision-based robotic manipulation problems, CDS achieves the best or comparable performance compared to prior offline multi- task RL methods and previous data sharing approaches.
Author Information
Tianhe Yu (Stanford University)
Aviral Kumar (UC Berkeley)
Yevgen Chebotar (University of Southern California)
Karol Hausman (Google Brain)
Sergey Levine (UC Berkeley)
Chelsea Finn (Stanford)
More from the Same Authors
-
2021 Spotlight: Robust Predictable Control »
Ben Eysenbach · Russ Salakhutdinov · Sergey Levine -
2021 Spotlight: Offline Reinforcement Learning as One Big Sequence Modeling Problem »
Michael Janner · Qiyang Li · Sergey Levine -
2021 Spotlight: Efficiently Identifying Task Groupings for Multi-Task Learning »
Chris Fifty · Ehsan Amid · Zhe Zhao · Tianhe Yu · Rohan Anil · Chelsea Finn -
2021 Spotlight: Pragmatic Image Compression for Human-in-the-Loop Decision-Making »
Sid Reddy · Anca Dragan · Sergey Levine -
2021 : MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance »
Michael Luo · Ashwin Balakrishna · Brijen Thananjeyan · Suraj Nair · Julian Ibarz · Jie Tan · Chelsea Finn · Ion Stoica · Ken Goldberg -
2021 : Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets »
Frederik Ebert · Yanlai Yang · Karl Schmeckpeper · Bernadette Bucher · Kostas Daniilidis · Chelsea Finn · Sergey Levine -
2021 : Lifelong Robotic Reinforcement Learning by Retaining Experiences »
Annie Xie · Chelsea Finn -
2021 : Correct-N-Contrast: A Contrastive Approach for Improving Robustness to Spurious Correlations »
Michael Zhang · Nimit Sohoni · Hongyang Zhang · Chelsea Finn · Christopher Ré -
2021 : Extending the WILDS Benchmark for Unsupervised Adaptation »
Shiori Sagawa · Pang Wei Koh · Tony Lee · Irena Gao · Sang Michael Xie · Kendrick Shen · Ananya Kumar · Weihua Hu · Michihiro Yasunaga · Henrik Marklund · Sara Beery · Ian Stavness · Jure Leskovec · Kate Saenko · Tatsunori Hashimoto · Sergey Levine · Chelsea Finn · Percy Liang -
2021 : Test Time Robustification of Deep Models via Adaptation and Augmentation »
Marvin Zhang · Sergey Levine · Chelsea Finn -
2021 : The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks »
Rafael Rafailov · · Tianhe Yu · Avi Singh · Mariano Phielipp · Chelsea Finn -
2021 : Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning »
· Ted Xiao · Alexander Toshev · Sergey Levine · brian ichter -
2021 : Data Sharing without Rewards in Multi-Task Offline Reinforcement Learning »
Tianhe Yu · Aviral Kumar · Yevgen Chebotar · Chelsea Finn · Sergey Levine · Karol Hausman -
2021 : Should I Run Offline Reinforcement Learning or Behavioral Cloning? »
Aviral Kumar · Joey Hong · Anikait Singh · Sergey Levine -
2021 : DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization »
Aviral Kumar · Rishabh Agarwal · Tengyu Ma · Aaron Courville · George Tucker · Sergey Levine -
2021 : Offline Reinforcement Learning with In-sample Q-Learning »
Ilya Kostrikov · Ashvin Nair · Sergey Levine -
2021 : C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks »
Tianjun Zhang · Ben Eysenbach · Russ Salakhutdinov · Sergey Levine · Joseph Gonzalez -
2021 : The Information Geometry of Unsupervised Reinforcement Learning »
Ben Eysenbach · Russ Salakhutdinov · Sergey Levine -
2021 : Mismatched No More: Joint Model-Policy Optimization for Model-Based RL »
Ben Eysenbach · Alexander Khazatsky · Sergey Levine · Russ Salakhutdinov -
2021 : Offline Meta-Reinforcement Learning with Online Self-Supervision »
Vitchyr Pong · Ashvin Nair · Laura Smith · Catherine Huang · Sergey Levine -
2021 : Hybrid Imitative Planning with Geometric and Predictive Costs in Offroad Environments »
Daniel Shin · · Ali Agha · Nicholas Rhinehart · Sergey Levine -
2021 : CoMPS: Continual Meta Policy Search »
Glen Berseth · Zhiwei Zhang · Grace Zhang · Chelsea Finn · Sergey Levine -
2021 : Discriminator Augmented Model-Based Reinforcement Learning »
Allan Zhou · Archit Sharma · Chelsea Finn -
2021 : Curiosity with Chelsea Finn, Celeste Kidd, Timothy Verstynen »
Celeste Kidd · Chelsea Finn · Timothy Verstynen · Johnathan Flowers -
2021 : Example-Based Offline Reinforcement Learning without Rewards »
Kyle Hatch · Tianhe Yu · Rafael Rafailov · Chelsea Finn -
2021 : The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks »
Rafael Rafailov · · Tianhe Yu · Avi Singh · Mariano Phielipp · Chelsea Finn -
2021 : Lifelong Robotic Reinforcement Learning by Retaining Experiences »
Annie Xie · Chelsea Finn -
2021 : Speaker Intro »
Aviral Kumar · George Tucker -
2021 : Speaker Intro »
Aviral Kumar · George Tucker -
2021 : Speaker Intro »
Rishabh Agarwal · Aviral Kumar -
2021 : Speaker Intro »
Rishabh Agarwal · Aviral Kumar -
2021 Workshop: Offline Reinforcement Learning »
Rishabh Agarwal · Aviral Kumar · George Tucker · Justin Fu · Nan Jiang · Doina Precup · Aviral Kumar -
2021 : Opening Remarks »
Rishabh Agarwal · Aviral Kumar -
2021 : Discussion: Chelsea Finn, Masashi Sugiyama »
Chelsea Finn · Masashi Sugiyama -
2021 : Robustness through the Lens of Invariance »
Chelsea Finn -
2021 : Karol Hausman Talk Q&A »
Karol Hausman -
2021 : Invited Talk: Karol Hausman - Reinforcement Learning as a Data Sponge »
Karol Hausman -
2021 : Offline Meta-Reinforcement Learning with Online Self-Supervision Q&A »
Vitchyr Pong · Ashvin Nair · Laura Smith · Catherine Huang · Sergey Levine -
2021 : Offline Meta-Reinforcement Learning with Online Self-Supervision »
Vitchyr Pong · Ashvin Nair · Laura Smith · Catherine Huang · Sergey Levine -
2021 : Offline Meta-Reinforcement Learning with Online Self-Supervision »
Vitchyr Pong · Ashvin Nair · Laura Smith · Catherine Huang · Sergey Levine -
2021 : DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Q&A »
Aviral Kumar · Rishabh Agarwal · Tengyu Ma · Aaron Courville · George Tucker · Sergey Levine -
2021 : DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization »
Aviral Kumar · Rishabh Agarwal · Tengyu Ma · Aaron Courville · George Tucker · Sergey Levine -
2021 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · Chelsea Finn · David Silver · Matthew Taylor · Martha White · Srijita Das · Yuqing Du · Andrew Patterson · Manan Tomar · Olivia Watkins -
2021 Oral: Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification »
Ben Eysenbach · Sergey Levine · Russ Salakhutdinov -
2021 Poster: Visual Adversarial Imitation Learning using Variational Models »
Rafael Rafailov · Tianhe Yu · Aravind Rajeswaran · Chelsea Finn -
2021 Poster: Robust Predictable Control »
Ben Eysenbach · Russ Salakhutdinov · Sergey Levine -
2021 Poster: Efficiently Identifying Task Groupings for Multi-Task Learning »
Chris Fifty · Ehsan Amid · Zhe Zhao · Tianhe Yu · Rohan Anil · Chelsea Finn -
2021 Poster: Which Mutual-Information Representation Learning Objectives are Sufficient for Control? »
Kate Rakelly · Abhishek Gupta · Carlos Florensa · Sergey Levine -
2021 Poster: COMBO: Conservative Offline Model-Based Policy Optimization »
Tianhe Yu · Aviral Kumar · Rafael Rafailov · Aravind Rajeswaran · Sergey Levine · Chelsea Finn -
2021 Poster: Outcome-Driven Reinforcement Learning via Variational Inference »
Tim G. J. Rudner · Vitchyr Pong · Rowan McAllister · Yarin Gal · Sergey Levine -
2021 Poster: Bayesian Adaptation for Covariate Shift »
Aurick Zhou · Sergey Levine -
2021 Poster: Offline Reinforcement Learning as One Big Sequence Modeling Problem »
Michael Janner · Qiyang Li · Sergey Levine -
2021 Poster: Pragmatic Image Compression for Human-in-the-Loop Decision-Making »
Sid Reddy · Anca Dragan · Sergey Levine -
2021 Poster: Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification »
Ben Eysenbach · Sergey Levine · Russ Salakhutdinov -
2021 Poster: Information is Power: Intrinsic Control via Information Capture »
Nicholas Rhinehart · Jenny Wang · Glen Berseth · John Co-Reyes · Danijar Hafner · Chelsea Finn · Sergey Levine -
2021 Poster: Meta-learning with an Adaptive Task Scheduler »
Huaxiu Yao · Yu Wang · Ying Wei · Peilin Zhao · Mehrdad Mahdavi · Defu Lian · Chelsea Finn -
2021 Poster: Noether Networks: meta-learning useful conserved quantities »
Ferran Alet · Dylan Doblar · Allan Zhou · Josh Tenenbaum · Kenji Kawaguchi · Chelsea Finn -
2021 Poster: Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability »
Dibya Ghosh · Jad Rahme · Aviral Kumar · Amy Zhang · Ryan Adams · Sergey Levine -
2021 Poster: Differentiable Annealed Importance Sampling and the Perils of Gradient Noise »
Guodong Zhang · Kyle Hsu · Jianing Li · Chelsea Finn · Roger Grosse -
2021 Poster: Autonomous Reinforcement Learning via Subgoal Curricula »
Archit Sharma · Abhishek Gupta · Sergey Levine · Karol Hausman · Chelsea Finn -
2021 Poster: Adaptive Risk Minimization: Learning to Adapt to Domain Shift »
Marvin Zhang · Henrik Marklund · Nikita Dhawan · Abhishek Gupta · Sergey Levine · Chelsea Finn -
2020 : Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization »
Brandon Trabucco · Aviral Kumar · XINYANG GENG · Sergey Levine -
2020 : Conservative Objective Models: A Simple Approach to Effective Model-Based Optimization »
Brandon Trabucco · Aviral Kumar · XINYANG GENG · Sergey Levine -
2020 : Panel Discussion & Closing »
Yejin Choi · Alexei Efros · Chelsea Finn · Kristen Grauman · Quoc V Le · Yann LeCun · Ruslan Salakhutdinov · Eric Xing -
2020 : QA: Chelsea Finn »
Chelsea Finn -
2020 : Mini-panel discussion 3 - Prioritizing Real World RL Challenges »
Chelsea Finn · Thomas Dietterich · Angela Schoellig · Anca Dragan · Anusha Nagabandi · Doina Precup -
2020 : Invited Talk: Chelsea Finn »
Chelsea Finn -
2020 : Keynote: Chelsea Finn »
Chelsea Finn -
2020 : Panel »
Emma Brunskill · Nan Jiang · Nando de Freitas · Finale Doshi-Velez · Sergey Levine · John Langford · Lihong Li · George Tucker · Rishabh Agarwal · Aviral Kumar -
2020 Workshop: Offline Reinforcement Learning »
Aviral Kumar · Rishabh Agarwal · George Tucker · Lihong Li · Doina Precup · Aviral Kumar -
2020 : Introduction »
Aviral Kumar · George Tucker · Rishabh Agarwal -
2020 : Invited talk - Underfitting and Uncertainty in Self-Supervised Predictive Models »
Chelsea Finn -
2020 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · Chelsea Finn · Joelle Pineau · David Silver · Satinder Singh · Coline Devin · Misha Laskin · Kimin Lee · Janarthanan Rajendran · Vivek Veeriah -
2020 Poster: Weakly-Supervised Reinforcement Learning for Controllable Behavior »
Lisa Lee · Benjamin Eysenbach · Russ Salakhutdinov · Shixiang (Shane) Gu · Chelsea Finn -
2020 Poster: Continual Learning of Control Primitives : Skill Discovery via Reset-Games »
Kelvin Xu · Siddharth Verma · Chelsea Finn · Sergey Levine -
2020 Poster: Gradient Surgery for Multi-Task Learning »
Tianhe Yu · Saurabh Kumar · Abhishek Gupta · Sergey Levine · Karol Hausman · Chelsea Finn -
2020 Poster: Continuous Meta-Learning without Tasks »
James Harrison · Apoorva Sharma · Chelsea Finn · Marco Pavone -
2020 Poster: One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL »
Saurabh Kumar · Aviral Kumar · Sergey Levine · Chelsea Finn -
2020 Poster: Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors »
Karl Pertsch · Oleh Rybkin · Frederik Ebert · Shenghao Zhou · Dinesh Jayaraman · Chelsea Finn · Sergey Levine -
2020 Poster: MOPO: Model-based Offline Policy Optimization »
Tianhe Yu · Garrett Thomas · Lantao Yu · Stefano Ermon · James Zou · Sergey Levine · Chelsea Finn · Tengyu Ma -
2020 Tutorial: (Track3) Offline Reinforcement Learning: From Algorithm Design to Practical Applications »
Sergey Levine · Aviral Kumar -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Presentations »
Rahul Mehta · Andrew Lampinen · Binghong Chen · Sergio Pascual-Diaz · Jordi Grau-Moya · Aldo Faisal · Jonathan Tompson · Yiren Lu · Khimya Khetarpal · Martin Klissarov · Pierre-Luc Bacon · Doina Precup · Thanard Kurutach · Aviv Tamar · Pieter Abbeel · Jinke He · Maximilian Igl · Shimon Whiteson · Wendelin Boehmer · Raphaël Marinier · Olivier Pietquin · Karol Hausman · Sergey Levine · Chelsea Finn · Tianhe Yu · Lisa Lee · Benjamin Eysenbach · Emilio Parisotto · Eric Xing · Ruslan Salakhutdinov · Hongyu Ren · Anima Anandkumar · Deepak Pathak · Christopher Lu · Trevor Darrell · Alexei Efros · Phillip Isola · Feng Liu · Bo Han · Gang Niu · Masashi Sugiyama · Saurabh Kumar · Janith Petangoda · Johan Ferret · James McClelland · Kara Liu · Animesh Garg · Robert Lange -
2019 Workshop: Learning with Rich Experience: Integration of Learning Paradigms »
Zhiting Hu · Andrew Wilson · Chelsea Finn · Lisa Lee · Taylor Berg-Kirkpatrick · Ruslan Salakhutdinov · Eric Xing -
2019 Poster: Graph Normalizing Flows »
Jenny Liu · Aviral Kumar · Jimmy Ba · Jamie Kiros · Kevin Swersky -
2019 Poster: Language as an Abstraction for Hierarchical Deep Reinforcement Learning »
YiDing Jiang · Shixiang (Shane) Gu · Kevin Murphy · Chelsea Finn -
2018 : Spotlight Talks I »
Juan Leni · Michael Spranger · Ben Bogin · Shane Steinert-Threlkeld · Nicholas Tomlin · Fushan Li · Michael Noukhovitch · Tushar Jain · Jason Lee · Yen-Ling Kuo · Josefina Correa · Karol Hausman -
2018 : Spotlights »
Guangneng Hu · Ke Li · Aviral Kumar · Phi Vu Tran · Samuel Fadel · Rita Kuznetsova · Bong-Nam Kang · Behrouz Haji Soleimani · Jinwon An · Nathan de Lara · Anjishnu Kumar · Tillman Weyde · Melanie Weber · Kristen Altenburger · Saeed Amizadeh · Xiaoran Xu · Yatin Nandwani · Yang Guo · Maria Pacheco · William Fedus · Guillaume Jaume · Yuka Yoneda · Yunpu Ma · Yunsheng Bai · Berk Kapicioglu · Maximilian Nickel · Fragkiskos Malliaros · Beier Zhu · Aleksandar Bojchevski · Joshua Joseph · Gemma Roig · Esma Balkir · Xander Steenbrugge -
2017 Poster: Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets »
Karol Hausman · Yevgen Chebotar · Stefan Schaal · Gaurav Sukhatme · Joseph Lim -
2017 Demonstration: Deep Robotic Learning using Visual Imagination and Meta-Learning »
Chelsea Finn · Frederik Ebert · Tianhe Yu · Annie Xie · Sudeep Dasari · Pieter Abbeel · Sergey Levine