Timezone: »
Self-supervised pretraining has been extensively studied in language and vision domains, where a unified model can be easily adapted to various downstream tasks by pretraining representations without explicit labels. When it comes to sequential decision-making tasks, however, it is difficult to properly design such a pretraining approach that can cope with both high-dimensional perceptual information and the complexity of sequential control over long interaction horizons. The challenge becomes combinatorially more complex if we want to pretrain representations amenable to a large variety of tasks. To tackle this problem, in this work, we formulate a general pretraining-finetuning pipeline for sequential decision making, under which we propose a generic pretraining framework \textit{Self-supervised Multi-task pretrAining with contRol Transformer (SMART)}. By systematically investigating pretraining regimes, we carefully design a Control Transformer (CT) coupled with a novel control-centric pretraining objective in a self-supervised manner. SMART encourages the representation to capture the common essential information relevant to short-term control and long-term control, which is transferrable across tasks. We show by extensive experiments in DeepMind Control Suite that SMART significantly improves the learning efficiency among seen and unseen downstream tasks and domains under different learning scenarios including Imitation Learning (IL) and Reinforcement Learning (RL). Benefiting from the proposed control-centric objective, SMART is resilient to distribution shift between pretraining and finetuning, and even works well with low-quality pretraining datasets that are randomly collected.
Author Information
Yanchao Sun (University of Maryland, College Park)
shuang ma (Microsoft)
Ratnesh Madaan (Microsoft)
Rogerio Bonatti (Microsoft)
Furong Huang (University of Maryland)
Ashish Kapoor (Microsoft)
More from the Same Authors
-
2020 : Paper 64: Modeling Affect-based Intrinsic Rewards for Exploration and Learning »
Daniel McDuff · Ashish Kapoor -
2021 Spotlight: Representation Learning for Event-based Visuomotor Policies »
Sai Vemprala · Sami Mian · Ashish Kapoor -
2021 : Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL »
Yanchao Sun · Ruijie Zheng · Yongyuan Liang · Furong Huang -
2021 : Efficiently Improving the Robustness of RL Agents against Strongest Adversaries »
Yongyuan Liang · Yanchao Sun · Ruijie Zheng · Furong Huang -
2021 : Transfer RL across Observation Feature Spaces via Model-Based Regularization »
Yanchao Sun · Ruijie Zheng · Xiyao Wang · Andrew Cohen · Furong Huang -
2021 : Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL »
Yanchao Sun · Ruijie Zheng · Yongyuan Liang · Furong Huang -
2022 : PACT: Perception-Action Causal Transformer for Autoregressive Robotics Pretraining »
Rogerio Bonatti · Sai Vemprala · shuang ma · Felipe Vieira Frujeri · Shuhang Chen · Ashish Kapoor -
2022 : LATTE: LAnguage Trajectory TransformEr »
A Bucker · Luis Figueredo · Sami Haddadin · Ashish Kapoor · shuang ma · Sai Vemprala · Rogerio Bonatti -
2022 : Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning »
Souradip Chakraborty · Amrit Bedi · Alec Koppel · Furong Huang · Pratap Tokekar · Dinesh Manocha -
2022 : GFairHint: Improving Individual Fairness for Graph Neural Networks via Fairness Hint »
Paiheng Xu · Yuhang Zhou · Bang An · Wei Ai · Furong Huang -
2022 : Controllable Attack and Improved Adversarial Training in Multi-Agent Reinforcement Learning »
Xiangyu Liu · Souradip Chakraborty · Furong Huang -
2022 : Sketch-GNN: Scalable Graph Neural Networks with Sublinear Training Complexity »
Mucong Ding · Tahseen Rabbani · Bang An · Evan Wang · Furong Huang -
2022 : Faster Hyperparameter Search on Graphs via Calibrated Dataset Condensation »
Mucong Ding · Xiaoyu Liu · Tahseen Rabbani · Furong Huang -
2022 : DP-InstaHide: Data Augmentations Provably Enhance Guarantees Against Dataset Manipulations »
Eitan Borgnia · Jonas Geiping · Valeriia Cherepanova · Liam Fowl · Arjun Gupta · Amin Ghiasi · Furong Huang · Micah Goldblum · Tom Goldstein -
2022 : Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function »
Ruijie Zheng · Xiyao Wang · Huazhe Xu · Furong Huang -
2022 : Contributed Talk: Controllable Attack and Improved Adversarial Training in Multi-Agent Reinforcement Learning »
Xiangyu Liu · Souradip Chakraborty · Furong Huang -
2022 Spotlight: Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach »
Kaiwen Yang · Yanchao Sun · Jiahao Su · Fengxiang He · Xinmei Tian · Furong Huang · Tianyi Zhou · Dacheng Tao -
2022 : SWIFT: Rapid Decentralized Federated Learning via Wait-Free Model Communication »
Marco Bornstein · Tahseen Rabbani · Evan Wang · Amrit Bedi · Furong Huang -
2022 Poster: Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability »
Roman Levin · Manli Shu · Eitan Borgnia · Furong Huang · Micah Goldblum · Tom Goldstein -
2022 Poster: Sketch-GNN: Scalable Graph Neural Networks with Sublinear Training Complexity »
Mucong Ding · Tahseen Rabbani · Bang An · Evan Wang · Furong Huang -
2022 Poster: Distributional Reward Estimation for Effective Multi-agent Deep Reinforcement Learning »
Jifeng Hu · Yanchao Sun · Hechang Chen · Sili Huang · haiyin piao · Yi Chang · Lichao Sun -
2022 Poster: Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning »
Yongyuan Liang · Yanchao Sun · Ruijie Zheng · Furong Huang -
2022 Poster: Learning Modular Simulations for Homogeneous Systems »
Jayesh Gupta · Sai Vemprala · Ashish Kapoor -
2022 Poster: End-to-end Algorithm Synthesis with Recurrent Networks: Extrapolation without Overthinking »
Arpit Bansal · Avi Schwarzschild · Eitan Borgnia · Zeyad Emam · Furong Huang · Micah Goldblum · Tom Goldstein -
2022 Poster: Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach »
Kaiwen Yang · Yanchao Sun · Jiahao Su · Fengxiang He · Xinmei Tian · Furong Huang · Tianyi Zhou · Dacheng Tao -
2022 Poster: Transferring Fairness under Distribution Shifts via Fair Consistency Regularization »
Bang An · Zora Che · Mucong Ding · Furong Huang -
2022 Poster: 3DB: A Framework for Debugging Computer Vision Models »
Guillaume Leclerc · Hadi Salman · Andrew Ilyas · Sai Vemprala · Logan Engstrom · Vibhav Vineet · Kai Xiao · Pengchuan Zhang · Shibani Santurkar · Greg Yang · Ashish Kapoor · Aleksander Madry -
2021 : Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL »
Yanchao Sun · Ruijie Zheng · Yongyuan Liang · Furong Huang -
2021 : Efficiently Improving the Robustness of RL Agents against Strongest Adversaries »
Yongyuan Liang · Yanchao Sun · Ruijie Zheng · Furong Huang -
2021 Poster: Contrastive Learning of Global and Local Video Representations »
shuang ma · Zhaoyang Zeng · Daniel McDuff · Yale Song -
2021 Poster: Representation Learning for Event-based Visuomotor Policies »
Sai Vemprala · Sami Mian · Ashish Kapoor -
2021 Poster: Unadversarial Examples: Designing Objects for Robust Vision »
Hadi Salman · Andrew Ilyas · Logan Engstrom · Sai Vemprala · Aleksander Madry · Ashish Kapoor -
2020 Poster: Do Adversarially Robust ImageNet Models Transfer Better? »
Hadi Salman · Andrew Ilyas · Logan Engstrom · Ashish Kapoor · Aleksander Madry -
2020 Oral: Do Adversarially Robust ImageNet Models Transfer Better? »
Hadi Salman · Andrew Ilyas · Logan Engstrom · Ashish Kapoor · Aleksander Madry -
2020 Poster: Denoised Smoothing: A Provable Defense for Pretrained Classifiers »
Hadi Salman · Mingjie Sun · Greg Yang · Ashish Kapoor · J. Zico Kolter -
2020 Poster: Multi-Robot Collision Avoidance under Uncertainty with Probabilistic Safety Barrier Certificates »
Wenhao Luo · Wen Sun · Ashish Kapoor -
2020 Spotlight: Multi-Robot Collision Avoidance under Uncertainty with Probabilistic Safety Barrier Certificates »
Wenhao Luo · Wen Sun · Ashish Kapoor -
2019 : The Game of Drones Competition »
Charbel Toumieh · Sai Vemprala · Sangyun Shin · Rahul Kumar · Andrey Ivanov · Hyunchul Shim · Jose Martinez-Carranza · Nicholas Gyde · Ashish Kapoor · Keiko Nagami · Tim Taubner · Ratnesh Madaan · Antony Gillette · Paul Stubbs -
2019 : Lunch + Poster Session »
Frederik Gerzer · Bill Yang Cai · Pieter-Jan Hoedt · Kelly Kochanski · Soo Kyung Kim · Yunsung Lee · Sunghyun Park · Sharon Zhou · Martin Gauch · Jonathan Wilson · Joyjit Chatterjee · Shamindra Shrotriya · Dimitri Papadimitriou · Christian Schön · Valentina Zantedeschi · Gabriella Baasch · Willem Waegeman · Gautier Cosne · Dara Farrell · Brendan Lucier · Letif Mones · Caleb Robinson · Tafara Chitsiga · Victor Kristof · Hari Prasanna Das · Yimeng Min · Alexandra Puchko · Alexandra Luccioni · Kyle Story · Jason Hickey · Yue Hu · Björn Lütjens · Zhecheng Wang · Renzhi Jing · Genevieve Flaspohler · Jingfan Wang · Saumya Sinha · Qinghu Tang · Armi Tiihonen · Ruben Glatt · Muge Komurcu · Jan Drgona · Juan Gomez-Romero · Ashish Kapoor · Dylan J Fitzpatrick · Alireza Rezvanifar · Adrian Albert · Olya (Olga) Irzak · Kara Lamb · Ankur Mahesh · Kiwan Maeng · Frederik Kratzert · Sorelle Friedler · Niccolo Dalmasso · Alex Robson · Lindiwe Malobola · Lucas Maystre · Yu-wen Lin · Surya Karthik Mukkavili · Brian Hutchinson · Alexandre Lacoste · Yanbing Wang · Zhengcheng Wang · Yinda Zhang · Victoria Preston · Jacob Pettit · Draguna Vrabie · Miguel Molina-Solana · Tonio Buonassisi · Andrew Annex · Tunai P Marques · Catalin Voss · Johannes Rausch · Max Evans -
2019 Poster: Characterizing Bias in Classifiers using Generative Models »
Daniel McDuff · Shuang Ma · Yale Song · Ashish Kapoor -
2019 Poster: Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting »
Aditya Grover · Jiaming Song · Ashish Kapoor · Kenneth Tran · Alekh Agarwal · Eric Horvitz · Stefano Ermon -
2016 Poster: Quantum Perceptron Models »
Ashish Kapoor · Nathan Wiebe · Krysta Svore -
2015 : Machine Learning as Rotations (Quantum Deep Learning) »
Ashish Kapoor -
2012 Poster: Multilabel Classification using Bayesian Compressed Sensing »
Ashish Kapoor · Raajay Viswanathan · Prateek Jain -
2009 Workshop: Analysis and Design of Algorithms for Interactive Machine Learning »
Sumit Basu · Ashish Kapoor -
2009 Poster: Breaking Boundaries Between Induction Time and Diagnosis Time Active Information Acquisition »
Ashish Kapoor · Eric Horvitz