While agents trained by Reinforcement Learning (RL) can solve increasingly challenging tasks directly from visual observations, generalizing learned skills to novel environments remains very challenging. Extensive use of data augmentation is a promising technique for improving generalization in RL, but it is often found to decrease sample efficiency and can even lead to divergence. In this paper, we investigate causes of instability when using data augmentation in common off-policy RL algorithms. We identify two problems, both rooted in high-variance Q-targets. Based on our findings, we propose a simple yet effective technique for stabilizing this class of algorithms under augmentation. We perform extensive empirical evaluation of image-based RL using both ConvNets and Vision Transformers (ViT) on a family of benchmarks based on DeepMind Control Suite, as well as in robotic manipulation tasks. Our method greatly improves stability and sample efficiency of ConvNets under augmentation, and achieves generalization results competitive with state-of-the-art methods for image-based RL in environments with unseen visuals. We further show that our method scales to RL with ViT-based architectures, and that data augmentation may be especially important in this setting.
Author Information
Nicklas Hansen (UC San Diego)
Hao Su (Stanford)
Xiaolong Wang (UC San Diego)
More from the Same Authors
-
2021 : ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations »
Tongzhou Mu · Zhan Ling · Fanbo Xiang · Derek Yang · Xuanlin Li · Stone Tao · Zhiao Huang · Zhiwei Jia · Hao Su -
2021 : From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation »
Yuzhe Qin · Hao Su · Xiaolong Wang -
2021 : Vision-Guided Quadrupedal Locomotion in the Wild with Multi-Modal Delay Randomization »
Minghao Zhang · Ruihan Yang · Yuzhe Qin · Xiaolong Wang -
2021 : Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers »
Ruihan Yang · Minghao Zhang · Nicklas Hansen · Huazhe Xu · Xiaolong Wang -
2021 : Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation »
Rishabh Jangir · Nicklas Hansen · Xiaolong Wang -
2021 : Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation »
Rishabh Jangir · Nicklas Hansen · Mohit Jain · Xiaolong Wang -
2022 : On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning »
Yifan Xu · Nicklas Hansen · Zirui Wang · Yung-Chieh Chan · Hao Su · Zhuowen Tu -
2022 : Visual Reinforcement Learning with Self-Supervised 3D Representations »
Yanjie Ze · Nicklas Hansen · Yinbo Chen · Mohit Jain · Xiaolong Wang -
2022 : MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations »
Nicklas Hansen · Yixin Lin · Hao Su · Xiaolong Wang · Vikash Kumar · Aravind Rajeswaran -
2022 : Graph Inverse Reinforcement Learning from Diverse Videos »
Sateesh Kumar · Jonathan Zamora · Nicklas Hansen · Rishabh Jangir · Xiaolong Wang -
2021 Poster: Multi-Person 3D Motion Prediction with Multi-Range Transformers »
Jiashun Wang · Huazhe Xu · Medhini Narasimhan · Xiaolong Wang -
2021 Poster: NovelD: A Simple yet Effective Exploration Criterion »
Tianjun Zhang · Huazhe Xu · Xiaolong Wang · Yi Wu · Kurt Keutzer · Joseph Gonzalez · Yuandong Tian -
2021 Poster: Particle Cloud Generation with Message Passing Generative Adversarial Networks »
Raghav Kansal · Javier Duarte · Hao Su · Breno Orzari · Thiago Tomei · Maurizio Pierini · Mary Touranakou · Jean-Roch Vlimant · Dimitrios Gunopulos -
2021 Poster: Test-Time Personalization with a Transformer for Human Pose Estimation »
Yizhuo Li · Miao Hao · Zonglin Di · Nitesh Bharadwaj Gundavarapu · Xiaolong Wang -
2017 Poster: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space »
Charles Ruizhongtai Qi · Li Yi · Hao Su · Leonidas Guibas -
2014 Poster: Deep Joint Task Learning for Generic Object Extraction »
Xiaolong Wang · Liliang Zhang · Liang Lin · Zhujin Liang · Wangmeng Zuo -
2012 Poster: Dynamical And-Or Graph Learning for Object Shape Modeling and Detection »
Xiaolong Wang · Liang Lin