We study the problem of policy parameterization for reinforcement learning (RL) with high-dimensional continuous action spaces. Our goal is to find a good way to parameterize the policy of continuous RL as a multimodal distribution. To this end, we propose to treat the continuous RL policy as a generative model over the distribution of optimal trajectories. We use a diffusion-process-like strategy to model the policy and derive a novel variational bound, which serves as the optimization objective for learning the policy. To maximize this objective with gradient-based optimization, we introduce the Reparameterized Policy Gradient Theorem. This theorem connects the classical REINFORCE method with trajectory return optimization for computing the gradient of a policy. Moreover, our method enjoys strong exploration ability due to the multimodal policy parameterization; notably, when a strong differentiable world model is available, our method also enjoys the fast convergence of trajectory optimization. We evaluate our method on numerical problems and on manipulation tasks within a differentiable simulator. Qualitative results show its ability to capture the multimodal distribution of optimal trajectories, and quantitative results show that it can avoid local optima and outperform baseline approaches.
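As a rough illustration of the two gradient estimators the abstract refers to, the sketch below compares a score-function (REINFORCE) estimate with a reparameterized (pathwise) estimate on a one-dimensional toy problem. It is not the paper's algorithm: the Gaussian "policy", the quadratic `reward`, and all function names are assumptions chosen for illustration, and the pathwise estimator only loosely stands in for differentiating a return through a differentiable model.

```python
import numpy as np

# Hypothetical toy reward, differentiable in the action x (an assumption,
# not taken from the paper). Its maximum is at x = 3.
def reward(x):
    return -(x - 3.0) ** 2

def reward_grad(x):
    # Analytic derivative of the toy reward with respect to x.
    return -2.0 * (x - 3.0)

def gradient_samples(mu=0.0, sigma=1.0, n=200_000, seed=0):
    """Per-sample estimates of d/dmu E[reward(x)] for x ~ N(mu, sigma^2)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    x = mu + sigma * eps  # reparameterized sample: x is a function of mu

    # REINFORCE: reward times the score, d/dmu log N(x; mu, sigma^2).
    score = (x - mu) / sigma**2
    reinforce = reward(x) * score

    # Pathwise: differentiate the reward through the sample (dx/dmu = 1).
    pathwise = reward_grad(x)
    return reinforce, pathwise

reinforce_samples, pathwise_samples = gradient_samples()

# The exact gradient at mu = 0 is -2 * (mu - 3) = 6.
print("REINFORCE mean:", reinforce_samples.mean(), "variance:", reinforce_samples.var())
print("pathwise  mean:", pathwise_samples.mean(), "variance:", pathwise_samples.var())
```

Both estimators target the same gradient, but in this toy setting the pathwise samples have much lower variance, which is one common motivation for using a differentiable model when one is available.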
Author Information
Zhiao Huang (University of California San Diego)
Litian Liang (University of California, San Diego)
Zhan Ling (UC San Diego)
Xuanlin Li (University of California, San Diego)
Chuang Gan (UMass Amherst/ MIT-IBM Watson AI Lab)
Hao Su (UCSD)
More from the Same Authors
-
2021 : ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation »
Chuang Gan · Jeremy Schwartz · Seth Alter · Damian Mrowca · Martin Schrimpf · James Traer · Julian De Freitas · Jonas Kubilius · Abhishek Bhandwaldar · Nick Haber · Megumi Sano · Kuno Kim · Elias Wang · Michael Lingelbach · Aidan Curtis · Kevin Feigelis · Daniel Bear · Dan Gutfreund · David Cox · Antonio Torralba · James J DiCarlo · Josh Tenenbaum · Josh McDermott · Dan Yamins -
2021 : STAR: A Benchmark for Situated Reasoning in Real-World Videos »
Bo Wu · Shoubin Yu · Zhenfang Chen · Josh Tenenbaum · Chuang Gan -
2021 : ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations »
Tongzhou Mu · Zhan Ling · Fanbo Xiang · Derek Yang · Xuanlin Li · Stone Tao · Zhiao Huang · Zhiwei Jia · Hao Su -
2021 : From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation »
Yuzhe Qin · Hao Su · Xiaolong Wang -
2022 Poster: Learning Physical Dynamics with Subequivariant Graph Neural Networks »
Jiaqi Han · Wenbing Huang · Hengbo Ma · Jiachen Li · Josh Tenenbaum · Chuang Gan -
2022 Poster: SNAKE: Shape-aware Neural 3D Keypoint Field »
Chengliang Zhong · Peixing You · Xiaoxue Chen · Hao Zhao · Fuchun Sun · Guyue Zhou · Xiaodong Mu · Chuang Gan · Wenbing Huang -
2022 : Planning with Large Language Models for Code Generation »
Shun Zhang · Zhenfang Chen · Yikang Shen · Mingyu Ding · Josh Tenenbaum · Chuang Gan -
2022 : On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning »
yifan xu · Nicklas Hansen · Zirui Wang · Yung-Chieh Chan · Hao Su · Zhuowen Tu -
2022 : Hyper-Decision Transformer for Efficient Online Policy Adaptation »
Mengdi Xu · Yuchen Lu · Yikang Shen · Shun Zhang · DING ZHAO · Chuang Gan -
2022 : Generalizable Point Cloud Reinforcement Learning for Sim-to-Real Dexterous Manipulation »
Yuzhe Qin · Binghao Huang · Zhao-Heng Yin · Hao Su · Xiaolong Wang -
2022 : Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization »
Stone Tao · Xiaochen Li · Tongzhou Mu · Zhiao Huang · Yuzhe Qin · Hao Su -
2022 : Multi-skill Mobile Manipulation for Object Rearrangement »
Jiayuan Gu · Devendra Singh Chaplot · Hao Su · Jitendra Malik -
2022 : MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations »
Nicklas Hansen · Yixin Lin · Hao Su · Xiaolong Wang · Vikash Kumar · Aravind Rajeswaran -
2022 Spotlight: Lightning Talks 6A-3 »
Junyu Xie · Chengliang Zhong · Ali Ayub · Sravanti Addepalli · Harsh Rangwani · Jiapeng Tang · Yuchen Rao · Zhiying Jiang · Yuqi Wang · Xingzhe He · Gene Chou · Ilya Chugunov · Samyak Jain · Yuntao Chen · Weidi Xie · Sumukh K Aithal · Carter Fendley · Lev Markhasin · Yiqin Dai · Peixing You · Bastian Wandt · Yinyu Nie · Helge Rhodin · Felix Heide · Ji Xin · Angela Dai · Andrew Zisserman · Bi Wang · Xiaoxue Chen · Mayank Mishra · ZHAO-XIANG ZHANG · Venkatesh Babu R · Justus Thies · Ming Li · Hao Zhao · Venkatesh Babu R · Jimmy Lin · Fuchun Sun · Matthias Niessner · Guyue Zhou · Xiaodong Mu · Chuang Gan · Wenbing Huang -
2022 Spotlight: SNAKE: Shape-aware Neural 3D Keypoint Field »
Chengliang Zhong · Peixing You · Xiaoxue Chen · Hao Zhao · Fuchun Sun · Guyue Zhou · Xiaodong Mu · Chuang Gan · Wenbing Huang -
2022 Spotlight: Lightning Talks 5A-3 »
Minting Pan · Xiang Chen · Wenhan Huang · Can Chang · Zhecheng Yuan · Jianzhun Shao · Yushi Cao · Peihao Chen · Ke Xue · Zhengrong Xue · Zhiqiang Lou · Xiangming Zhu · Lei Li · Zhiming Li · Kai Li · Jiacheng Xu · Dongyu Ji · Ni Mu · Kun Shao · Tianpei Yang · Kunyang Lin · Ningyu Zhang · Yunbo Wang · Lei Yuan · Bo Yuan · Hongchang Zhang · Jiajun Wu · Tianze Zhou · Xueqian Wang · Ling Pan · Yuhang Jiang · Xiaokang Yang · Xiaozhuan Liang · Hao Zhang · Weiwen Hu · Miqing Li · YAN ZHENG · Matthew Taylor · Huazhe Xu · Shumin Deng · Chao Qian · YI WU · Shuncheng He · Wenbing Huang · Chuanqi Tan · Zongzhang Zhang · Yang Gao · Jun Luo · Yi Li · Xiangyang Ji · Thomas Li · Mingkui Tan · Fei Huang · Yang Yu · Huazhe Xu · Dongge Wang · Jianye Hao · Chuang Gan · Yang Liu · Luo Si · Hangyu Mao · Huajun Chen · Jianye Hao · Jun Wang · Xiaotie Deng -
2022 Spotlight: Learning Active Camera for Multi-Object Navigation »
Peihao Chen · Dongyu Ji · Kunyang Lin · Weiwen Hu · Wenbing Huang · Thomas Li · Mingkui Tan · Chuang Gan -
2022 Spotlight: Lightning Talks 4B-3 »
Zicheng Zhang · Mancheng Meng · Antoine Guedon · Yue Wu · Wei Mao · Zaiyu Huang · Peihao Chen · Shizhe Chen · yongwei chen · Keqiang Sun · Yi Zhu · chen rui · Hanhui Li · Dongyu Ji · Ziyan Wu · miaomiao Liu · Pascal Monasse · Yu Deng · Shangzhe Wu · Pierre-Louis Guhur · Jiaolong Yang · Kunyang Lin · Makarand Tapaswi · Zhaoyang Huang · Terrence Chen · Jiabao Lei · Jianzhuang Liu · Vincent Lepetit · Zhenyu Xie · Richard I Hartley · Dinggang Shen · Xiaodan Liang · Runhao Zeng · Cordelia Schmid · Michael Kampffmeyer · Mathieu Salzmann · Ning Zhang · Fangyun Wei · Yabin Zhang · Fan Yang · Qifeng Chen · Wei Ke · Quan Wang · Thomas Li · qingling Cai · Kui Jia · Ivan Laptev · Mingkui Tan · Xin Tong · Hongsheng Li · Xiaodan Liang · Chuang Gan -
2022 Spotlight: Learning Physical Dynamics with Subequivariant Graph Neural Networks »
Jiaqi Han · Wenbing Huang · Hengbo Ma · Jiachen Li · Josh Tenenbaum · Chuang Gan -
2022 Spotlight: Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation »
Peihao Chen · Dongyu Ji · Kunyang Lin · Runhao Zeng · Thomas Li · Mingkui Tan · Chuang Gan -
2022 Spotlight: Lightning Talks 4B-1 »
Alexandra Senderovich · Zhijie Deng · Navid Ansari · Xuefei Ning · Yasmin Salehi · Xiang Huang · Chenyang Wu · Kelsey Allen · Jiaqi Han · Nikita Balagansky · Tatiana Lopez-Guevara · Tianci Li · Zhanhong Ye · Zixuan Zhou · Feng Zhou · Ekaterina Bulatova · Daniil Gavrilov · Wenbing Huang · Dennis Giannacopoulos · Hans-peter Seidel · Anton Obukhov · Kimberly Stachenfeld · Hongsheng Liu · Jun Zhu · Junbo Zhao · Hengbo Ma · Nima Vahidi Ferdowsi · Zongzhang Zhang · Vahid Babaei · Jiachen Li · Alvaro Sanchez Gonzalez · Yang Yu · Shi Ji · Maxim Rakhuba · Tianchen Zhao · Yiping Deng · Peter Battaglia · Josh Tenenbaum · Zidong Wang · Chuang Gan · Changcheng Tang · Jessica Hamrick · Kang Yang · Tobias Pfaff · Yang Li · Shuang Liang · Min Wang · Huazhong Yang · Haotian CHU · Yu Wang · Fan Yu · Bei Hua · Lei Chen · Bin Dong -
2022 Poster: 3D Concept Grounding on Neural Fields »
Yining Hong · Yilun Du · Chunru Lin · Josh Tenenbaum · Chuang Gan -
2022 Poster: Learning Active Camera for Multi-Object Navigation »
Peihao Chen · Dongyu Ji · Kunyang Lin · Weiwen Hu · Wenbing Huang · Thomas Li · Mingkui Tan · Chuang Gan -
2022 Poster: Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation »
Peihao Chen · Dongyu Ji · Kunyang Lin · Runhao Zeng · Thomas Li · Mingkui Tan · Chuang Gan -
2022 Poster: Learning Neural Acoustic Fields »
Andrew Luo · Yilun Du · Michael Tarr · Josh Tenenbaum · Antonio Torralba · Chuang Gan -
2022 Poster: On-Device Training Under 256KB Memory »
Ji Lin · Ligeng Zhu · Wei-Ming Chen · Wei-Chen Wang · Chuang Gan · Song Han -
2021 Poster: Memory-efficient Patch-based Inference for Tiny Deep Learning »
Ji Lin · Wei-Ming Chen · Han Cai · Chuang Gan · Song Han -
2021 Poster: Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language »
Mingyu Ding · Zhenfang Chen · Tao Du · Ping Luo · Josh Tenenbaum · Chuang Gan -
2021 Poster: Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation »
Nicklas Hansen · Hao Su · Xiaolong Wang -
2021 Poster: PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning »
Yining Hong · Li Yi · Josh Tenenbaum · Antonio Torralba · Chuang Gan -
2021 Poster: Particle Cloud Generation with Message Passing Generative Adversarial Networks »
Raghav Kansal · Javier Duarte · Hao Su · Breno Orzari · Thiago Tomei · Maurizio Pierini · Mary Touranakou · jean-roch vlimant · Dimitrios Gunopulos -
2021 Poster: When does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning? »
Lijie Fan · Sijia Liu · Pin-Yu Chen · Gaoyuan Zhang · Chuang Gan -
2020 Poster: MCUNet: Tiny Deep Learning on IoT Devices »
Ji Lin · Wei-Ming Chen · Yujun Lin · john cohn · Chuang Gan · Song Han -
2020 Poster: Towards Scale-Invariant Graph-related Problem Solving by Iterative Homogeneous GNNs »
Hao Tang · Zhiao Huang · Jiayuan Gu · Bao-Liang Lu · Hao Su -
2020 Spotlight: MCUNet: Tiny Deep Learning on IoT Devices »
Ji Lin · Wei-Ming Chen · Yujun Lin · john cohn · Chuang Gan · Song Han -
2020 Poster: TinyTL: Reduce Memory, Not Parameters for Efficient On-Device Learning »
Han Cai · Chuang Gan · Ligeng Zhu · Song Han -
2020 Poster: Multi-task Batch Reinforcement Learning with Metric Learning »
Jiachen Li · Quan Vuong · Shuang Liu · Minghua Liu · Kamil Ciosek · Henrik Christensen · Hao Su -
2020 Poster: Refactoring Policy for Compositional Generalizability using Self-Supervised Object Proposals »
Tongzhou Mu · Jiayuan Gu · Zhiwei Jia · Hao Tang · Hao Su -
2020 : Neurosymbolic Visual Reasoning »
Chuang Gan -
2019 Poster: Cross-channel Communication Networks »
Jianwei Yang · Zhile Ren · Chuang Gan · Hongyuan Zhu · Devi Parikh -
2019 Poster: Visual Concept-Metaconcept Learning »
Chi Han · Jiayuan Mao · Chuang Gan · Josh Tenenbaum · Jiajun Wu -
2019 Poster: Mapping State Space using Landmarks for Universal Goal Reaching »
Zhiao Huang · Fangchen Liu · Hao Su -
2019 Poster: Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement »
Chao Yang · Xiaojian Ma · Wenbing Huang · Fuchun Sun · Huaping Liu · Junzhou Huang · Chuang Gan -
2019 Spotlight: Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement »
Chao Yang · Xiaojian Ma · Wenbing Huang · Fuchun Sun · Huaping Liu · Junzhou Huang · Chuang Gan -
2018 Poster: Deep Functional Dictionaries: Learning Consistent Semantic Structures on 3D Models from Functions »
Minhyuk Sung · Hao Su · Ronald Yu · Leonidas Guibas -
2018 Poster: Weakly Supervised Dense Event Captioning in Videos »
Xin Wang · Wenbing Huang · Chuang Gan · Jingdong Wang · Wenwu Zhu · Junzhou Huang -
2018 Poster: Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding »
Kexin Yi · Jiajun Wu · Chuang Gan · Antonio Torralba · Pushmeet Kohli · Josh Tenenbaum -
2018 Spotlight: Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding »
Kexin Yi · Jiajun Wu · Chuang Gan · Antonio Torralba · Pushmeet Kohli · Josh Tenenbaum -
2017 Poster: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space »
Charles Ruizhongtai Qi · Li Yi · Hao Su · Leonidas Guibas -
2016 Poster: FPNN: Field Probing Neural Networks for 3D Data »
Yangyan Li · Soeren Pirk · Hao Su · Charles R Qi · Leonidas Guibas -
2010 Poster: Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification »
Li-Jia Li · Hao Su · Eric Xing · Li Fei-Fei