Poster
Offline Imitation Learning with a Misspecified Simulator
Shengyi Jiang · Jingcheng Pang · Yang Yu
In real-world decision-making tasks, learning an optimal policy without trial and error is an appealing challenge. When expert demonstrations are available, imitation learning, which mimics expert actions, can learn a good policy efficiently. Learning in simulators is another commonly adopted approach to avoid real-world trial and error. However, neither sufficient expert demonstrations nor high-fidelity simulators are easy to obtain. In this work, we investigate policy learning given only a few expert demonstrations and a simulator with misspecified dynamics. Under the mild assumption that local states remain partially aligned despite the dynamics mismatch, we propose imitation learning with horizon-adaptive inverse dynamics (HIDIL), which matches simulator states with expert states over an $H$-step horizon and accurately recovers actions using inverse dynamics policies. In the real environment, HIDIL can effectively derive adapted actions from the matched states. Experiments are conducted in four MuJoCo locomotion environments with modified friction, gravity, and density configurations. The results show that HIDIL achieves significant improvements in performance and stability in all of the real environments, compared with imitation learning methods and transfer methods from reinforcement learning.
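The core deployment idea described in the abstract — match the current state against expert states within an $H$-step window, then recover an action via an inverse dynamics model — can be illustrated with a minimal sketch. This is not the paper's implementation: the linear `inverse_dynamics` model, the `weights` parameter, and the nearest-state matching rule are all simplifying assumptions made purely for illustration.

```python
import numpy as np

def inverse_dynamics(state, target_state, weights):
    # Hypothetical linear inverse-dynamics model: maps the state
    # difference to an action that steers toward `target_state`.
    delta = target_state - state
    return weights @ delta

def act_with_horizon(state, expert_states, t, H, weights):
    # Look at the expert states up to H steps ahead of time t,
    # pick the one closest to the current state, and recover an
    # action toward it (a sketch of the horizon-adaptive matching
    # described in the abstract).
    window = expert_states[t + 1 : t + 1 + H]
    dists = np.linalg.norm(window - state, axis=1)
    target = window[np.argmin(dists)]
    return inverse_dynamics(state, target, weights)
```

In this toy form, a larger $H$ lets the agent re-anchor to a later expert state when the dynamics mismatch has pushed it off the demonstrated trajectory, rather than chasing an immediately unreachable next state.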
Author Information
Shengyi Jiang (Nanjing University)
Jingcheng Pang (Nanjing University)
Yang Yu (Nanjing University)
More from the Same Authors
- 2022 Poster: Efficient Multi-agent Communication via Self-supervised Information Aggregation
  Cong Guan · Feng Chen · Lei Yuan · Chenghe Wang · Hao Yin · Zongzhang Zhang · Yang Yu
- 2022: Multi-Agent Policy Transfer via Task Relationship Modeling
  Rong-Jun Qin · Feng Chen · Tonghan Wang · Lei Yuan · Xiaoran Wu · Yipeng Kang · Zongzhang Zhang · Chongjie Zhang · Yang Yu
- 2023 Poster: Imitation Learning from Imperfection: Theoretical Justifications and Algorithms
  Ziniu Li · Tian Xu · Zeyu Qin · Yang Yu · Zhi-Quan Luo
- 2023 Poster: Adversarial Counterfactual Environment Model Learning
  Xiong-Hui Chen · Yang Yu · Zhengmao Zhu · ZhiHua Yu · Chen Zhenjun · Chenghe Wang · Yinan Wu · Rong-Jun Qin · Hongqiu Wu · Ruijin Ding · Huang Fangsheng
- 2023 Poster: Natural Language-conditioned Reinforcement Learning with Task-related Language Development and Translation
  Jingcheng Pang · Xin-Yu Yang · Si-Hang Yang · Xiong-Hui Chen · Yang Yu
- 2023 Poster: Learning World Models with Identifiable Factorization
  Yuren Liu · Biwei Huang · Zhengmao Zhu · Honglong Tian · Mingming Gong · Yang Yu · Kun Zhang
- 2022 Spotlight: Lightning Talks 5A-3
  Minting Pan · Xiang Chen · Wenhan Huang · Can Chang · Zhecheng Yuan · Jianzhun Shao · Yushi Cao · Peihao Chen · Ke Xue · Zhengrong Xue · Zhiqiang Lou · Xiangming Zhu · Lei Li · Zhiming Li · Kai Li · Jiacheng Xu · Dongyu Ji · Ni Mu · Kun Shao · Tianpei Yang · Kunyang Lin · Ningyu Zhang · Yunbo Wang · Lei Yuan · Bo Yuan · Hongchang Zhang · Jiajun Wu · Tianze Zhou · Xueqian Wang · Ling Pan · Yuhang Jiang · Xiaokang Yang · Xiaozhuan Liang · Hao Zhang · Weiwen Hu · Miqing Li · YAN ZHENG · Matthew Taylor · Huazhe Xu · Shumin Deng · Chao Qian · YI WU · Shuncheng He · Wenbing Huang · Chuanqi Tan · Zongzhang Zhang · Yang Gao · Jun Luo · Yi Li · Xiangyang Ji · Thomas Li · Mingkui Tan · Fei Huang · Yang Yu · Huazhe Xu · Dongge Wang · Jianye Hao · Chuang Gan · Yang Liu · Luo Si · Hangyu Mao · Huajun Chen · Jianye Hao · Jun Wang · Xiaotie Deng
- 2022 Spotlight: Multi-agent Dynamic Algorithm Configuration
  Ke Xue · Jiacheng Xu · Lei Yuan · Miqing Li · Chao Qian · Zongzhang Zhang · Yang Yu
- 2022 Spotlight: Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning
  Chenyang Wu · Tianci Li · Zongzhang Zhang · Yang Yu
- 2022 Spotlight: Lightning Talks 4B-1
  Alexandra Senderovich · Zhijie Deng · Navid Ansari · Xuefei Ning · Yasmin Salehi · Xiang Huang · Chenyang Wu · Kelsey Allen · Jiaqi Han · Nikita Balagansky · Tatiana Lopez-Guevara · Tianci Li · Zhanhong Ye · Zixuan Zhou · Feng Zhou · Ekaterina Bulatova · Daniil Gavrilov · Wenbing Huang · Dennis Giannacopoulos · Hans-peter Seidel · Anton Obukhov · Kimberly Stachenfeld · Hongsheng Liu · Jun Zhu · Junbo Zhao · Hengbo Ma · Nima Vahidi Ferdowsi · Zongzhang Zhang · Vahid Babaei · Jiachen Li · Alvaro Sanchez Gonzalez · Yang Yu · Shi Ji · Maxim Rakhuba · Tianchen Zhao · Yiping Deng · Peter Battaglia · Josh Tenenbaum · Zidong Wang · Chuang Gan · Changcheng Tang · Jessica Hamrick · Kang Yang · Tobias Pfaff · Yang Li · Shuang Liang · Min Wang · Huazhong Yang · Haotian CHU · Yu Wang · Fan Yu · Bei Hua · Lei Chen · Bin Dong
- 2022 Poster: NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
  Rong-Jun Qin · Xingyuan Zhang · Songyi Gao · Xiong-Hui Chen · Zewen Li · Weinan Zhang · Yang Yu
- 2022 Poster: Multi-agent Dynamic Algorithm Configuration
  Ke Xue · Jiacheng Xu · Lei Yuan · Miqing Li · Chao Qian · Zongzhang Zhang · Yang Yu
- 2022 Poster: Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning
  Chenyang Wu · Tianci Li · Zongzhang Zhang · Yang Yu
- 2021: More Efficient Adversarial Imitation Learning Algorithms With Known and Unknown Transitions
  Tian Xu · Ziniu Li · Yang Yu
- 2021 Poster: Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning
  Xiong-Hui Chen · Shengyi Jiang · Feng Xu · Zongzhang Zhang · Yang Yu
- 2021 Poster: Regret Minimization Experience Replay in Off-Policy Reinforcement Learning
  Xu-Hui Liu · Zhenghai Xue · Jingcheng Pang · Shengyi Jiang · Feng Xu · Yang Yu
- 2020 Poster: Error Bounds of Imitating Policies and Environments
  Tian Xu · Ziniu Li · Yang Yu
- 2019 Poster: Bridging Machine Learning and Logical Reasoning by Abductive Learning
  Wang-Zhou Dai · Qiuling Xu · Yang Yu · Zhi-Hua Zhou
- 2017 Poster: Subset Selection under Noise
  Chao Qian · Jing-Cheng Shi · Yang Yu · Ke Tang · Zhi-Hua Zhou