We consider the imitation learning problem of learning a policy in a Markov Decision Process (MDP) setting where the reward function is not given, but demonstrations from experts are available. Although the goal of imitation learning is to learn a policy that produces behaviors nearly as good as the experts’ for a desired task, assumptions of consistent optimality for demonstrated behaviors are often violated in practice. Finding a policy that is distributionally robust against noisy demonstrations based on an adversarial construction potentially solves this problem by avoiding optimistic generalizations of the demonstrated data. This paper studies Distributionally Robust Imitation Learning (DRoIL) and establishes a close connection between DRoIL and Maximum Entropy Inverse Reinforcement Learning. We show that DRoIL can be seen as a framework that maximizes a generalized concept of entropy. We develop a novel approach to transform the objective function into a convex optimization problem over a polynomial number of variables for a class of loss functions that are additive over state and action spaces. Our approach lets us optimize both stationary and non-stationary policies and, unlike prevalent previous methods, it does not require repeatedly solving an inner reinforcement learning problem. We experimentally show the significant benefits of DRoIL’s new optimization method on synthetic data and a highway driving environment.
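To make the abstract's remark about losses that are "additive over state and action spaces" concrete, here is a minimal sketch of a standard fact about finite MDPs. It is not the authors' DRoIL formulation. The expected value of an additive loss under a non-stationary policy is a linear function of the state-action occupancy measure, which has only T·S·A entries, whereas the number of trajectories grows as (S·A)^T. The toy MDP, the policy, and all function names below are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy MDP: 2 states, 2 actions, horizon 3 (all numbers are made up).
S, A, T = 2, 2, 3
p0 = [0.6, 0.4]                       # initial state distribution
P = [[[0.9, 0.1], [0.2, 0.8]],        # P[s][a][s_next]
     [[0.5, 0.5], [0.3, 0.7]]]
loss = [[0.0, 1.0], [2.0, 0.5]]       # loss[s][a], summed over time steps
pi = [[[0.7, 0.3], [0.4, 0.6]],       # non-stationary policy pi[t][s][a]
      [[0.5, 0.5], [0.1, 0.9]],
      [[0.2, 0.8], [1.0, 0.0]]]


def occupancy(p0, P, pi):
    """Forward recursion for mu[t][s][a] = Pr(s_t = s, a_t = a)."""
    d = list(p0)  # state marginal at the current time step
    mu = []
    for t in range(T):
        mu_t = [[d[s] * pi[t][s][a] for a in range(A)] for s in range(S)]
        mu.append(mu_t)
        # Push the state-action marginal through the transition model.
        d = [sum(mu_t[s][a] * P[s][a][s2] for s in range(S) for a in range(A))
             for s2 in range(S)]
    return mu


def expected_loss_from_occupancy(mu, loss):
    """Expected additive loss as a linear function of the T*S*A occupancies."""
    return sum(mu[t][s][a] * loss[s][a]
               for t in range(T) for s in range(S) for a in range(A))


def expected_loss_by_enumeration(p0, P, pi, loss):
    """Same quantity by brute force over all (S*A)**T trajectories."""
    total = 0.0
    for traj in product(range(S * A), repeat=T):
        steps = [divmod(x, A) for x in traj]  # decode each index into (s, a)
        prob = p0[steps[0][0]]
        for t, (s, a) in enumerate(steps):
            prob *= pi[t][s][a]
            if t + 1 < T:
                prob *= P[s][a][steps[t + 1][0]]
        total += prob * sum(loss[s][a] for s, a in steps)
    return total


if __name__ == "__main__":
    mu = occupancy(p0, P, pi)
    print(expected_loss_from_occupancy(mu, loss))
    print(expected_loss_by_enumeration(p0, P, pi, loss))
```

The two printed values should agree up to floating-point error. The occupancy-based computation touches 12 numbers here, while the enumeration visits 64 trajectories, and that gap widens exponentially with the horizon.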
Author Information
Mohammad Ali Bashiri (University of Illinois at Chicago)
Brian Ziebart (University of Illinois at Chicago)
Xinhua Zhang (University of Illinois at Chicago)
More from the Same Authors
- 2022: Poisoning Generative Models to Promote Catastrophic Forgetting
  Siteng Kang · Xinhua Zhang
- 2022: Continual Poisoning of Generative Models to Promote Catastrophic Forgetting
  Siteng Kang · Xinhua Zhang
- 2022 Poster: Moment Distributionally Robust Tree Structured Prediction
  Yeshu Li · Danyal Saeed · Xinhua Zhang · Brian Ziebart · Kevin Gimpel
- 2022 Poster: Certifying Robust Graph Classification under Orthogonal Gromov-Wasserstein Threats
  Hongwei Jin · Zishun Yu · Xinhua Zhang
- 2021 Poster: Implicit Task-Driven Probability Discrepancy Measure for Unsupervised Domain Adaptation
  Mao Li · Kaiqi Jiang · Xinhua Zhang
- 2020 Poster: Certified Robustness of Graph Convolution Networks for Graph Classification under Topological Attacks
  Hongwei Jin · Zhan Shi · Venkata Jaya Shankar Ashish Peruri · Xinhua Zhang
- 2020 Spotlight: Certified Robustness of Graph Convolution Networks for Graph Classification under Topological Attacks
  Hongwei Jin · Zhan Shi · Venkata Jaya Shankar Ashish Peruri · Xinhua Zhang
- 2020 Poster: Proximal Mapping for Deep Regularization
  Mao Li · Yingyi Ma · Xinhua Zhang
- 2020 Spotlight: Proximal Mapping for Deep Regularization
  Mao Li · Yingyi Ma · Xinhua Zhang
- 2018 Poster: Distributionally Robust Graphical Models
  Rizal Fathony · Ashkan Rezaei · Mohammad Ali Bashiri · Xinhua Zhang · Brian Ziebart
- 2017 Poster: Decomposition-Invariant Conditional Gradient for General Polytopes with Line Search
  Mohammad Ali Bashiri · Xinhua Zhang
- 2017 Poster: Bregman Divergence for Stochastic Variance Reduction: Saddle-Point and Adversarial Prediction
  Zhan Shi · Xinhua Zhang · Yaoliang Yu
- 2017 Spotlight: Bregman Divergence for Stochastic Variance Reduction: Saddle-Point and Adversarial Prediction
  Zhan Shi · Xinhua Zhang · Yaoliang Yu
- 2017 Poster: Adversarial Surrogate Losses for Ordinal Regression
  Rizal Fathony · Mohammad Ali Bashiri · Brian Ziebart
- 2016 Poster: Convex Two-Layer Modeling with Latent Structure
  Vignesh Ganapathiraman · Xinhua Zhang · Yaoliang Yu · Junfeng Wen
- 2014 Poster: Convex Deep Learning via Normalized Kernels
  Özlem Aslan · Xinhua Zhang · Dale Schuurmans
- 2014 Poster: Robust Bayesian Max-Margin Clustering
  Changyou Chen · Jun Zhu · Xinhua Zhang
- 2013 Poster: Learning with Invariance via Linear Functionals on Reproducing Kernel Hilbert Space
  Xinhua Zhang · Wee Sun Lee · Yee Whye Teh
- 2013 Spotlight: Learning with Invariance via Linear Functionals on Reproducing Kernel Hilbert Space
  Xinhua Zhang · Wee Sun Lee · Yee Whye Teh
- 2013 Poster: Convex Two-Layer Modeling
  Özlem Aslan · Hao Cheng · Xinhua Zhang · Dale Schuurmans
- 2013 Spotlight: Convex Two-Layer Modeling
  Özlem Aslan · Hao Cheng · Xinhua Zhang · Dale Schuurmans
- 2013 Poster: Polar Operators for Structured Sparse Estimation
  Xinhua Zhang · Yao-Liang Yu · Dale Schuurmans
- 2012 Poster: Convex Multi-view Subspace Learning
  Martha White · Yao-Liang Yu · Xinhua Zhang · Dale Schuurmans
- 2012 Poster: Accelerated Training for Matrix-norm Regularization: A Boosting Approach
  Xinhua Zhang · Yao-Liang Yu · Dale Schuurmans
- 2010 Poster: Lower Bounds on Rate of Convergence of Cutting Plane Methods
  Xinhua Zhang · Ankan Saha · S.V.N. Vishwanathan
- 2008 Poster: Kernel Measures of Independence for non-iid Data
  Xinhua Zhang · Le Song · Arthur Gretton · Alexander Smola
- 2008 Spotlight: Kernel Measures of Independence for non-iid Data
  Xinhua Zhang · Le Song · Arthur Gretton · Alexander Smola
- 2006 Poster: Hyperparameter Learning for Graph Based Semi-supervised Learning Algorithms
  Xinhua Zhang · Wee Sun Lee