Timezone: »
The problem of inverse reinforcement learning (IRL) is relevant to a variety of tasks including value alignment and robot learning from demonstration. Despite significant algorithmic contributions in recent years, IRL remains an ill-posed problem at its core; multiple reward functions coincide with the observed behavior and the actual reward function is not identifiable without prior knowledge or supplementary information. This paper presents an IRL framework called Bayesian optimization-IRL (BO-IRL) which identifies multiple solutions that are consistent with the expert demonstrations by efficiently exploring the reward function space. BO-IRL achieves this by utilizing Bayesian Optimization along with our newly proposed kernel that (a) projects the parameters of policy invariant reward functions to a single point in a latent space and (b) ensures nearby points in the latent space correspond to reward functions yielding similar likelihoods. This projection allows the use of standard stationary kernels in the latent space to capture the correlations present across the reward function space. Empirical results on synthetic and real-world environments (model-free and model-based) show that BO-IRL discovers multiple reward functions while minimizing the number of expensive exact policy optimizations.
Author Information
Sreejith Balakrishnan (National University of Singapore)
Quoc Phong Nguyen (National University of Singapore)
Bryan Kian Hsiang Low (National University of Singapore)
Harold Soh (National University Singapore)
More from the Same Authors
-
2022 Poster: Trade-off between Payoff and Model Rewards in Shapley-Fair Collaborative Machine Learning »
Quoc Phong Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet -
2022 Poster: Sample-Then-Optimize Batch Neural Thompson Sampling »
Zhongxiang Dai · YAO SHU · Bryan Kian Hsiang Low · Patrick Jaillet -
2022 Poster: Unifying and Boosting Gradient-Based Training-Free Neural Architecture Search »
YAO SHU · Zhongxiang Dai · Zhaoxuan Wu · Bryan Kian Hsiang Low -
2021 Workshop: New Frontiers in Federated Learning: Privacy, Fairness, Robustness, Personalization and Data Ownership »
Nghia Hoang · Lam Nguyen · Pin-Yu Chen · Tsui-Wei Weng · Sara Magliacane · Bryan Kian Hsiang Low · Anoop Deoras -
2021 Poster: Differentially Private Federated Bayesian Optimization with Distributed Exploration »
Zhongxiang Dai · Bryan Kian Hsiang Low · Patrick Jaillet -
2021 Poster: Gradient Driven Rewards to Guarantee Fairness in Collaborative Machine Learning »
Xinyi Xu · Lingjuan Lyu · Xingjun Ma · Chenglin Miao · Chuan Sheng Foo · Bryan Kian Hsiang Low -
2021 Poster: Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee »
Xiaofeng Fan · Yining Ma · Zhongxiang Dai · Wei Jing · Cheston Tan · Bryan Kian Hsiang Low -
2021 Poster: Optimizing Conditional Value-At-Risk of Black-Box Functions »
Quoc Phong Nguyen · Zhongxiang Dai · Bryan Kian Hsiang Low · Patrick Jaillet -
2021 Poster: Validation Free and Replication Robust Volume-based Data Valuation »
Xinyi Xu · Zhaoxuan Wu · Chuan Sheng Foo · Bryan Kian Hsiang Low -
2020 Poster: Variational Bayesian Unlearning »
Quoc Phong Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet -
2020 Poster: Federated Bayesian Optimization via Thompson Sampling »
Zhongxiang Dai · Bryan Kian Hsiang Low · Patrick Jaillet -
2019 Poster: Implicit Posterior Variational Inference for Deep Gaussian Processes »
Haibin YU · Yizhou Chen · Bryan Kian Hsiang Low · Patrick Jaillet · Zhongxiang Dai -
2019 Spotlight: Implicit Posterior Variational Inference for Deep Gaussian Processes »
Haibin YU · Yizhou Chen · Bryan Kian Hsiang Low · Patrick Jaillet · Zhongxiang Dai -
2017 : Poster Session 2 »
Farhan Shafiq · Antonio Tomas Nevado Vilchez · Takato Yamada · Sakyasingha Dasgupta · Robin Geyer · Moin Nabi · Crefeda Rodrigues · Edoardo Manino · Alexantrou Serb · Miguel A. Carreira-Perpinan · Kar Wai Lim · Bryan Kian Hsiang Low · Rohit Pandey · Marie C White · Pavel Pidlypenskyi · Xue Wang · Christine Kaeser-Chen · Michael Zhu · Suyog Gupta · Sam Leroux -
2017 : Aligned AI Poster Session »
Amanda Askell · Rafal Muszynski · William Wang · Yaodong Yang · Quoc Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet · Candice Schumann · Anqi Liu · Peter Eckersley · Angelina Wang · William Saunders -
2015 Poster: Inverse Reinforcement Learning with Locally Consistent Reward Functions »
Quoc Phong Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet