Existing inverse reinforcement learning (IRL) algorithms assume that each expert's demonstrated trajectory is produced by a single reward function. This paper presents a novel generalization of the IRL problem that allows each trajectory to be generated by multiple locally consistent reward functions, thereby catering to more realistic and complex expert behaviors. Solving our generalized IRL problem involves not only learning these reward functions but also the stochastic transitions between them at any state (including unvisited states). By representing our IRL problem with a probabilistic graphical model, an expectation-maximization (EM) algorithm can be devised to iteratively learn the different reward functions and the stochastic transitions between them so as to jointly improve the likelihood of the expert's demonstrated trajectories. As a result, we can derive the most likely partition of a trajectory into segments, each generated by one of the locally consistent reward functions selected by EM. Empirical evaluation on synthetic and real-world datasets shows that our IRL algorithm outperforms the state-of-the-art EM clustering with maximum likelihood IRL, which is, interestingly, a reduced variant of our approach.
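The EM procedure sketched in the abstract can be illustrated with a heavily simplified, self-contained toy. This is not the paper's algorithm: instead of learning reward functions by IRL, each latent "reward mode" is reduced to a Gaussian mean over a scalar per-step signal, and a Baum-Welch-style EM learns the mode parameters and the Markov switching matrix between them; the most likely partition of the trajectory is then read off from the posterior marginals. The function name `em_segment` and all data below are illustrative assumptions, not artifacts of the paper.

```python
import numpy as np

def em_segment(x, K=2, iters=50):
    """Toy EM: segment a scalar trajectory signal x into K latent modes.

    Emissions are unit-variance Gaussians with unknown means `mu`;
    mode switching follows a Markov chain with matrix `A`.
    """
    T = len(x)
    mu = np.quantile(x, np.linspace(0.1, 0.9, K))  # spread-out init
    A = np.full((K, K), 1.0 / K)                   # switching probabilities
    pi = np.full(K, 1.0 / K)                       # initial mode probabilities
    for _ in range(iters):
        # E-step: forward-backward posteriors over the latent mode.
        logB = -0.5 * (x[:, None] - mu[None, :]) ** 2
        B = np.exp(logB - logB.max(axis=1, keepdims=True))
        alpha = np.zeros((T, K))
        beta = np.ones((T, K))
        alpha[0] = pi * B[0]
        alpha[0] /= alpha[0].sum()
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[t]
            alpha[t] /= alpha[t].sum()
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[t + 1] * beta[t + 1])
            beta[t] /= beta[t].sum()
        gamma = alpha * beta                       # per-step mode posteriors
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: re-estimate switching matrix, initial probs, mode means.
        xi = np.zeros((K, K))
        for t in range(T - 1):
            m = alpha[t][:, None] * A * (B[t + 1] * beta[t + 1])[None, :]
            xi += m / m.sum()
        A = xi / xi.sum(axis=1, keepdims=True)
        pi = gamma[0]
        mu = (gamma * x[:, None]).sum(axis=0) / gamma.sum(axis=0)
    # Most likely mode per step gives the trajectory partition.
    return mu, A, gamma.argmax(axis=1)

# Synthetic trajectory: one mode for the first half (signal near 0),
# a second mode for the second half (signal near 5).
x = np.concatenate([np.zeros(20), np.full(20, 5.0)])
mu, A, z = em_segment(x, K=2)
```

Under these assumptions, `z` recovers the two-segment partition and `mu` approaches the two mode means; in the paper's setting, the Gaussian M-step would instead be an IRL update of each locally consistent reward function, and the switching matrix would be state-dependent.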
Author Information
Quoc Phong Nguyen (National University of Singapore)
Bryan Kian Hsiang Low (National University of Singapore)
Patrick Jaillet (Massachusetts Institute of Technology)
More from the Same Authors
- 2023 Poster: Exploiting Correlated Auxiliary Feedback in Parameterized Bandits
  Arun Verma · Zhongxiang Dai · YAO SHU · Bryan Kian Hsiang Low
- 2023 Poster: Memory-Constrained Algorithms for Convex Optimization
  Moise Blanchard · Junhui Zhang · Patrick Jaillet
- 2023 Poster: Equitable Model Valuation with Black-box Access
  Xinyi Xu · Thanh Lam · Chuan Sheng Foo · Bryan Kian Hsiang Low
- 2023 Poster: Quantum Bayesian Optimization
  Zhongxiang Dai · Gregory Kang Ruey Lau · Arun Verma · YAO SHU · Bryan Kian Hsiang Low · Patrick Jaillet
- 2023 Poster: Batch Bayesian Optimization For Replicable Experimental Design
  Zhongxiang Dai · Quoc Phong Nguyen · Sebastian Tay · Daisuke Urano · Richalynn Leong · Bryan Kian Hsiang Low · Patrick Jaillet
- 2023 Poster: Incentives in Private Collaborative Machine Learning
  Rachael Sim · Yehong Zhang · Nghia Hoang · Xinyi Xu · Bryan Kian Hsiang Low · Patrick Jaillet
- 2023 Poster: Bayesian Optimization with Cost-varying Variable Subsets
  Sebastian Tay · Chuan Sheng Foo · Daisuke Urano · Richalynn Leong · Bryan Kian Hsiang Low
- 2022 Poster: Effective Dimension in Bandit Problems under Censorship
  Gauthier Guinet · Saurabh Amin · Patrick Jaillet
- 2022 Poster: Trade-off between Payoff and Model Rewards in Shapley-Fair Collaborative Machine Learning
  Quoc Phong Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet
- 2022 Poster: Sample-Then-Optimize Batch Neural Thompson Sampling
  Zhongxiang Dai · YAO SHU · Bryan Kian Hsiang Low · Patrick Jaillet
- 2022 Poster: Unifying and Boosting Gradient-Based Training-Free Neural Architecture Search
  YAO SHU · Zhongxiang Dai · Zhaoxuan Wu · Bryan Kian Hsiang Low
- 2021 Workshop: New Frontiers in Federated Learning: Privacy, Fairness, Robustness, Personalization and Data Ownership
  Nghia Hoang · Lam Nguyen · Pin-Yu Chen · Tsui-Wei Weng · Sara Magliacane · Bryan Kian Hsiang Low · Anoop Deoras
- 2021 Poster: Differentially Private Federated Bayesian Optimization with Distributed Exploration
  Zhongxiang Dai · Bryan Kian Hsiang Low · Patrick Jaillet
- 2021 Poster: Gradient Driven Rewards to Guarantee Fairness in Collaborative Machine Learning
  Xinyi Xu · Lingjuan Lyu · Xingjun Ma · Chenglin Miao · Chuan Sheng Foo · Bryan Kian Hsiang Low
- 2021 Poster: Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee
  Xiaofeng Fan · Yining Ma · Zhongxiang Dai · Wei Jing · Cheston Tan · Bryan Kian Hsiang Low
- 2021 Poster: Optimizing Conditional Value-At-Risk of Black-Box Functions
  Quoc Phong Nguyen · Zhongxiang Dai · Bryan Kian Hsiang Low · Patrick Jaillet
- 2021 Poster: Validation Free and Replication Robust Volume-based Data Valuation
  Xinyi Xu · Zhaoxuan Wu · Chuan Sheng Foo · Bryan Kian Hsiang Low
- 2020 Poster: Variational Bayesian Unlearning
  Quoc Phong Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet
- 2020 Poster: Federated Bayesian Optimization via Thompson Sampling
  Zhongxiang Dai · Bryan Kian Hsiang Low · Patrick Jaillet
- 2020 Poster: Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization
  Sreejith Balakrishnan · Quoc Phong Nguyen · Bryan Kian Hsiang Low · Harold Soh
- 2020 Poster: No-regret Learning in Price Competitions under Consumer Reference Effects
  Negin Golrezaei · Patrick Jaillet · Jason Cheuk Nam Liang
- 2019 Poster: Implicit Posterior Variational Inference for Deep Gaussian Processes
  Haibin YU · Yizhou Chen · Bryan Kian Hsiang Low · Patrick Jaillet · Zhongxiang Dai
- 2019 Spotlight: Implicit Posterior Variational Inference for Deep Gaussian Processes
  Haibin YU · Yizhou Chen · Bryan Kian Hsiang Low · Patrick Jaillet · Zhongxiang Dai
- 2017: Poster Session 2
  Farhan Shafiq · Antonio Tomas Nevado Vilchez · Takato Yamada · Sakyasingha Dasgupta · Robin Geyer · Moin Nabi · Crefeda Rodrigues · Edoardo Manino · Alexantrou Serb · Miguel A. Carreira-Perpinan · Kar Wai Lim · Bryan Kian Hsiang Low · Rohit Pandey · Marie C White · Pavel Pidlypenskyi · Xue Wang · Christine Kaeser-Chen · Michael Zhu · Suyog Gupta · Sam Leroux
- 2017: Aligned AI Poster Session
  Amanda Askell · Rafal Muszynski · William Wang · Yaodong Yang · Quoc Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet · Candice Schumann · Anqi Liu · Peter Eckersley · Angelina Wang · William Saunders
- 2017 Poster: Real-Time Bidding with Side Information
  arthur flajolet · Patrick Jaillet
- 2017 Poster: Online Learning with a Hint
  Ofer Dekel · arthur flajolet · Nika Haghtalab · Patrick Jaillet
- 2013 Poster: Regret based Robust Solutions for Uncertain Markov Decision Processes
  Asrar Ahmed · Pradeep Varakantham · Yossiri Adulyasak · Patrick Jaillet