This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy for a parameterized set of tasks by connecting a model with a planning algorithm that solves the model, thus embedding the solution structure of planning in a network learning architecture. The QMDP-net is fully differentiable and allows for end-to-end training. We train a QMDP-net on different tasks so that it can generalize to new ones in the parameterized task set and “transfer” to other similar tasks beyond the set. In preliminary experiments, QMDP-net showed strong performance on several robotic tasks in simulation. Interestingly, while QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.
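For readers unfamiliar with the QMDP algorithm that the abstract says the network encodes, the following is a minimal sketch of the classic QMDP approximation in plain Python. It is not the QMDP-net architecture itself, which learns its model end-to-end. The function names, the array layouts (`T[a][s][s2]`, `R[s][a]`, `Z[a][s2][o]`), and the two-state toy problem are all assumptions made for illustration. The sketch solves the fully observable MDP by value iteration, tracks a belief with a Bayes filter, and picks the action whose Q-value is highest when averaged over the belief.

```python
def qmdp_q_values(T, R, gamma=0.95, iterations=100):
    """Value iteration on the underlying MDP.

    T[a][s][s2] = P(s2 | s, a), R[s][a] = immediate reward (assumed layout).
    Returns Q[s][a].
    """
    n_actions, n_states = len(T), len(R)
    V = [0.0] * n_states
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(iterations):
        for s in range(n_states):
            for a in range(n_actions):
                expected_next = sum(T[a][s][s2] * V[s2] for s2 in range(n_states))
                Q[s][a] = R[s][a] + gamma * expected_next
        # Synchronous update: V changes only after every Q entry is recomputed.
        V = [max(Q[s]) for s in range(n_states)]
    return Q


def belief_update(belief, a, o, T, Z):
    """Bayes filter step. Z[a][s2][o] = P(o | s2, a) (assumed layout)."""
    n = len(belief)
    unnorm = [
        Z[a][s2][o] * sum(T[a][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnorm)
    return [p / total for p in unnorm]


def qmdp_action(belief, Q):
    """Pick the action maximizing the belief-weighted Q-value."""
    n_actions = len(Q[0])
    scores = [
        sum(belief[s] * Q[s][a] for s in range(len(belief)))
        for a in range(n_actions)
    ]
    return max(range(n_actions), key=scores.__getitem__)


# Hypothetical toy problem: two states, action 0 = stay, action 1 = switch.
T = [
    [[1.0, 0.0], [0.0, 1.0]],  # stay: state unchanged
    [[0.0, 1.0], [1.0, 0.0]],  # switch: states swap
]
R = [[0.0, 0.0], [1.0, 0.0]]   # reward only for staying in state 1
Z = [[[0.8, 0.2], [0.2, 0.8]]] * 2  # observation matches the state 80% of the time

Q = qmdp_q_values(T, R, gamma=0.5)
action_when_likely_in_0 = qmdp_action([0.9, 0.1], Q)
action_when_likely_in_1 = qmdp_action([0.1, 0.9], Q)
new_belief = belief_update([0.5, 0.5], 0, 1, T, Z)
```

In this toy problem the agent switches when it believes it is probably in state 0 and stays when it is probably in state 1. Because QMDP assumes the state becomes fully observable after one step, it never chooses actions purely to gather information, which is one reason a learned variant could plausibly outperform it.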
Author Information
Peter Karkus (National University of Singapore)
David Hsu (National University of Singapore)
Wee Sun Lee (National University of Singapore)
Wee Sun Lee is a professor in the Department of Computer Science, National University of Singapore. He obtained his B.Eng. from the University of Queensland in 1992 and his Ph.D. from the Australian National University in 1996. He has been a research fellow at the Australian Defence Force Academy, a fellow of the Singapore-MIT Alliance, and a visiting scientist at MIT. His research interests include machine learning, planning under uncertainty, and approximate inference. His work has won the Test of Time Award at Robotics: Science and Systems (RSS) 2021, the RoboCup Best Paper Award at the International Conference on Intelligent Robots and Systems (IROS) 2015, and the Google Best Student Paper Award at Uncertainty in AI (UAI) 2014 (as faculty co-author), as well as several competitions and challenges. He has been an area chair for machine learning and AI conferences such as the Conference on Neural Information Processing Systems (NeurIPS), the International Conference on Machine Learning (ICML), the AAAI Conference on Artificial Intelligence (AAAI), and the International Joint Conference on Artificial Intelligence (IJCAI). He was a program, conference, and journal track co-chair for the Asian Conference on Machine Learning (ACML), and he is currently the co-chair of the ACML steering committee.
More from the Same Authors
- 2022: DiffStack: A Differentiable and Modular Control Stack for Autonomous Vehicles
  Peter Karkus · Boris Ivanovic · Shie Mannor · Marco Pavone
- 2022: Efficient Offline Policy Optimization with a Learned Model
  Zichen Liu · Siyi Li · Wee Sun Lee · Shuicheng Yan · Zhongwen Xu
- 2023 Poster: What Truly Matters in Trajectory Prediction for Autonomous Driving?
  Haoran Wu · Tran Phong · Cunjun Yu · Panpan Cai · Sifa Zheng · David Hsu
- 2023 Poster: Large Language Models as Commonsense Knowledge for Large-Scale Task Planning
  Zirui Zhao · Wee Sun Lee · David Hsu
- 2022 Poster: Receding Horizon Inverse Reinforcement Learning
  Yiqing Xu · Wei Gao · David Hsu
- 2021: Part 4: Appendix: Proofs and Derivations
  Wee Sun Lee
- 2021: Part 3: Graph Neural Networks and Attention Networks
  Wee Sun Lee
- 2021: Part 2: Markov Decision Process
  Wee Sun Lee
- 2021 Tutorial: Message Passing In Machine Learning
  Wee Sun Lee
- 2021: Part 1: Message Passing Overview and Probabilistic Graphical Models
  Wee Sun Lee
- 2020 Poster: Factor Graph Neural Networks
  Zhen Zhang · Fan Wu · Wee Sun Lee
- 2019: Posters
  Colin Graber · Yuan-Ting Hu · Tiantian Fang · Jessica Hamrick · Giorgio Giannone · John Co-Reyes · Boyang Deng · Eric Crawford · Andrea Dittadi · Peter Karkus · Matthew Dirks · Rakshit Trivedi · Sunny Raj · Javier Felip Leon · Harris Chan · Jan Chorowski · Jeff Orchard · Aleksandar Stanić · Adam Kortylewski · Ben Zinberg · Chenghui Zhou · Wei Sun · Vikash Mansinghka · Chun-Liang Li · Marco Cusumano-Towner
- 2018 Workshop: Reinforcement Learning under Partial Observability
  Joni Pajarinen · Chris Amato · Pascal Poupart · David Hsu
- 2015 Poster: Adaptive Stochastic Optimization: From Sets to Paths
  Zhan Wei Lim · David Hsu · Wee Sun Lee
- 2013 Poster: DESPOT: Online POMDP Planning with Regularization
  Adhiraj Somani · Nan Ye · David Hsu · Wee Sun Lee
- 2013 Poster: Learning with Invariance via Linear Functionals on Reproducing Kernel Hilbert Space
  Xinhua Zhang · Wee Sun Lee · Yee Whye Teh
- 2013 Spotlight: Learning with Invariance via Linear Functionals on Reproducing Kernel Hilbert Space
  Xinhua Zhang · Wee Sun Lee · Yee Whye Teh
- 2013 Poster: Active Learning for Probabilistic Hypotheses Using the Maximum Gibbs Error Criterion
  Nguyen Viet Cuong · Wee Sun Lee · Nan Ye · Kian Ming Adam Chai · Hai Leong Chieu
- 2011 Poster: Monte Carlo Value Iteration with Macro-Actions
  Zhan Wei Lim · David Hsu · Wee Sun Lee
- 2010 Session: Oral Session 2
  Wee Sun Lee
- 2009 Poster: Conditional Random Fields with High-Order Features for Sequence Labeling
  Nan Ye · Wee Sun Lee · Hai Leong Chieu · Dan Wu
- 2007 Poster: Cooled and Relaxed Survey Propagation for MRFs
  Hai Leong Chieu · Wee Sun Lee · Yee Whye Teh
- 2007 Spotlight: Cooled and Relaxed Survey Propagation for MRFs
  Hai Leong Chieu · Wee Sun Lee · Yee Whye Teh
- 2007 Spotlight: What makes some POMDP problems easy to approximate?
  David Hsu · Wee Sun Lee · Nan Rong
- 2007 Poster: What makes some POMDP problems easy to approximate?
  David Hsu · Wee Sun Lee · Nan Rong
- 2006 Poster: Hyperparameter Learning for Graph Based Semi-supervised Learning Algorithms
  Xinhua Zhang · Wee Sun Lee