Decentralized (PO)MDPs provide an expressive framework for sequential decision making in a multiagent system. Given their computational complexity, recent research has focused on tractable yet practical subclasses of Dec-POMDPs. We address one such subclass, the CDec-POMDP, in which the collective behavior of a population of agents affects the joint reward and environment dynamics. Our main contribution is an actor-critic (AC) reinforcement learning method for optimizing CDec-POMDP policies. Vanilla AC converges slowly on larger problems. To address this, we show how a particular decomposition of the approximate action-value function over agents leads to effective updates, and we also derive a new way to train the critic based on local reward signals. Comparisons on a synthetic benchmark and a real-world taxi fleet optimization problem show that our new AC approach provides better-quality solutions than the previous best approaches.
Duc Nguyen (Singapore Management University)
Akshat Kumar (Singapore Management University)
Akshat Kumar is an Assistant Professor in the School of Information Systems at the Singapore Management University. His research interests lie in planning and decision making under uncertainty, with a focus on multiagent systems and urban system optimization. Kumar received a Ph.D. in computer science from the University of Massachusetts Amherst, where he was advised by Prof. Shlomo Zilberstein. His thesis received the best dissertation award at the ICAPS 2014 conference and a runner-up award at AAMAS 2013. His work has also received the outstanding application paper award at ICAPS 2014 and the best paper award in the AAAI 2017 computational sustainability track.
Hoong Chuin Lau (Singapore Management University)
Hoong Chuin LAU is Professor of Information Systems and Director of the Fujitsu-SMU Urban Computing and Engineering Corporate Lab at the Singapore Management University (SMU). Prior to joining SMU, he was a research scientist at the Institute of Infocomm Research in Singapore (1997-1999) and an assistant professor at the School of Computing, National University of Singapore (2000-2005). The common thread running through his research is a focus on going beyond publications to build usable novel tools and prototypes, a number of which have been testbedded and deployed in industry. Working at the interface of Operations Research and Artificial Intelligence, he is interested in combining data analytics and optimization for decision-making in business. More specifically, he works on optimization models, agent-based models, and metaheuristics for data-driven resource planning, scheduling, and coordination problems in logistics, transportation, and travel planning. Recently, his focus has been on such planning and operations problems in urban contexts. He was awarded the Lee Kuan Yew Fellowship for research excellence in 2008. He currently serves on the editorial boards of the IEEE Transactions on Automation Science and Engineering and the Web Intelligence Journal. He has been involved in consulting projects in logistics and transportation for companies such as DHL, FedEx, Bax Global, PSA, EADS, and ST Dynamics, and for various government agencies. He is a chartered member of the Chartered Institute of Logistics and Transportation, and currently serves on the CILT Singapore board of directors. For his work with the Singapore Ministry of Defense, he won the National Innovation and Quality Convention Star Award in 2006, and was nominated for the prestigious Defense Technology Prize (individual category) in 2007.
Throughout his 18 years of full-time university lecturing experience, he has taught a number of undergraduate and graduate courses, including Design and Analysis of Algorithms, Combinatorial Graph Algorithms, Computer as an Analysis Tool, Computational Thinking, Decision Analytics and Optimization, Advanced Topics in Intelligent Systems, and Enterprise Analytics for Decision Support. He is co-author of the textbook “Business Analytics for Decision Making”, published by CRC Press in 2016. Twice a recipient of a Singapore government scholarship from the Infocomm Development Authority (IDA), Hoong Chuin obtained his Doctorate of Engineering degree in Computer Science from the Tokyo Institute of Technology (Japan) in 1996, and BSc and MSc degrees in Computer Science from the University of Minnesota (Minneapolis, USA) in 1987 and 1988.