We address the problem of regret minimization in logistic contextual bandits, where a learner sequentially chooses among actions (arms), each described by a context, to maximize binary rewards. Using a fast inference procedure with Polya-Gamma distributed augmentation variables, we propose an improved version of Thompson Sampling, a Bayesian formulation of contextual bandits with near-optimal performance. Our approach, Polya-Gamma augmented Thompson Sampling (PG-TS), achieves state-of-the-art performance on simulated and real data. PG-TS explores the action space efficiently and exploits high-reward arms, quickly converging to solutions of low regret. Its explicit estimation of the posterior distribution of the context feature covariance leads to substantial empirical gains over approximate approaches. PG-TS is the first approach to demonstrate the benefits of Polya-Gamma augmentation in bandits and to propose an efficient Gibbs sampler for approximating the analytically intractable integral of logistic contextual bandits.
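As a rough illustration of the idea, the sketch below shows one way a Polya-Gamma augmented Gibbs step could drive Thompson Sampling for a logistic reward model. It is not the authors' implementation. It assumes a Gaussian prior on the weight vector, the standard Polya-Gamma conditional updates for Bayesian logistic regression, and a truncated sum-of-gammas approximation to PG(1, c) draws in place of an exact sampler. All names (`sample_pg_approx`, `gibbs_draw_theta`, `choose_arm`, `n_terms`, `n_sweeps`) are hypothetical.

```python
import numpy as np

def sample_pg_approx(c, rng, n_terms=200):
    """Approximate draws from PG(1, c), one per entry of c.

    Assumption: uses the infinite sum-of-gammas representation of the
    Polya-Gamma distribution, truncated to n_terms terms. This is an
    approximation, not an exact sampler.
    """
    c = np.atleast_1d(np.asarray(c, dtype=float))
    k = np.arange(1, n_terms + 1)
    g = rng.gamma(shape=1.0, scale=1.0, size=(c.size, n_terms))
    denom = (k - 0.5) ** 2 + (c[:, None] ** 2) / (4.0 * np.pi ** 2)
    return (g / denom).sum(axis=1) / (2.0 * np.pi ** 2)

def gibbs_draw_theta(X, y, theta, prior_mean, prior_prec, rng, n_sweeps=1):
    """Run n_sweeps Gibbs sweeps and return the last weight sample.

    X: (n, d) observed contexts, y: (n,) binary rewards in {0, 1}.
    prior_mean, prior_prec: mean and precision of the assumed Gaussian prior.
    """
    kappa = y - 0.5
    for _ in range(n_sweeps):
        # Augmentation variables given the current weights.
        omega = sample_pg_approx(X @ theta, rng)
        # Gaussian conditional for the weights given the augmentation.
        prec = X.T @ (omega[:, None] * X) + prior_prec
        cov = np.linalg.inv(prec)
        cov = (cov + cov.T) / 2.0  # guard against round-off asymmetry
        mean = cov @ (X.T @ kappa + prior_prec @ prior_mean)
        theta = rng.multivariate_normal(mean, cov)
    return theta

def choose_arm(contexts, theta):
    """Thompson step: play the arm with the highest sampled linear score."""
    return int(np.argmax(contexts @ theta))
```

In a bandit loop, one would call `gibbs_draw_theta` on the history collected so far, pass the resulting sample to `choose_arm` with the current round's contexts, observe the binary reward, and append the chosen context and reward to the history. The number of Gibbs sweeps per round is a tuning choice in this sketch.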
Bianca Dumitrascu (Princeton University)
Karen Feng (Princeton University)
Barbara Engelhardt (Princeton University)
Barbara E. Engelhardt is an associate professor in the Princeton Computer Science Department, on leave in 2019-2020 working as a principal scientist at Genomics Plc. Previously, she was an assistant professor at Duke University in Biostatistics and Bioinformatics and Statistical Sciences. She graduated from Stanford University and received her Ph.D. from the University of California, Berkeley, advised by Professor Michael Jordan. She did postdoctoral research at the University of Chicago, working with Professor Matthew Stephens. Interspersed among her academic experiences, she spent two years working at the Jet Propulsion Laboratory, a summer at Google Research, and a year at 23andMe, a DNA ancestry service. Professor Engelhardt received an NSF Graduate Research Fellowship, the Google Anita Borg Memorial Scholarship, and the Walter M. Fitch Prize from the Society for Molecular Biology and Evolution. As a faculty member, she received the NIH NHGRI K99/R00 Pathway to Independence Award, a Sloan Faculty Fellowship, and an NSF CAREER Award. Professor Engelhardt’s research interests involve developing statistical models and methods for the analysis of high-dimensional biomedical data, with a goal of understanding the underlying biological mechanisms and dynamics of complex phenotypes and human disease.