Timezone: »
Suppose an online platform wants to compare a treatment and control policy (e.g., two different matching algorithms in a ridesharing system, or two different inventory management algorithms in an online retail site). Standard experimental approaches to this problem are biased (due to temporal interference between the policies), and not sample efficient. We study optimal experimental design for this setting. We view testing the two policies as the problem of estimating the steady state difference in reward between two unknown Markov chains (i.e., policies). We assume estimation of the steady state reward for each chain proceeds via nonparametric maximum likelihood, and search for consistent (i.e., asymptotically unbiased) experimental designs that are efficient (i.e., asymptotically minimum variance). Characterizing such designs is equivalent to a Markov decision problem with a minimum variance objective; such problems generally do not admit tractable solutions. Remarkably, in our setting, using a novel application of classical martingale analysis of Markov chains via Poisson's equation, we characterize efficient designs via a succinct convex optimization problem. We use this characterization to propose a consistent, efficient online experimental design that adaptively samples the two Markov chains.
Author Information
Peter W Glynn (Stanford University)
Peter W. Glynn is the Thomas Ford Professor in the Department of Management Science and Engineering (MS&E) at Stanford University, and also holds a courtesy appointment in the Department of Electrical Engineering. He received his Ph.D in Operations Research from Stanford University in 1982. He then joined the faculty of the University of Wisconsin at Madison, where he held a joint appointment between the Industrial Engineering Department and Mathematics Research Center, and courtesy appointments in Computer Science and Mathematics. In 1987, he returned to Stanford, where he joined the Department of Operations Research. He was Director of Stanford's Institute for Computational and Mathematical Engineering from 2006 until 2010 and served as Chair of MS&E from 2011 through 2015. He is a Fellow of INFORMS and a Fellow of the Institute of Mathematical Statistics, and was an IMS Medallion Lecturer in 1995 and INFORMS Markov Lecturer in 2014. He was co-winner of the Outstanding Publication Awards from the INFORMS Simulation Society in 1993, 2008, and 2016, was a co-winner of the Best (Biannual) Publication Award from the INFORMS Applied Probability Society in 2009, and was the co-winner of the John von Neumann Theory Prize from INFORMS in 2010. In 2012, he was elected to the National Academy of Engineering. He was Founding Editor-in-Chief of Stochastic Systems and is currently Editor-in-Chief of Journal of Applied Probability and Advances in Applied Probability. His research interests lie in simulation, computational probability, queueing theory, statistical inference for stochastic processes, and stochastic modeling.
Ramesh Johari (Stanford University)
Mohammad Rasouli (Stanford University)
More from the Same Authors
-
2021 Poster: Modified Frank Wolfe in Probability Space »
Carson Kent · Jiajin Li · Jose Blanchet · Peter W Glynn -
2020 Poster: Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms »
Mohsen Bayati · Nima Hamidi · Ramesh Johari · Khashayar Khosravi -
2020 Spotlight: Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms »
Mohsen Bayati · Nima Hamidi · Ramesh Johari · Khashayar Khosravi -
2019 Poster: Multivariate Distributionally Robust Convex Regression under Absolute Error Loss »
Jose Blanchet · Peter W Glynn · Jun Yan · Zhengqing Zhou -
2019 Poster: Semi-Parametric Dynamic Contextual Pricing »
Virag Shah · Ramesh Johari · Jose Blanchet -
2018 Poster: Bandit Learning with Positive Externalities »
Virag Shah · Jose Blanchet · Ramesh Johari -
2018 Poster: Learning in Games with Lossy Feedback »
Zhengyuan Zhou · Panayotis Mertikopoulos · Susan Athey · Nicholas Bambos · Peter W Glynn · Yinyu Ye -
2017 Poster: Countering Feedback Delays in Multi-Agent Learning »
Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Peter W Glynn · Claire Tomlin -
2017 Poster: Stochastic Mirror Descent in Variationally Coherent Optimization Problems »
Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Stephen Boyd · Peter W Glynn