Timezone: »
It is well known that in stochastic multi-armed bandits (MAB), the sample mean of an arm is typically not an unbiased estimator of its true mean. In this paper, we decouple three different sources of this selection bias: adaptive \emph{sampling} of arms, adaptive \emph{stopping} of the experiment, and adaptively \emph{choosing} which arm to study. Through a new notion called ``optimism'' that captures certain natural monotonic behaviors of algorithms, we provide a clean and unified analysis of how optimistic rules affect the sign of the bias. The main takeaway message is that optimistic sampling induces a negative bias, but optimistic stopping and optimistic choosing both induce a positive bias. These results are derived in a general stochastic MAB setup that is entirely agnostic to the final aim of the experiment (regret minimization or best-arm identification or anything else). We provide examples of optimistic rules of each type, demonstrate that simulations confirm our theoretical predictions, and pose some natural but hard open problems.
Author Information
Jaehyeok Shin (Carnegie Mellon University)
Aaditya Ramdas (Carnegie Mellon University)
Alessandro Rinaldo (CMU)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Poster: Are sample means in multi-armed bandits positively or negatively biased? »
Thu. Dec 12th 01:00 -- 03:00 AM Room East Exhibition Hall B + C
More from the Same Authors
-
2021 : Doubly robust confidence sequences »
Ian Waudby-Smith · David Arbour · Ritwik Sinha · Edward Kennedy · Aaditya Ramdas -
2021 Poster: Lattice partition recovery with dyadic CART »
OSCAR HERNAN MADRID PADILLA · Yi Yu · Alessandro Rinaldo -
2019 Poster: Statistical Analysis of Nearest Neighbor Methods for Anomaly Detection »
Xiaoyi Gu · Leman Akoglu · Alessandro Rinaldo -
2019 Poster: ADDIS: an adaptive discarding algorithm for online FDR control with conservative nulls »
Jinjin Tian · Aaditya Ramdas -
2017 : Persistent homology of KDE filtration of Rips complexes »
Jaehyeok Shin · Alessandro Rinaldo -
2017 Poster: A Sharp Error Analysis for the Fused Lasso, with Application to Approximate Changepoint Screening »
Kevin Lin · James Sharpnack · Alessandro Rinaldo · Ryan Tibshirani -
2016 Poster: Statistical Inference for Cluster Trees »
Jisu KIM · Yen-Chi Chen · Sivaraman Balakrishnan · Alessandro Rinaldo · Larry Wasserman -
2013 Poster: Cluster Trees on Manifolds »
Sivaraman Balakrishnan · Srivatsan Narayanan · Alessandro Rinaldo · Aarti Singh · Larry Wasserman -
2012 Workshop: Algebraic Topology and Machine Learning »
Sivaraman Balakrishnan · Alessandro Rinaldo · Donald Sheehy · Aarti Singh · Larry Wasserman -
2011 Poster: Minimax Localization of Structural Information in Large Noisy Matrices »
Mladen Kolar · Sivaraman Balakrishnan · Alessandro Rinaldo · Aarti Singh -
2011 Spotlight: Minimax Localization of Structural Information in Large Noisy Matrices »
Mladen Kolar · Sivaraman Balakrishnan · Alessandro Rinaldo · Aarti Singh