Timezone: »
In the infinite-armed bandit problem, each arm's average reward is sampled from an unknown distribution, and each arm can be sampled further to obtain noisy estimates of the average reward of that arm. Prior work focuses on the best arm, i.e. estimating the maximum of the average reward distribution. We consider a general class of distribution functionals beyond the maximum and obtain optimal sample complexities in both offline and online settings. We show that online estimation, where the learner can sequentially choose whether to sample a new or existing arm, offers no advantage over the offline setting for estimating the mean functional, but significantly reduces the sample complexity for other functionals such as the median, maximum, and trimmed mean. We propose unified meta algorithms for the online and offline settings and derive matching lower bounds using different Wasserstein distances. For the special case of median estimation, we identify a curious thresholding phenomenon on the indistinguishability between Gaussian convolutions with respect to the noise level, which may be of independent interest.
Author Information
Yifei Wang (Stanford University)
Tavor Baharav (Stanford University)
I am a second year PhD student in Electrical Engineering at Stanford University working with Professor David Tse, recently on developing fast (near linear time) randomized algorithms using techniques from multi-armed bandits. I am grateful to be supported by the NSF Graduate Research Fellowship and the Stanford Graduate Fellowship (SGF). I graduated from UC Berkeley in May 2018 where I studied Electrical Engineering and Computer Science. In my time there, I was fortunate to get the chance to work with Professor Kannan Ramchandran on coding theory and its applications to distributed computing. My current research focus is on constructing algorithms that adapt to problem instance difficulty, and more broadly in randomized algorithms, machine learning, multi-armed bandits, and their applications in engineering and computational biology problems.
Yanjun Han (Massachusetts Institute of Technology)
Jiantao Jiao (University of California, Berkeley)
David Tse (Stanford University)
More from the Same Authors
-
2022 Spotlight: Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions »
Wei Zhang · Yanjun Han · Zhengyuan Zhou · Aaron Flores · Tsachy Weissman -
2022 Spotlight: Lightning Talks 3B-1 »
Tianying Ji · Tongda Xu · Giulia Denevi · Aibek Alanov · Martin Wistuba · Wei Zhang · Yuesong Shen · Massimiliano Pontil · Vadim Titov · Yan Wang · Yu Luo · Daniel Cremers · Yanjun Han · Arlind Kadra · Dailan He · Josif Grabocka · Zhengyuan Zhou · Fuchun Sun · Carlo Ciliberto · Dmitry Vetrov · Mingxuan Jing · Chenjian Gao · Aaron Flores · Tsachy Weissman · Han Gao · Fengxiang He · Kunzan Liu · Wenbing Huang · Hongwei Qin -
2022 Poster: Oracle-Efficient Online Learning for Smoothed Adversaries »
Nika Haghtalab · Yanjun Han · Abhishek Shetty · Kunhe Yang -
2022 Poster: Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions »
Wei Zhang · Yanjun Han · Zhengyuan Zhou · Aaron Flores · Tsachy Weissman -
2022 Poster: Minimax Optimal Online Imitation Learning via Replay Estimation »
Gokul Swamy · Nived Rajaraman · Matt Peng · Sanjiban Choudhury · J. Bagnell · Steven Wu · Jiantao Jiao · Kannan Ramchandran -
2021 Poster: Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism »
Paria Rashidinejad · Banghua Zhu · Cong Ma · Jiantao Jiao · Stuart Russell -
2021 Poster: On the Value of Interaction and Function Approximation in Imitation Learning »
Nived Rajaraman · Yanjun Han · Lin Yang · Jingbo Liu · Jiantao Jiao · Kannan Ramchandran -
2021 Poster: MADE: Exploration via Maximizing Deviation from Explored Regions »
Tianjun Zhang · Paria Rashidinejad · Jiantao Jiao · Yuandong Tian · Joseph Gonzalez · Stuart Russell -
2020 Poster: Adaptive Learning of Rank-One Models for Efficient Pairwise Sequence Alignment »
Govinda Kamath · Tavor Baharav · Ilan Shomorony -
2020 Poster: Toward the Fundamental Limits of Imitation Learning »
Nived Rajaraman · Lin Yang · Jiantao Jiao · Kannan Ramchandran -
2020 Poster: SLIP: Learning to Predict in Unknown Dynamical Systems with Long-Term Memory »
Paria Rashidinejad · Jiantao Jiao · Stuart Russell -
2020 Oral: SLIP: Learning to Predict in Unknown Dynamical Systems with Long-Term Memory »
Paria Rashidinejad · Jiantao Jiao · Stuart Russell -
2019 Workshop: Information Theory and Machine Learning »
Shengjia Zhao · Jiaming Song · Yanjun Han · Kristy Choi · Pratyusha Kalluri · Ben Poole · Alex Dimakis · Jiantao Jiao · Tsachy Weissman · Stefano Ermon -
2019 Poster: Ultra Fast Medoid Identification via Correlated Sequential Halving »
Tavor Baharav · David Tse -
2018 Poster: Porcupine Neural Networks: Approximating Neural Network Landscapes »
Soheil Feizi · Hamid Javadi · Jesse Zhang · David Tse -
2018 Poster: A Convex Duality Framework for GANs »
Farzan Farnia · David Tse -
2017 Poster: Tensor Biclustering »
Soheil Feizi · Hamid Javadi · David Tse -
2017 Poster: NeuralFDR: Learning Discovery Thresholds from Hypothesis Features »
Fei Xia · Martin J Zhang · James Zou · David Tse -
2016 Poster: A Minimax Approach to Supervised Learning »
Farzan Farnia · David Tse -
2015 Poster: Discrete Rényi Classifiers »
Meisam Razaviyayn · Farzan Farnia · David Tse