Timezone: »
In this paper, we study the multi-armed bandit problem in the batched setting where the employed policy must split data into a small number of batches. While the minimax regret for the two-armed stochastic bandits has been completely characterized in \cite{perchet2016batched}, the effect of the number of arms on the regret for the multi-armed case is still open. Moreover, the question whether adaptively chosen batch sizes will help to reduce the regret also remains underexplored. In this paper, we propose the BaSE (batched successive elimination) policy to achieve the rate-optimal regrets (within logarithmic factors) for batched multi-armed bandits, with matching lower bounds even if the batch sizes are determined in an adaptive manner.
Author Information
Zijun Gao (Stanford University)
Yanjun Han (Stanford University)
Zhimei Ren (Stanford University)
Zhengqing Zhou (Stanford University)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Poster: Batched Multi-armed Bandits Problem »
Thu. Dec 12th 01:00 -- 03:00 AM Room East Exhibition Hall B + C #49
More from the Same Authors
-
2021 Poster: Optimal prediction of Markov chains with and without spectral gap »
Yanjun Han · Soham Jana · Yihong Wu -
2021 Poster: On the Value of Interaction and Function Approximation in Imitation Learning »
Nived Rajaraman · Yanjun Han · Lin Yang · Jingbo Liu · Jiantao Jiao · Kannan Ramchandran -
2021 Poster: Online Multi-Armed Bandits with Adaptive Inference »
Maria Dimakopoulou · Zhimei Ren · Zhengyuan Zhou -
2020 Poster: Minimax Optimal Nonparametric Estimation of Heterogeneous Treatment Effects »
Zijun Gao · Yanjun Han -
2020 Spotlight: Minimax Optimal Nonparametric Estimation of Heterogeneous Treatment Effects »
Zijun Gao · Yanjun Han -
2019 : Poster Session »
Gergely Flamich · Shashanka Ubaru · Charles Zheng · Josip Djolonga · Kristoffer Wickstrøm · Diego Granziol · Konstantinos Pitas · Jun Li · Robert Williamson · Sangwoong Yoon · Kwot Sin Lee · Julian Zilly · Linda Petrini · Ian Fischer · Zhe Dong · Alexander Alemi · Bao-Ngoc Nguyen · Rob Brekelmans · Tailin Wu · Aditya Mahajan · Alexander Li · Kirankumar Shiragur · Yair Carmon · Linara Adilova · SHIYU LIU · Bang An · Sanjeeb Dash · Oktay Gunluk · Arya Mazumdar · Mehul Motani · Julia Rosenzweig · Michael Kamp · Marton Havasi · Leighton P Barnes · Zhengqing Zhou · Yi Hao · Dylan Foster · Yuval Benjamini · Nati Srebro · Michael Tschannen · Paul Rubenstein · Sylvain Gelly · John Duchi · Aaron Sidford · Robin Ru · Stefan Zohren · Murtaza Dalal · Michael A Osborne · Stephen J Roberts · Moses Charikar · Jayakumar Subramanian · Xiaodi Fan · Max Schwarzer · Nicholas Roberts · Simon Lacoste-Julien · Vinay Prabhu · Aram Galstyan · Greg Ver Steeg · Lalitha Sankar · Yung-Kyun Noh · Gautam Dasarathy · Frank Park · Ngai-Man (Man) Cheung · Ngoc-Trung Tran · Linxiao Yang · Ben Poole · Andrea Censi · Tristan Sylvain · R Devon Hjelm · Bangjie Liu · Jose Gallego-Posada · Tyler Sypherd · Kai Yang · Jan Nikolas Morshuis -
2019 Workshop: Information Theory and Machine Learning »
Shengjia Zhao · Jiaming Song · Yanjun Han · Kristy Choi · Pratyusha Kalluri · Ben Poole · Alex Dimakis · Jiantao Jiao · Tsachy Weissman · Stefano Ermon -
2019 Poster: Multivariate Distributionally Robust Convex Regression under Absolute Error Loss »
Jose Blanchet · Peter W Glynn · Jun Yan · Zhengqing Zhou -
2018 Poster: The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal »
Jiantao Jiao · Weihao Gao · Yanjun Han -
2018 Poster: Entropy Rate Estimation for Markov Chains with Large State Space »
Yanjun Han · Jiantao Jiao · Chuan-Zheng Lee · Tsachy Weissman · Yihong Wu · Tiancheng Yu -
2018 Spotlight: Entropy Rate Estimation for Markov Chains with Large State Space »
Yanjun Han · Jiantao Jiao · Chuan-Zheng Lee · Tsachy Weissman · Yihong Wu · Tiancheng Yu -
2018 Spotlight: The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal »
Jiantao Jiao · Weihao Gao · Yanjun Han