Poster
Thresholding Bandit with Optimal Aggregate Regret
Chao Tao · Saúl Blanco · Jian Peng · Yuan Zhou
Tue Dec 10 05:30 PM -- 07:30 PM (PST) @ East Exhibition Hall B + C #22
We consider the thresholding bandit problem, in which the goal is to identify the arms whose mean rewards exceed a given threshold $\theta$ using a fixed budget of $T$ trials. We introduce LSA, a new, simple, anytime algorithm that aims to minimize the aggregate regret (the expected number of misclassified arms). We prove that our algorithm is instance-wise asymptotically optimal. We also provide comprehensive empirical results demonstrating that the algorithm outperforms existing algorithms in a variety of scenarios.
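To make the problem setting concrete, the following is a minimal sketch of a thresholding bandit simulation. It is not the LSA algorithm from the paper: it uses a plain round-robin (uniform sampling) baseline, assumes Bernoulli rewards, and treats a tie with the threshold as "above". The function names `uniform_sampling_classify` and `aggregate_regret` are hypothetical and chosen only for illustration. The aggregate regret is estimated by Monte Carlo as the average number of misclassified arms.

```python
import random


def uniform_sampling_classify(means, theta, budget, rng):
    """Pull arms round-robin for `budget` trials, then label each arm
    by comparing its empirical mean to the threshold `theta`.

    Assumption: rewards are Bernoulli with the given means.
    This is a simple baseline, not the paper's LSA algorithm.
    """
    k = len(means)
    sums = [0.0] * k
    counts = [0] * k
    for t in range(budget):
        arm = t % k  # round-robin allocation of the budget
        reward = 1.0 if rng.random() < means[arm] else 0.0
        sums[arm] += reward
        counts[arm] += 1
    # An arm that was never pulled (budget < k) is labelled "below".
    return [counts[i] > 0 and sums[i] / counts[i] >= theta for i in range(k)]


def aggregate_regret(means, theta, budget, runs=2000, seed=0):
    """Monte Carlo estimate of the expected number of misclassified arms."""
    rng = random.Random(seed)
    truth = [m >= theta for m in means]
    total = 0
    for _ in range(runs):
        labels = uniform_sampling_classify(means, theta, budget, rng)
        total += sum(a != b for a, b in zip(labels, truth))
    return total / runs


if __name__ == "__main__":
    # Illustrative instance; the means and threshold are made up.
    example_means = [0.1, 0.35, 0.45, 0.6, 0.9]
    for budget in (50, 500):
        print(budget, aggregate_regret(example_means, theta=0.5, budget=budget))
```

In this sketch the arms whose means lie close to $\theta$ account for most of the misclassifications, which is why an adaptive allocation of the $T$ trials can do better than uniform sampling.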
Author Information
Chao Tao (Indiana University Bloomington)
Saúl Blanco (Indiana University)
Jian Peng (University of Illinois at Urbana-Champaign)
Yuan Zhou (UIUC)
More from the Same Authors
- 2021: Imitation Learning from Observations under Transition Model Disparity
  Tanmay Gangwani · Yuan Zhou · Jian Peng
- 2021: Hindsight Foresight Relabeling for Meta-Reinforcement Learning
  Michael Wan · Jian Peng · Tanmay Gangwani
- 2022 Poster: Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation
  Zhizhou Ren · Anji Liu · Yitao Liang · Jian Peng · Jianzhu Ma
- 2022 Poster: Antigen-Specific Antibody Design and Optimization with Diffusion-Based Generative Models for Protein Structures
  Shitong Luo · Yufeng Su · Xingang Peng · Sheng Wang · Jian Peng · Jianzhu Ma
- 2021 Poster: A 3D Generative Model for Structure-Based Drug Design
  Shitong Luo · Jiaqi Guan · Jianzhu Ma · Jian Peng
- 2020 Poster: Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition
  Zihan Zhang · Yuan Zhou · Xiangyang Ji
- 2020 Poster: Learning Guidance Rewards with Trajectory-space Smoothing
  Tanmay Gangwani · Yuan Zhou · Jian Peng
- 2020 Poster: Off-Policy Interval Estimation with Lipschitz Value Iteration
  Ziyang Tang · Yihao Feng · Na Zhang · Jian Peng · Qiang Liu
- 2019 Poster: Exploration via Hindsight Goal Generation
  Zhizhou Ren · Kefan Dong · Yuan Zhou · Qiang Liu · Jian Peng