Poster
Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms
Xutong Liu · Jinhang Zuo · Siwei Wang · Carlee Joe-Wong · John C.S. Lui · Wei Chen
In this paper, we study combinatorial semi-bandits (CMAB) and focus on reducing the dependency on the batch size $K$ in the regret bound, where $K$ is the total number of arms that can be pulled or triggered in each round. First, for the setting of CMAB with probabilistically triggered arms (CMAB-T), we discover a novel (directional) triggering probability and variance modulated (TPVM) condition that can replace the previously used smoothness condition for various applications, such as cascading bandits, online network exploration, and online influence maximization. Under this new condition, we propose a BCUCB-T algorithm with variance-aware confidence intervals and conduct a regret analysis that reduces the $O(K)$ factor to $O(\log K)$ or $O(\log^2 K)$ in the regret bound, significantly improving the regret bounds for the above applications. Second, for the setting of non-triggering CMAB with independent arms, we propose a SESCB algorithm that leverages the non-triggering version of the TPVM condition and completely removes the dependency on $K$ in the leading regret term. As a valuable by-product, the regret analysis used in this paper improves several existing results by a factor of $O(\log K)$. Finally, experimental evaluations show that our algorithms outperform benchmark algorithms in different applications.
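To illustrate the kind of index the abstract refers to, the sketch below shows a generic empirical-Bernstein-style variance-aware upper confidence bound for a single base arm. This is an assumption-laden illustration of the general technique, not the paper's actual BCUCB-T index, which additionally modulates the interval by triggering probabilities under the TPVM condition.

```python
import math

def variance_aware_ucb(mean_hat, var_hat, n_pulls, t):
    """Empirical-Bernstein-style UCB index for one base arm with rewards in [0, 1].

    Illustrative only: the paper's BCUCB-T algorithm uses its own
    triggering-probability- and variance-modulated confidence intervals.
    """
    if n_pulls == 0:
        return 1.0  # optimistic value for an unexplored arm
    log_term = math.log(max(t, 2))
    # The bonus shrinks when the empirical variance is small; exploiting
    # variance information in this way is what allows the analysis to
    # trade an O(K) factor for a logarithmic one.
    bonus = math.sqrt(2.0 * var_hat * log_term / n_pulls) + 3.0 * log_term / n_pulls
    return min(mean_hat + bonus, 1.0)
```

A combinatorial semi-bandit algorithm would compute such an index for every base arm and feed the optimistic estimates to an offline oracle that selects the action (super arm) for the round.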
Author Information
Xutong Liu (The Chinese University of Hong Kong)
Jinhang Zuo (University of Massachusetts at Amherst)
Siwei Wang (IIIS, Tsinghua University)
Carlee Joe-Wong (Carnegie Mellon University)
John C.S. Lui (Chinese University of Hong Kong)
Wei Chen (Microsoft Research)
More from the Same Authors
-
2021 : A Generalized and Distributable Generative Model for Private Representation Learning »
Sheikh Shams Azam · Taejin Kim · Seyyedali Hosseinalipour · Carlee Joe-Wong · Saurabh Bagchi · Christopher Brinton -
2022 Poster: Combinatorial Bandits with Linear Constraints: Beyond Knapsacks and Fairness »
Qingsong Liu · Weihang Xu · Siwei Wang · Zhixuan Fang -
2022 Poster: Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret »
Jiawei Huang · Li Zhao · Tao Qin · Wei Chen · Nan Jiang · Tie-Yan Liu -
2022 : FedRule: Federated Rule Recommendation System with Graph Neural Networks »
Yuhang Yao · Mohammad Mahdi Kamani · Zhongwei Cheng · Lin Chen · Carlee Joe-Wong · Tianqiang Liu -
2022 Spotlight: Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret »
Jiawei Huang · Li Zhao · Tao Qin · Wei Chen · Nan Jiang · Tie-Yan Liu -
2022 Spotlight: Lightning Talks 4A-1 »
Jiawei Huang · Su Jia · Abdurakhmon Sadiev · Ruomin Huang · Yuanyu Wan · Denizalp Goktas · Jiechao Guan · Andrew Li · Wei-Wei Tu · Li Zhao · Amy Greenwald · Jiawei Huang · Dmitry Kovalev · Yong Liu · Wenjie Liu · Peter Richtarik · Lijun Zhang · Zhiwu Lu · R Ravi · Tao Qin · Wei Chen · Hu Ding · Nan Jiang · Tie-Yan Liu -
2022 Poster: Matching in Multi-arm Bandit with Collision »
YiRui Zhang · Siwei Wang · Zhixuan Fang -
2021 Poster: Continuous Mean-Covariance Bandits »
Yihan Du · Siwei Wang · Zhixuan Fang · Longbo Huang -
2021 Poster: Combinatorial Pure Exploration with Bottleneck Reward Function »
Yihan Du · Yuko Kuroki · Wei Chen -
2021 Poster: The Hardness Analysis of Thompson Sampling for Combinatorial Semi-bandits with Greedy Oracle »
Fang Kong · Yueran Yang · Wei Chen · Shuai Li -
2020 Poster: Online Influence Maximization under Linear Threshold Model »
Shuai Li · Fang Kong · Kejie Tang · Qizhi Li · Wei Chen -
2020 Poster: Restless-UCB, an Efficient and Low-complexity Algorithm for Online Restless Bandits »
Siwei Wang · Longbo Huang · John C. S. Lui -
2019 Poster: Adaptive Influence Maximization with Myopic Feedback »
Binghui Peng · Wei Chen -
2018 Poster: Multi-armed Bandits with Compensation »
Siwei Wang · Longbo Huang -
2018 Poster: Community Exploration: From Offline Optimization to Online Learning »
Xiaowei Chen · Weiran Huang · Wei Chen · John C. S. Lui -
2017 Poster: Improving Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms and Its Applications »
Qinshi Wang · Wei Chen -
2017 Poster: Influence Maximization with $\varepsilon$-Almost Submodular Threshold Functions »
Qiang Li · Wei Chen · Xiaoming Sun · Jialin Zhang -
2016 Poster: Combinatorial Multi-Armed Bandit with General Reward Functions »
Wei Chen · Wei Hu · Fu Li · Jian Li · Yu Liu · Pinyan Lu -
2015 Poster: Stochastic Online Greedy Learning with Semi-bandit Feedbacks »
Tian Lin · Jian Li · Wei Chen -
2014 Poster: Combinatorial Pure Exploration of Multi-Armed Bandits »
Shouyuan Chen · Tian Lin · Irwin King · Michael R Lyu · Wei Chen -
2014 Oral: Combinatorial Pure Exploration of Multi-Armed Bandits »
Shouyuan Chen · Tian Lin · Irwin King · Michael R Lyu · Wei Chen