Poster
Improved Regret Bounds for Bandit Combinatorial Optimization
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi
Wed Dec 11 05:00 PM -- 07:00 PM (PST) @ East Exhibition Hall B + C #1
Bandit combinatorial optimization is a bandit framework in which, in each round, a player chooses an action from a given finite set $\mathcal{A} \subseteq \{ 0, 1 \}^d$ and incurs a loss equal to the inner product of the chosen action and an unobservable loss vector in $\mathbb{R}^d$. In this paper, we aim to reveal the property that makes bandit combinatorial optimization hard. Recently, Cohen et al. (2017) obtained a regret lower bound of $\Omega(\sqrt{d k^3 T / \log T})$, where $k$ is the maximum $\ell_1$-norm of the action vectors and $T$ is the number of rounds. This lower bound was established by considering a continuous, strongly correlated distribution of losses. Our main contribution is to improve this bound by a factor of $\sqrt{\log T}$, to $\Omega(\sqrt{d k^3 T})$, which we achieve by means of strongly correlated losses with binary values. The improved bound yields tighter regret lower bounds for three specific instances of bandit combinatorial optimization: the multitask bandit, bandit ranking, and the multiple-play bandit. In particular, the bound obtained for bandit ranking resolves an open problem raised in Cohen et al. (2017). In addition, we demonstrate that the problem becomes easier without correlations among the entries of the loss vectors. In fact, if each entry of the loss vector is an independent random variable, then one can achieve a regret of $\tilde{O}(\sqrt{d k^2 T})$, which is $\sqrt{k}$ times smaller than the lower bound above. These results indicate that correlation among losses is the cause of large regret.
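As context, a standard formalization of the regret in this setting (not quoted from the abstract) is the following, where $a_1, \dots, a_T \in \mathcal{A}$ are the player's chosen actions and $\ell_1, \dots, \ell_T \in \mathbb{R}^d$ are the loss vectors:

$$
R_T \;=\; \mathbb{E}\!\left[ \sum_{t=1}^{T} \ell_t^{\top} a_t \right] \;-\; \min_{a \in \mathcal{A}} \, \mathbb{E}\!\left[ \sum_{t=1}^{T} \ell_t^{\top} a \right].
$$

The three examples above correspond to structured choices of $\mathcal{A}$; for instance, the multiple-play bandit takes $\mathcal{A} = \{ a \in \{0,1\}^d : \|a\|_1 = k \}$, so each action plays exactly $k$ of the $d$ arms and $k$ matches the norm parameter in the bounds.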
Author Information
Shinji Ito (NEC Corporation)
Daisuke Hatano (RIKEN AIP)
Hanna Sumita (Tokyo Metropolitan University)
Kei Takemura (NEC Corporation)
Takuro Fukunaga (Chuo University, JST PRESTO, RIKEN AIP)
Naonori Kakimura (Keio University)
Ken-Ichi Kawarabayashi (National Institute of Informatics)
More from the Same Authors
- 2022 Poster: Average Sensitivity of Euclidean k-Clustering »
  Yuichi Yoshida · Shinji Ito
- 2022 Poster: Single Loop Gaussian Homotopy Method for Non-convex Optimization »
  Hidenori Iwakiri · Yuhang Wang · Shinji Ito · Akiko Takeda
- 2022 Poster: Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs »
  Shinji Ito · Taira Tsuchiya · Junya Honda
- 2021 Poster: On Optimal Robustness to Adversarial Corruption in Online Decision Problems »
  Shinji Ito
- 2021 Poster: Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits »
  Shinji Ito
- 2020 Poster: Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits »
  Shinji Ito · Shuichi Hirahara · Tasuku Soma · Yuichi Yoshida
- 2020 Poster: A Tight Lower Bound and Efficient Reduction for Swap Regret »
  Shinji Ito
- 2020 Poster: Delay and Cooperation in Nonstochastic Linear Bandits »
  Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi
- 2020 Spotlight: A Tight Lower Bound and Efficient Reduction for Swap Regret »
  Shinji Ito
- 2020 Spotlight: Delay and Cooperation in Nonstochastic Linear Bandits »
  Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi
- 2020 Spotlight: Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits »
  Shinji Ito · Shuichi Hirahara · Tasuku Soma · Yuichi Yoshida
- 2019 Poster: Submodular Function Minimization with Noisy Evaluation Oracle »
  Shinji Ito
- 2019 Poster: Oracle-Efficient Algorithms for Online Linear Optimization with Bandit Feedback »
  Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi
- 2018 Poster: Regret Bounds for Online Portfolio Selection with a Cardinality Constraint »
  Shinji Ito · Daisuke Hatano · Hanna Sumita · Akihiro Yabe · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi
- 2017 Poster: Efficient Sublinear-Regret Algorithms for Online Sparse Linear Regression with Limited Observation »
  Shinji Ito · Daisuke Hatano · Hanna Sumita · Akihiro Yabe · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi