Timezone: »
This study aims to develop bandit algorithms that automatically exploit tendencies of certain environments to improve performance, without any prior knowledge regarding the environments. We first propose an algorithm for combinatorial semi-bandits with a hybrid regret bound that includes two main features: a best-of-three-worlds guarantee and multiple data-dependent regret bounds. The former means that the algorithm will work nearly optimally in all environments in an adversarial setting, a stochastic setting, or a stochastic setting with adversarial corruptions. The latter implies that, even if the environment is far from exhibiting stochastic behavior, the algorithm will perform better as long as the environment is "easy" in terms of certain metrics. The metrics w.r.t. the easiness referred to in this paper include cumulative loss for optimal actions, total quadratic variation of losses, and path-length of a loss sequence. We also show hybrid data-dependent regret bounds for adversarial linear bandits, which include a first path-length regret bound that is tight up to logarithmic factors.
Author Information
Shinji Ito (NEC Corporation)
More from the Same Authors
-
2022 Poster: Average Sensitivity of Euclidean k-Clustering »
Yuichi Yoshida · Shinji Ito -
2022 Poster: Single Loop Gaussian Homotopy Method for Non-convex Optimization »
Hidenori Iwakiri · Yuhang Wang · Shinji Ito · Akiko Takeda -
2022 Poster: Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs »
Shinji Ito · Taira Tsuchiya · Junya Honda -
2021 Poster: On Optimal Robustness to Adversarial Corruption in Online Decision Problems »
Shinji Ito -
2020 Poster: Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits »
Shinji Ito · Shuichi Hirahara · Tasuku Soma · Yuichi Yoshida -
2020 Poster: A Tight Lower Bound and Efficient Reduction for Swap Regret »
Shinji Ito -
2020 Poster: Delay and Cooperation in Nonstochastic Linear Bandits »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2020 Spotlight: A Tight Lower Bound and Efficient Reduction for Swap Regret »
Shinji Ito -
2020 Spotlight: Delay and Cooperation in Nonstochastic Linear Bandits »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2020 Spotlight: Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits »
Shinji Ito · Shuichi Hirahara · Tasuku Soma · Yuichi Yoshida -
2019 Poster: Improved Regret Bounds for Bandit Combinatorial Optimization »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2019 Poster: Submodular Function Minimization with Noisy Evaluation Oracle »
Shinji Ito -
2019 Poster: Oracle-Efficient Algorithms for Online Linear Optimization with Bandit Feedback »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi