Timezone: »
Spotlight
Delay and Cooperation in Nonstochastic Linear Bandits
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi
Wed Dec 09 07:20 PM -- 07:30 PM (PST) @ Orals & Spotlights: Learning Theory
This paper offers a nearly optimal algorithm for online linear optimization with delayed bandit feedback. Online linear optimization with bandit feedback, or nonstochastic linear bandits, provides a generic framework for sequential decision-making problems with limited information. This framework, however, assumes that feedback can be observed just after choosing the action, and, hence, does not apply directly to many practical applications, in which the feedback can often only be obtained after a while. To cope with such situations, we consider problem settings in which the feedback can be observed $d$ rounds after the choice of an action, and propose an algorithm for which the expected regret is $\tilde{O}( \sqrt{m (m + d) T} )$, ignoring logarithmic factors in $m$ and $T$, where $m$ and $T$ denote the dimensionality of the action set and the number of rounds, respectively. This algorithm achieves nearly optimal performance, as we are able to show that arbitrary algorithms suffer the regret of $\Omega(\sqrt{m (m+d) T})$ in the worst case. To develop the algorithm, we introduce a technique we refer to as \textit{distribution truncation}, which plays an essential role in bounding the regret. We also apply our approach to cooperative bandits, as studied by Cesa-Bianchi et al. [17] and Bar-On and Mansour [12], and extend their results to the linear bandits setting.
Author Information
Shinji Ito (NEC Corporation)
Daisuke Hatano (RIKEN AIP)
Hanna Sumita (Tokyo Institute of Technology)
Kei Takemura (NEC Corporation)
Takuro Fukunaga (Chuo University)
Naonori Kakimura (Keio University)
Ken-Ichi Kawarabayashi (National Institute of Informatics)
Related Events (a corresponding poster, oral, or spotlight)
-
2020 Poster: Delay and Cooperation in Nonstochastic Linear Bandits »
Thu. Dec 10th 05:00 -- 07:00 AM Room Poster Session 4 #1281
More from the Same Authors
-
2023 Poster: An Exploration-by-Optimization Approach to Best of Both Worlds in Linear Bandits »
Shinji Ito · Kei Takemura -
2023 Poster: Stability-penalty-adaptive follow-the-regularized-leader: Sparsity, game-dependency, and best-of-both-worlds »
Taira Tsuchiya · Shinji Ito · Junya Honda -
2023 Poster: Bandit Task Assignment with Unknown Processing Time »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2022 Poster: Average Sensitivity of Euclidean k-Clustering »
Yuichi Yoshida · Shinji Ito -
2022 Poster: Single Loop Gaussian Homotopy Method for Non-convex Optimization »
Hidenori Iwakiri · Yuhang Wang · Shinji Ito · Akiko Takeda -
2022 Poster: Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs »
Shinji Ito · Taira Tsuchiya · Junya Honda -
2021 Poster: On Optimal Robustness to Adversarial Corruption in Online Decision Problems »
Shinji Ito -
2021 Poster: Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits »
Shinji Ito -
2020 Poster: Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits »
Shinji Ito · Shuichi Hirahara · Tasuku Soma · Yuichi Yoshida -
2020 Poster: A Tight Lower Bound and Efficient Reduction for Swap Regret »
Shinji Ito -
2020 Spotlight: A Tight Lower Bound and Efficient Reduction for Swap Regret »
Shinji Ito -
2020 Spotlight: Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits »
Shinji Ito · Shuichi Hirahara · Tasuku Soma · Yuichi Yoshida -
2019 Poster: Improved Regret Bounds for Bandit Combinatorial Optimization »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2019 Poster: Submodular Function Minimization with Noisy Evaluation Oracle »
Shinji Ito -
2019 Poster: Oracle-Efficient Algorithms for Online Linear Optimization with Bandit Feedback »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2018 Poster: Regret Bounds for Online Portfolio Selection with a Cardinality Constraint »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Akihiro Yabe · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2017 Poster: Efficient Sublinear-Regret Algorithms for Online Sparse Linear Regression with Limited Observation »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Akihiro Yabe · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi