Timezone: »
Poster
Oracle-Efficient Algorithms for Online Linear Optimization with Bandit Feedback
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi
Tue Dec 10 05:30 PM -- 07:30 PM (PST) @ East Exhibition Hall B + C #20
We propose computationally efficient algorithms for \textit{online linear optimization with bandit feedback}, in which a player chooses an \textit{action vector} from a given (possibly infinite) set $\mathcal{A} \subseteq \mathbb{R}^d$, and then suffers a loss that can be expressed as a linear function in action vectors. Although existing algorithms achieve an optimal regret bound of $\tilde{O}(\sqrt{T})$ for $T$ rounds (ignoring factors of $\mathrm{poly} (d, \log T)$), computationally efficient ways of implementing them have not yet been specified, in particular when $|\mathcal{A}|$ is not bounded by a polynomial size in $d$. A standard way to pursue computational efficiency is to assume that we have an efficient algorithm referred to as \textit{oracle} that solves (offline) linear optimization problems over $\mathcal{A}$. Under this assumption, the computational efficiency of a bandit algorithm can then be measured in terms of \textit{oracle complexity}, i.e., the number of oracle calls. Our contribution is to propose algorithms that offer optimal regret bounds of $\tilde{O}(\sqrt{T})$ as well as low oracle complexity for both \textit{non-stochastic settings} and \textit{stochastic settings}. Our algorithm for non-stochastic settings has an oracle complexity of $\tilde{O}( T )$ and is the first algorithm that achieves both a regret bound of $\tilde{O}( \sqrt{T} )$ and an oracle complexity of $\tilde{O} ( \mathrm{poly} ( T ) )$, given only linear optimization oracles. Our algorithm for stochastic settings calls the oracle only $O( \mathrm{poly} (d, \log T))$ times, which is smaller than the current best oracle complexity of $O( T )$ if $T$ is sufficiently large.
Author Information
Shinji Ito (NEC Corporation)
Daisuke Hatano (RIKEN AIP)
Hanna Sumita (Tokyo Metropolitan University)
Kei Takemura (NEC Corporation)
Takuro Fukunaga (Chuo University, JST PRESTO, RIKEN AIP)
Naonori Kakimura (Keio University)
Ken-Ichi Kawarabayashi (National Institute of Informatics)
More from the Same Authors
-
2023 Poster: An Exploration-by-Optimization Approach to Best of Both Worlds in Linear Bandits »
Shinji Ito · Kei Takemura -
2023 Poster: Stability-penalty-adaptive follow-the-regularized-leader: Sparsity, game-dependency, and best-of-both-worlds »
Taira Tsuchiya · Shinji Ito · Junya Honda -
2023 Poster: Bandit Task Assignment with Unknown Processing Time »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2022 Poster: Average Sensitivity of Euclidean k-Clustering »
Yuichi Yoshida · Shinji Ito -
2022 Poster: Single Loop Gaussian Homotopy Method for Non-convex Optimization »
Hidenori Iwakiri · Yuhang Wang · Shinji Ito · Akiko Takeda -
2022 Poster: Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs »
Shinji Ito · Taira Tsuchiya · Junya Honda -
2021 Poster: On Optimal Robustness to Adversarial Corruption in Online Decision Problems »
Shinji Ito -
2021 Poster: Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits »
Shinji Ito -
2020 Poster: Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits »
Shinji Ito · Shuichi Hirahara · Tasuku Soma · Yuichi Yoshida -
2020 Poster: A Tight Lower Bound and Efficient Reduction for Swap Regret »
Shinji Ito -
2020 Poster: Delay and Cooperation in Nonstochastic Linear Bandits »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2020 Spotlight: A Tight Lower Bound and Efficient Reduction for Swap Regret »
Shinji Ito -
2020 Spotlight: Delay and Cooperation in Nonstochastic Linear Bandits »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2020 Spotlight: Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits »
Shinji Ito · Shuichi Hirahara · Tasuku Soma · Yuichi Yoshida -
2019 Poster: Improved Regret Bounds for Bandit Combinatorial Optimization »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Kei Takemura · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2019 Poster: Submodular Function Minimization with Noisy Evaluation Oracle »
Shinji Ito -
2018 Poster: Regret Bounds for Online Portfolio Selection with a Cardinality Constraint »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Akihiro Yabe · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi -
2017 Poster: Efficient Sublinear-Regret Algorithms for Online Sparse Linear Regression with Limited Observation »
Shinji Ito · Daisuke Hatano · Hanna Sumita · Akihiro Yabe · Takuro Fukunaga · Naonori Kakimura · Ken-Ichi Kawarabayashi