Timezone: »
Poster
A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback
Saeed Masoudian · Julian Zimmert · Yevgeny Seldin
We present a modified tuning of the algorithm of Zimmert and Seldin [2020] for adversarial multiarmed bandits with delayed feedback, which in addition to the minimax optimal adversarial regret guarantee shown by Zimmert and Seldin [2020] simultaneously achieves a near-optimal regret guarantee in the stochastic setting with fixed delays. Specifically, the adversarial regret guarantee is $\mathcal{O}(\sqrt{TK} + \sqrt{dT\log K})$, where $T$ is the time horizon, $K$ is the number of arms, and $d$ is the fixed delay, whereas the stochastic regret guarantee is $\mathcal{O}\left(\sum_{i \neq i^*}(\frac{1}{\Delta_i} \log(T) + \frac{d}{\Delta_{i}}) + d K^{1/3}\log K\right)$, where $\Delta_i$ are the suboptimality gaps. We also present an extension of the algorithm to the case of arbitrary delays, which is based on an oracle knowledge of the maximal delay $d_{max}$ and achieves $\mathcal{O}(\sqrt{TK} + \sqrt{D\log K} + d_{max}K^{1/3} \log K)$ regret in the adversarial regime, where $D$ is the total delay, and $\mathcal{O}\left(\sum_{i \neq i^*}(\frac{1}{\Delta_i} \log(T) + \frac{\sigma_{max}}{\Delta_{i}}) + d_{max}K^{1/3}\log K\right)$ regret in the stochastic regime, where $\sigma_{max}$ is the maximal number of outstanding observations. Finally, we present a lower bound that matches regret upper bound achieved by the skipping technique of Zimmert and Seldin [2020] in the adversarial setting.
Author Information
Saeed Masoudian (University of Copenhagen)
Julian Zimmert (Google Research)
Yevgeny Seldin (University of Copenhagen)
More from the Same Authors
-
2021 Spotlight: Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning »
Christoph Dann · Teodor Vanislavov Marinov · Mehryar Mohri · Julian Zimmert -
2023 Poster: Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits »
Haolin Liu · Chen-Yu Wei · Julian Zimmert -
2023 Poster: Optimal cross-learning for contextual bandits with unknown context distributions »
Jon Schneider · Julian Zimmert -
2022 Poster: A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs »
Chloé Rouyer · Dirk van der Hoeven · Nicolò Cesa-Bianchi · Yevgeny Seldin -
2022 Poster: Split-kl and PAC-Bayes-split-kl Inequalities for Ternary Random Variables »
Yi-Shan Wu · Yevgeny Seldin -
2022 Poster: Stochastic Online Learning with Feedback Graphs: Finite-Time and Asymptotic Optimality »
Teodor Vanislavov Marinov · Mehryar Mohri · Julian Zimmert -
2021 Poster: A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning »
Christoph Dann · Mehryar Mohri · Tong Zhang · Julian Zimmert -
2021 Poster: Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning »
Christoph Dann · Teodor Vanislavov Marinov · Mehryar Mohri · Julian Zimmert -
2021 Poster: Chebyshev-Cantelli PAC-Bayes-Bennett Inequality for the Weighted Majority Vote »
Yi-Shan Wu · Andres Masegosa · Stephan Lorenzen · Christian Igel · Yevgeny Seldin -
2021 Poster: The Pareto Frontier of model selection for general Contextual Bandits »
Teodor Vanislavov Marinov · Julian Zimmert -
2020 Poster: Model Selection in Contextual Stochastic Bandit Problems »
Aldo Pacchiano · My Phan · Yasin Abbasi Yadkori · Anup Rao · Julian Zimmert · Tor Lattimore · Csaba Szepesvari -
2020 Poster: Adapting to Misspecification in Contextual Bandits »
Dylan Foster · Claudio Gentile · Mehryar Mohri · Julian Zimmert -
2020 Poster: Second Order PAC-Bayesian Bounds for the Weighted Majority Vote »
Andres Masegosa · Stephan Lorenzen · Christian Igel · Yevgeny Seldin -
2020 Spotlight: Second Order PAC-Bayesian Bounds for the Weighted Majority Vote »
Andres Masegosa · Stephan Lorenzen · Christian Igel · Yevgeny Seldin -
2019 Poster: Nonstochastic Multiarmed Bandits with Unrestricted Delays »
Tobias Sommer Thune · Nicolò Cesa-Bianchi · Yevgeny Seldin -
2018 Poster: Factored Bandits »
Julian Zimmert · Yevgeny Seldin