Timezone: »

Poster
Better Best of Both Worlds Bounds for Bandits with Switching Costs
Idan Amir · Guy Azov · Tomer Koren · Roi Livni

Tue Nov 29 09:00 AM -- 11:00 AM (PST) @ Hall J #819
We study best-of-both-worlds algorithms for bandits with switching cost, recently addressed by Rouyer et al., 2021. We introduce a surprisingly simple and effective algorithm that simultaneously achieves minimax optimal regret bound (up to logarithmic factors) of $\mathcal{O}(T^{2/3})$ in the oblivious adversarial setting and a bound of $\mathcal{O}(\min\{\log (T)/\Delta^2,T^{2/3}\})$ in the stochastically-constrained regime, both with (unit) switching costs, where $\Delta$ is the gap between the arms. In the stochastically constrained case, our bound improves over previous results due to Rouyer et al., 2021, that achieved regret of $\mathcal{O}(T^{1/3}/\Delta)$. We accompany our results with a lower bound showing that, in general, $\tilde{\mathcal{\Omega}}(\min\{1/\Delta^2,T^{2/3}\})$ switching cost regret is unavoidable in the stochastically-constrained case for algorithms with $\mathcal{O}(T^{2/3})$ worst-case switching cost regret.

#### Author Information

##### Guy Azov (Tel Aviv University)

An MSc student at Tel-Aviv University under the Electrical Engineering faculty. My thesis is focusing on machine learning theory, especially Multi-Armed Bandits. I am also a research scientist at Intel's WCS (Wireless and Connectivity Solutions) Center of Excellence. I perform an end to end research and development of machine learning solutions to support the WCS organization at Intel. Hard worker, dedicated, eager to learn, and highly disciplined. Love sports, history, and traveling.