Timezone: »
Poster
Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
Lilian Besson · Emilie Kaufmann · Odalric-Ambrym Maillard · Julien Seznec
We introduce GLRklUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, klUCB, with an efficient, parameter-free, change-point detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. Unlike previous non-stationary bandit algorithms using a change-point detector, GLRklUCB does not need to be calibrated based on prior knowledge on the arms' means. We prove that this algorithm can attain a $O(\sqrt{TA\Upsilon_T\log(T)})$ regret in $T$ rounds on some ``easy'' instances in which there is sufficient delay between two change-points, where $A$ is the number of arms and $\Upsilon_T$ the number of change-points, without prior knowledge of $\Upsilon_T$. In contrast with recently proposed algorithms that are agnostic to $\Upsilon_T$, we perform a numerical study showing that GLRklUCB is also very efficient in practice, beyond easy instances.
Author Information
Lilian Besson
Emilie Kaufmann (CNRS)
Odalric-Ambrym Maillard (INRIA Lille Nord Europe)
Julien Seznec
More from the Same Authors
-
2021 Spotlight: Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits »
Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet -
2023 Poster: Adaptive Algorithms for Relaxed Pareto Set Identification »
Cyrille KONE · Emilie Kaufmann · Laura Richert -
2023 Poster: Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits »
Dorian Baudry · Fabien Pesquerel · Rémy Degenne · Odalric-Ambrym Maillard -
2023 Poster: An $\varepsilon$-Best-Arm Identification Algorithm for Fixed-Confidence and Beyond »
Marc Jourdan · Rémy Degenne · Emilie Kaufmann -
2022 Poster: Top Two Algorithms Revisited »
Marc Jourdan · Rémy Degenne · Dorian Baudry · Rianne de Heide · Emilie Kaufmann -
2022 Poster: IMED-RL: Regret optimal learning of ergodic Markov decision processes »
Fabien Pesquerel · Odalric-Ambrym Maillard -
2022 Poster: Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs »
Andrea Tirinzoni · Aymen Al Marjani · Emilie Kaufmann -
2022 Poster: Near-Optimal Collaborative Learning in Bandits »
Clémence Réda · Sattar Vakili · Emilie Kaufmann -
2021 Poster: Stochastic bandits with groups of similar arms. »
Fabien Pesquerel · Hassan SABER · Odalric-Ambrym Maillard -
2021 Poster: Indexed Minimum Empirical Divergence for Unimodal Bandits »
Hassan SABER · Pierre Ménard · Odalric-Ambrym Maillard -
2021 Poster: Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge »
Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet -
2021 Poster: From Optimality to Robustness: Adaptive Re-Sampling Strategies in Stochastic Bandits »
Dorian Baudry · Patrick Saux · Odalric-Ambrym Maillard -
2021 Poster: Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits »
Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet -
2020 Poster: Sub-sampling for Efficient Non-Parametric Bandit Exploration »
Dorian Baudry · Emilie Kaufmann · Odalric-Ambrym Maillard -
2020 Spotlight: Sub-sampling for Efficient Non-Parametric Bandit Exploration »
Dorian Baudry · Emilie Kaufmann · Odalric-Ambrym Maillard -
2020 Poster: Planning in Markov Decision Processes with Gap-Dependent Sample Complexity »
Anders Jonsson · Emilie Kaufmann · Pierre Menard · Omar Darwiche Domingues · Edouard Leurent · Michal Valko