Timezone: »
Top two algorithms arose as an adaptation of Thompson sampling to best arm identification in multi-armed bandit models for parametric families of arms. They select the next arm to sample from by randomizing among two candidate arms, a leader and a challenger. Despite their good empirical performance, theoretical guarantees for fixed-confidence best arm identification have only been obtained when the arms are Gaussian with known variances. In this paper, we provide a general analysis of top-two methods, which identifies desirable properties of the leader, the challenger, and the (possibly non-parametric) distributions of the arms. As a result, we obtain theoretically supported top-two algorithms for best arm identification with bounded distributions. Our proof method demonstrates in particular that the sampling step used to select the leader inherited from Thompson sampling can be replaced by other choices, like selecting the empirical best arm.
Author Information
Marc Jourdan (INRIA)
Rémy Degenne (Inria)
Dorian Baudry (CNRS/Inria)
Rianne de Heide (Vrije Universiteit Amsterdam)
Emilie Kaufmann (CNRS)
More from the Same Authors
-
2023 Poster: Adaptive Algorithms for Relaxed Pareto Set Identification »
Cyrille Kone · Emilie Kaufmann · Laura Richert -
2023 Poster: Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits »
Dorian Baudry · Fabien Pesquerel · Rémy Degenne · Odalric-Ambrym Maillard -
2023 Poster: An $\varepsilon$-Best-Arm Identification Algorithm for Fixed-Confidence and Beyond »
Marc Jourdan · Rémy Degenne · Emilie Kaufmann -
2023 Poster: On the Complexity of Differentially Private Best-Arm Identification with Fixed Confidence »
Achraf Azize · Marc Jourdan · Aymen Al Marjani · Debabrota Basu -
2023 Poster: Non-Asymptotic Analysis of a UCB-based Top Two Algorithm »
Marc Jourdan · Rémy Degenne -
2022 Poster: Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs »
Andrea Tirinzoni · Aymen Al Marjani · Emilie Kaufmann -
2022 Poster: On Elimination Strategies for Bandit Fixed-Confidence Identification »
Andrea Tirinzoni · Rémy Degenne -
2022 Poster: Near-Optimal Collaborative Learning in Bandits »
Clémence Réda · Sattar Vakili · Emilie Kaufmann -
2022 Poster: Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits »
Lilian Besson · Emilie Kaufmann · Odalric-Ambrym Maillard · Julien Seznec -
2021 Poster: From Optimality to Robustness: Adaptive Re-Sampling Strategies in Stochastic Bandits »
Dorian Baudry · Patrick Saux · Odalric-Ambrym Maillard -
2020 Poster: Sub-sampling for Efficient Non-Parametric Bandit Exploration »
Dorian Baudry · Emilie Kaufmann · Odalric-Ambrym Maillard -
2020 Spotlight: Sub-sampling for Efficient Non-Parametric Bandit Exploration »
Dorian Baudry · Emilie Kaufmann · Odalric-Ambrym Maillard -
2020 Poster: Planning in Markov Decision Processes with Gap-Dependent Sample Complexity »
Anders Jonsson · Emilie Kaufmann · Pierre Menard · Omar Darwiche Domingues · Edouard Leurent · Michal Valko -
2019 Poster: Pure Exploration with Multiple Correct Answers »
Rémy Degenne · Wouter Koolen -
2019 Poster: Non-Asymptotic Pure Exploration by Solving Games »
Rémy Degenne · Wouter Koolen · Pierre Ménard -
2016 Poster: Combinatorial semi-bandit with known covariance »
Rémy Degenne · Vianney Perchet