Timezone: »
In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions. It then predicts whether the mean of each distribution is larger or lower than a given threshold. We introduce a large family of algorithms (containing most existing relevant ones), inspired by the Frank-Wolfe algorithm, and provide a thorough yet generic analysis of their performance. This allowed us to construct new explicit algorithms, for a broad class of problems, whose losses are within a small constant factor of the non-adaptive oracle ones. Quite interestingly, we observed that adaptive methodsempirically greatly out-perform non-adaptive oracles, an uncommon behavior in standard online learning settings, such as regret minimization. We explain this surprising phenomenon on an insightful toy problem.
Author Information
Reda Ouhamma (Université de Lille)
Odalric-Ambrym Maillard (INRIA Lille Nord Europe)
Vianney Perchet (ENSAE & Criteo AI Lab)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Poster: Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits »
Tue. Dec 7th 04:30 -- 06:00 PM Room
More from the Same Authors
-
2021 Spotlight: Decentralized Learning in Online Queuing Systems »
Flore Sentenac · Etienne Boursier · Vianney Perchet -
2023 Poster: Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits »
Dorian Baudry · Fabien Pesquerel · Rémy Degenne · Odalric-Ambrym Maillard -
2022 Poster: IMED-RL: Regret optimal learning of ergodic Markov decision processes »
Fabien Pesquerel · Odalric-Ambrym Maillard -
2022 Poster: Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits »
Lilian Besson · Emilie Kaufmann · Odalric-Ambrym Maillard · Julien Seznec -
2021 Poster: Local Differential Privacy for Regret Minimization in Reinforcement Learning »
Evrard Garcelon · Vianney Perchet · Ciara Pike-Burke · Matteo Pirotta -
2021 Poster: Stochastic bandits with groups of similar arms. »
Fabien Pesquerel · Hassan SABER · Odalric-Ambrym Maillard -
2021 Poster: ROI Maximization in Stochastic Online Decision-Making »
Nicolò Cesa-Bianchi · Tom Cesari · Yishay Mansour · Vianney Perchet -
2021 Poster: Making the most of your day: online learning for optimal allocation of time »
Etienne Boursier · Tristan Garrec · Vianney Perchet · Marco Scarsini -
2021 Poster: Indexed Minimum Empirical Divergence for Unimodal Bandits »
Hassan SABER · Pierre Ménard · Odalric-Ambrym Maillard -
2021 Poster: Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge »
Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet -
2021 Poster: From Optimality to Robustness: Adaptive Re-Sampling Strategies in Stochastic Bandits »
Dorian Baudry · Patrick Saux · Odalric-Ambrym Maillard -
2021 Poster: Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm »
Nathan Noiry · Vianney Perchet · Flore Sentenac -
2021 Poster: Decentralized Learning in Online Queuing Systems »
Flore Sentenac · Etienne Boursier · Vianney Perchet -
2017 Poster: Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe »
Quentin Berthet · Vianney Perchet -
2017 Spotlight: Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe »
Quentin Berthet · Vianney Perchet