Timezone: »
Poster
A Scale Free Algorithm for Stochastic Bandits with Bounded Kurtosis
Tor Lattimore
Existing strategies for finite-armed stochastic bandits mostly depend on a parameter of scale that must be known in advance. Sometimes this is in the form of a bound on the payoffs, or the knowledge of a variance or subgaussian parameter. The notable exceptions are the analysis of Gaussian bandits with unknown mean and variance by Cowan and Katehakis [2015a] and of uniform distributions with unknown support [Cowan and Katehakis, 2015b]. The results derived in these specialised cases are generalised here to the non-parametric setup, where the learner knows only a bound on the kurtosis of the noise, which is a scale free measure of the extremity of outliers.
Author Information
Tor Lattimore (DeepMind)
More from the Same Authors
-
2022 Poster: Regret Bounds for Information-Directed Reinforcement Learning »
Botao Hao · Tor Lattimore -
2018 Poster: TopRank: A practical algorithm for online stochastic ranking »
Tor Lattimore · Branislav Kveton · Shuai Li · Csaba Szepesvari -
2018 Poster: Single-Agent Policy Tree Search With Guarantees »
Laurent Orseau · Levi Lelis · Tor Lattimore · Theophane Weber -
2017 Poster: Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning »
Christoph Dann · Tor Lattimore · Emma Brunskill -
2017 Spotlight: Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning »
Christoph Dann · Tor Lattimore · Emma Brunskill -
2016 Poster: Refined Lower Bounds for Adversarial Bandits »
Sébastien Gerchinovitz · Tor Lattimore -
2016 Poster: Causal Bandits: Learning Good Interventions via Causal Inference »
Finnian Lattimore · Tor Lattimore · Mark Reid -
2016 Poster: Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities »
Ruitong Huang · Tor Lattimore · András György · Csaba Szepesvari -
2016 Poster: On Explore-Then-Commit strategies »
Aurélien Garivier · Tor Lattimore · Emilie Kaufmann -
2015 Poster: The Pareto Regret Frontier for Bandits »
Tor Lattimore -
2015 Poster: Linear Multi-Resource Allocation with Semi-Bandit Feedback »
Tor Lattimore · Yacov Crammer · Csaba Szepesvari -
2014 Poster: Bounded Regret for Finite-Armed Structured Bandits »
Tor Lattimore · Remi Munos