Timezone: »
Poster
Fighting Bandits with a New Kind of Smoothness
Jacob D Abernethy · Chansoo Lee · Ambuj Tewari
We focus on the adversarial multi-armed bandit problem. The EXP3 algorithm of Auer et al. (2003) was shown to have a regret bound of $O(\sqrt{T N \log N})$, where $T$ is the time horizon and $N$ is the number of available actions (arms). More recently, Audibert and Bubeck (2009) improved the bound by a logarithmic factor via an entirely different method. In the present work, we provide a new set of analysis tools, using the notion of convex smoothing, to provide several novel algorithms with optimal guarantees. First we show that regularization via the Tsallis entropy matches the minimax rate of Audibert and Bubeck (2009) with an even tighter constant; it also fully generalizes EXP3. Second we show that a wide class of perturbation methods lead to near-optimal bandit algorithms as long as a simple condition on the perturbation distribution $\mathcal{D}$ is met: one needs that the hazard function of $\mathcal{D}$ remain bounded. The Gumbel, Weibull, Frechet, Pareto, and Gamma distributions all satisfy this key property; interestingly, the Gaussian and Uniform distributions do not.
Author Information
Jacob D Abernethy (University of Michigan)
Chansoo Lee (University of Michigan)
Ambuj Tewari (University of Michigan)
More from the Same Authors
-
2021 Spotlight: Representation Learning Beyond Linear Prediction Functions »
Ziping Xu · Ambuj Tewari -
2022 : RL Boltzmann Generators for Conformer Generation in Data-Sparse Environments »
Yash Patel · Ambuj Tewari -
2022 : Probabilistically Robust PAC Learning »
Vinod Raman · Ambuj Tewari · UNIQUE SUBEDI -
2022 Poster: Adaptive Sampling for Discovery »
Ziping Xu · Eunjae Shim · Ambuj Tewari · Paul Zimmerman -
2022 Poster: Online Agnostic Multiclass Boosting »
Vinod Raman · Ambuj Tewari -
2021 Poster: Representation Learning Beyond Linear Prediction Functions »
Ziping Xu · Ambuj Tewari -
2021 Poster: Causal Bandits with Unknown Graph Structure »
Yangyi Lu · Amirhossein Meisami · Ambuj Tewari -
2020 Poster: TorsionNet: A Reinforcement Learning Approach to Sequential Conformer Search »
Tarun Gogineni · Ziping Xu · Exequiel Punzalan · Runxuan Jiang · Joshua Kammeraad · Ambuj Tewari · Paul Zimmerman -
2020 Poster: Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting »
Ziping Xu · Ambuj Tewari -
2020 Poster: On the Equivalence between Online and Private Learnability beyond Binary Classification »
Young H Jung · Baekjin Kim · Ambuj Tewari -
2020 Spotlight: On the Equivalence between Online and Private Learnability beyond Binary Classification »
Young H Jung · Baekjin Kim · Ambuj Tewari -
2019 : Poster and Coffee Break 1 »
Aaron Sidford · Aditya Mahajan · Alejandro Ribeiro · Alex Lewandowski · Ali H Sayed · Ambuj Tewari · Angelika Steger · Anima Anandkumar · Asier Mujika · Hilbert J Kappen · Bolei Zhou · Byron Boots · Chelsea Finn · Chen-Yu Wei · Chi Jin · Ching-An Cheng · Christina Yu · Clement Gehring · Craig Boutilier · Dahua Lin · Daniel McNamee · Daniel Russo · David Brandfonbrener · Denny Zhou · Devesh Jha · Diego Romeres · Doina Precup · Dominik Thalmeier · Eduard Gorbunov · Elad Hazan · Elena Smirnova · Elvis Dohmatob · Emma Brunskill · Enrique Munoz de Cote · Ethan Waldie · Florian Meier · Florian Schaefer · Ge Liu · Gergely Neu · Haim Kaplan · Hao Sun · Hengshuai Yao · Jalaj Bhandari · James A Preiss · Jayakumar Subramanian · Jiajin Li · Jieping Ye · Jimmy Smith · Joan Bas Serrano · Joan Bruna · John Langford · Jonathan Lee · Jose A. Arjona-Medina · Kaiqing Zhang · Karan Singh · Yuping Luo · Zafarali Ahmed · Zaiwei Chen · Zhaoran Wang · Zhizhong Li · Zhuoran Yang · Ziping Xu · Ziyang Tang · Yi Mao · David Brandfonbrener · Shirli Di-Castro · Riashat Islam · Zuyue Fu · Abhishek Naik · Saurabh Kumar · Benjamin Petit · Angeliki Kamoutsi · Simone Totaro · Arvind Raghunathan · Rui Wu · Donghwan Lee · Dongsheng Ding · Alec Koppel · Hao Sun · Christian Tjandraatmadja · Mahdi Karami · Jincheng Mei · Chenjun Xiao · Junfeng Wen · Zichen Zhang · Ross Goroshin · Mohammad Pezeshki · Jiaqi Zhai · Philip Amortila · Shuo Huang · Mariya Vasileva · El houcine Bergou · Adel Ahmadyan · Haoran Sun · Sheng Zhang · Lukas Gruber · Yuanhao Wang · Tetiana Parshakova -
2019 Poster: Generalization Bounds in the Predict-then-Optimize Framework »
Othman El Balghiti · Adam N. Elmachtoub · Paul Grigas · Ambuj Tewari -
2019 Poster: Online Learning via the Differential Privacy Lens »
Jacob Abernethy · Young H Jung · Chansoo Lee · Audra McMillan · Ambuj Tewari -
2019 Spotlight: Online Learning via the Differential Privacy Lens »
Jacob Abernethy · Young H Jung · Chansoo Lee · Audra McMillan · Ambuj Tewari -
2019 Poster: Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems »
Young H Jung · Ambuj Tewari -
2019 Poster: On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems »
Baekjin Kim · Ambuj Tewari -
2018 Workshop: CiML 2018 - Machine Learning competitions "in the wild": Playing in the real world or in real time »
Isabelle Guyon · Evelyne Viegas · Sergio Escalera · Jacob D Abernethy -
2018 : Building Algorithms by Playing Games »
Jacob D Abernethy -
2018 Poster: Active Learning for Non-Parametric Regression Using Purely Random Trees »
Jack Goetz · Ambuj Tewari · Paul Zimmerman -
2018 Poster: But How Does It Work in Theory? Linear SVM with Random Features »
Yitong Sun · Anna Gilbert · Ambuj Tewari -
2017 Workshop: Machine Learning Challenges as a Research Tool »
Isabelle Guyon · Evelyne Viegas · Sergio Escalera · Jacob D Abernethy -
2017 Poster: Action Centered Contextual Bandits »
Kristjan Greenewald · Ambuj Tewari · Susan Murphy · Predag Klasnja -
2017 Poster: Online multiclass boosting »
Young H Jung · Jack Goetz · Ambuj Tewari -
2017 Poster: On Frank-Wolfe and Equilibrium Computation »
Jacob D Abernethy · Jun-Kun Wang -
2017 Spotlight: On Frank-Wolfe and Equilibrium Computation »
Jacob D Abernethy · Jun-Kun Wang -
2016 Poster: Hardness of Online Sleeping Combinatorial Optimization Problems »
Satyen Kale · Chansoo Lee · David Pal -
2016 Poster: Phased Exploration with Greedy Exploitation in Stochastic Combinatorial Partial Monitoring Games »
Sougata Chaudhuri · Ambuj Tewari -
2016 Poster: Threshold Bandits, With and Without Censored Feedback »
Jacob D Abernethy · Kareem Amin · Ruihao Zhu -
2015 Poster: Predtron: A Family of Online Algorithms for General Prediction Problems »
Prateek Jain · Nagarajan Natarajan · Ambuj Tewari -
2015 Poster: A Market Framework for Eliciting Private Data »
Bo Waggoner · Rafael Frongillo · Jacob D Abernethy -
2015 Poster: Alternating Minimization for Regression Problems with Vector-valued Outputs »
Prateek Jain · Ambuj Tewari -
2014 Workshop: Large-scale reinforcement learning and Markov decision problems »
Benjamin Van Roy · Mohammad Ghavamzadeh · Peter Bartlett · Yasin Abbasi Yadkori · Ambuj Tewari -
2014 Workshop: NIPS Workshop on Transactional Machine Learning and E-Commerce »
David Parkes · David H Wolpert · Jennifer Wortman Vaughan · Jacob D Abernethy · Amos Storkey · Mark Reid · Ping Jin · Nihar Bhadresh Shah · Mehryar Mohri · Luis E Ortiz · Robin Hanson · Aaron Roth · Satyen Kale · Sebastien Lahaie -
2014 Poster: On Iterative Hard Thresholding Methods for High-dimensional M-Estimation »
Prateek Jain · Ambuj Tewari · Purushottam Kar -
2013 Poster: Minimax Optimal Algorithms for Unconstrained Linear Optimization »
Brendan McMahan · Jacob D Abernethy -
2013 Poster: Convex Calibrated Surrogates for Low-Rank Loss Matrices with Applications to Subset Ranking Losses »
Harish G Ramaswamy · Shivani Agarwal · Ambuj Tewari -
2013 Poster: Adaptive Market Making via Online Learning »
Jacob D Abernethy · Satyen Kale -
2013 Poster: How to Hedge an Option Against an Adversary: Black-Scholes Pricing is Minimax Optimal »
Jacob D Abernethy · Peter Bartlett · Rafael Frongillo · Andre Wibisono -
2013 Spotlight: How to Hedge an Option Against an Adversary: Black-Scholes Pricing is Minimax Optimal »
Jacob D Abernethy · Peter Bartlett · Rafael Frongillo · Andre Wibisono -
2013 Oral: Adaptive Market Making via Online Learning »
Jacob D Abernethy · Satyen Kale -
2013 Spotlight: Convex Calibrated Surrogates for Low-Rank Loss Matrices with Applications to Subset Ranking Losses »
Harish G Ramaswamy · Shivani Agarwal · Ambuj Tewari -
2013 Poster: Learning with Noisy Labels »
Nagarajan Natarajan · Inderjit Dhillon · Pradeep Ravikumar · Ambuj Tewari -
2012 Poster: Feature Clustering for Accelerating Parallel Coordinate Descent »
Chad Scherrer · Ambuj Tewari · Mahantesh Halappanavar · David Haglin -
2011 Poster: A Collaborative Mechanism for Crowdsourcing Prediction Problems »
Jacob D Abernethy · Rafael Frongillo -
2011 Oral: A Collaborative Mechanism for Crowdsourcing Prediction Problems »
Jacob D Abernethy · Rafael Frongillo -
2011 Poster: Greedy Algorithms for Structurally Constrained High Dimensional Problems »
Ambuj Tewari · Pradeep Ravikumar · Inderjit Dhillon -
2011 Poster: On the Universality of Online Mirror Descent »
Nati Srebro · Karthik Sridharan · Ambuj Tewari -
2011 Poster: Nearest Neighbor based Greedy Coordinate Descent »
Inderjit Dhillon · Pradeep Ravikumar · Ambuj Tewari -
2011 Poster: Online Learning: Stochastic, Constrained, and Smoothed Adversaries »
Sasha Rakhlin · Karthik Sridharan · Ambuj Tewari -
2011 Poster: Orthogonal Matching Pursuit with Replacement »
Prateek Jain · Ambuj Tewari · Inderjit Dhillon -
2010 Oral: Online Learning: Random Averages, Combinatorial Parameters, and Learnability »
Sasha Rakhlin · Karthik Sridharan · Ambuj Tewari -
2010 Poster: Online Learning: Random Averages, Combinatorial Parameters, and Learnability »
Sasha Rakhlin · Karthik Sridharan · Ambuj Tewari -
2010 Poster: Smoothness, Low Noise and Fast Rates »
Nati Srebro · Karthik Sridharan · Ambuj Tewari -
2010 Poster: Repeated Games against Budgeted Adversaries »
Jacob D Abernethy · Manfred K. Warmuth -
2008 Poster: On the Generalization Ability of Online Strongly Convex Programming Algorithms »
Sham M Kakade · Ambuj Tewari -
2008 Spotlight: On the Generalization Ability of Online Strongly Convex Programming Algorithms »
Sham M Kakade · Ambuj Tewari -
2008 Poster: On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization »
Sham M Kakade · Karthik Sridharan · Ambuj Tewari -
2007 Poster: Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs »
Ambuj Tewari · Peter Bartlett -
2006 Poster: Sample Complexity of Policy Search with Known Dynamics »
Peter Bartlett · Ambuj Tewari