Timezone: »
Poster
Adapting to Misspecification in Contextual Bandits
Dylan Foster · Claudio Gentile · Mehryar Mohri · Julian Zimmert
A major research direction in contextual bandits is to
develop algorithms that are computationally efficient, yet support
flexible, general-purpose function approximation. Algorithms based on
modeling rewards have shown strong empirical performance, yet
typically require a well-specified model, and can fail when this
assumption does not hold. Can we design algorithms that are efficient
and flexible, yet degrade gracefully in the face of model
misspecification? We introduce a new family of oracle-efficient algorithms for $\varepsilon$-misspecified contextual bandits
that adapt to unknown model misspecification---both for finite and
infinite action settings. Given access to an
\emph{online oracle} for square loss regression, our algorithm attains
optimal regret and---in particular---optimal dependence on the
misspecification level, with \emph{no prior knowledge}. Specializing
to linear contextual bandits with infinite actions in $d$ dimensions,
we obtain the first algorithm that achieves the optimal
$\bigoht(d\sqrt{T} + \varepsilon\sqrt{d}T)$ regret bound for unknown $\varepsilon$.
On a conceptual level, our results are enabled by a new
optimization-based perspective on the regression oracle reduction framework of
Foster and Rakhlin (2020), which we believe will be useful more broadly.
Author Information
Dylan Foster (MIT)
Claudio Gentile (Google Research)
Mehryar Mohri (Google Research & Courant Institute of Mathematical Sciences)
Julian Zimmert (Google)
More from the Same Authors
-
2021 Spotlight: Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations »
Ayush Sekhari · Christoph Dann · Mehryar Mohri · Yishay Mansour · Karthik Sridharan -
2021 Spotlight: On the Existence of The Adversarial Bayes Classifier »
Pranjal Awasthi · Natalie Frank · Mehryar Mohri -
2021 Spotlight: Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning »
Christoph Dann · Teodor Vanislavov Marinov · Mehryar Mohri · Julian Zimmert -
2021 Spotlight: Calibration and Consistency of Adversarial Surrogate Losses »
Pranjal Awasthi · Natalie Frank · Anqi Mao · Mehryar Mohri · Yutao Zhong -
2022 : A Theory of Learning with Competing Objectives and User Feedback »
Pranjal Awasthi · Corinna Cortes · Yishay Mansour · Mehryar Mohri -
2022 : A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations »
Sohan Rudra · Saksham Goel · Anirban Santara · Claudio Gentile · Laurent Perron · Fei Xia · Vikas Sindhwani · Carolina Parada · Gaurav Aggarwal -
2022 : AdaME: Adaptive learning of multisource adaptationensembles »
Scott Yak · Javier Gonzalvo · Mehryar Mohri · Corinna Cortes -
2022 : A Theory of Learning with Competing Objectives and User Feedback »
Pranjal Awasthi · Corinna Cortes · Yishay Mansour · Mehryar Mohri -
2022 : A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations »
Sohan Rudra · Saksham Goel · Anirban Santara · Claudio Gentile · Laurent Perron · Fei Xia · Vikas Sindhwani · Carolina Parada · Gaurav Aggarwal -
2022 Spotlight: Lightning Talks 6A-2 »
Yichuan Mo · Botao Yu · Gang Li · Zezhong Xu · Haoran Wei · Arsene Fansi Tchango · Raef Bassily · Haoyu Lu · Qi Zhang · Songming Liu · Mingyu Ding · Peiling Lu · Yifei Wang · Xiang Li · Dongxian Wu · Ping Guo · Wen Zhang · Hao Zhongkai · Mehryar Mohri · Rishab Goel · Yisen Wang · Yifei Wang · Yangguang Zhu · Zhi Wen · Ananda Theertha Suresh · Chengyang Ying · Yujie Wang · Peng Ye · Rui Wang · Nanyi Fei · Hui Chen · Yiwen Guo · Wei Hu · Chenglong Liu · Julien Martel · Yuqi Huo · Wu Yichao · Hang Su · Yisen Wang · Peng Wang · Huajun Chen · Xu Tan · Jun Zhu · Ding Liang · Zhiwu Lu · Joumana Ghosn · Shanshan Zhang · Wei Ye · Ze Cheng · Shikun Zhang · Tao Qin · Tie-Yan Liu -
2022 Spotlight: Differentially Private Learning with Margin Guarantees »
Raef Bassily · Mehryar Mohri · Ananda Theertha Suresh -
2022 : A Theory of Learning with Competing Objectives and User Feedback »
Pranjal Awasthi · Corinna Cortes · Yishay Mansour · Mehryar Mohri -
2022 : Invited Talk #1, Differentially Private Learning with Margin Guarantees, Mehryar Mohri »
Mehryar Mohri -
2022 Poster: Interaction-Grounded Learning with Action-Inclusive Feedback »
Tengyang Xie · Akanksha Saran · Dylan J Foster · Lekan Molu · Ida Momennejad · Nan Jiang · Paul Mineiro · John Langford -
2022 Poster: Multi-Class $H$-Consistency Bounds »
Pranjal Awasthi · Anqi Mao · Mehryar Mohri · Yutao Zhong -
2022 Poster: Stochastic Online Learning with Feedback Graphs: Finite-Time and Asymptotic Optimality »
Teodor Vanislavov Marinov · Mehryar Mohri · Julian Zimmert -
2022 Poster: A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback »
Saeed Masoudian · Julian Zimmert · Yevgeny Seldin -
2022 Poster: Differentially Private Learning with Margin Guarantees »
Raef Bassily · Mehryar Mohri · Ananda Theertha Suresh -
2022 Poster: Understanding the Eluder Dimension »
Gene Li · Pritish Kamath · Dylan J Foster · Nati Srebro -
2022 Poster: On the Complexity of Adversarial Decision Making »
Dylan J Foster · Alexander Rakhlin · Ayush Sekhari · Karthik Sridharan -
2021 Poster: A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning »
Christoph Dann · Mehryar Mohri · Tong Zhang · Julian Zimmert -
2021 Poster: On the Existence of The Adversarial Bayes Classifier »
Pranjal Awasthi · Natalie Frank · Mehryar Mohri -
2021 Poster: Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning »
Christoph Dann · Teodor Vanislavov Marinov · Mehryar Mohri · Julian Zimmert -
2021 Poster: Learning with User-Level Privacy »
Daniel Levy · Ziteng Sun · Kareem Amin · Satyen Kale · Alex Kulesza · Mehryar Mohri · Ananda Theertha Suresh -
2021 Poster: Boosting with Multiple Sources »
Corinna Cortes · Mehryar Mohri · Dmitry Storcheus · Ananda Theertha Suresh -
2021 Poster: Breaking the centralized barrier for cross-device federated learning »
Sai Praneeth Karimireddy · Martin Jaggi · Satyen Kale · Mehryar Mohri · Sashank Reddi · Sebastian Stich · Ananda Theertha Suresh -
2021 Poster: Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations »
Ayush Sekhari · Christoph Dann · Mehryar Mohri · Yishay Mansour · Karthik Sridharan -
2021 Oral: Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination »
Dylan Foster · Akshay Krishnamurthy -
2021 Poster: Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination »
Dylan Foster · Akshay Krishnamurthy -
2021 Poster: The Pareto Frontier of model selection for general Contextual Bandits »
Teodor Vanislavov Marinov · Julian Zimmert -
2021 Poster: Calibration and Consistency of Adversarial Surrogate Losses »
Pranjal Awasthi · Natalie Frank · Anqi Mao · Mehryar Mohri · Yutao Zhong -
2020 Poster: Model Selection in Contextual Stochastic Bandit Problems »
Aldo Pacchiano · My Phan · Yasin Abbasi Yadkori · Anup Rao · Julian Zimmert · Tor Lattimore · Csaba Szepesvari -
2020 Poster: Agnostic Learning with Multiple Objectives »
Corinna Cortes · Mehryar Mohri · Javier Gonzalvo · Dmitry Storcheus -
2020 Poster: Reinforcement Learning with Feedback Graphs »
Christoph Dann · Yishay Mansour · Mehryar Mohri · Ayush Sekhari · Karthik Sridharan -
2020 Poster: Learning the Linear Quadratic Regulator from Nonlinear Observations »
Zakaria Mhammedi · Dylan Foster · Max Simchowitz · Dipendra Misra · Wen Sun · Akshay Krishnamurthy · Alexander Rakhlin · John Langford -
2020 Poster: PAC-Bayes Learning Bounds for Sample-Dependent Priors »
Pranjal Awasthi · Satyen Kale · Stefani Karp · Mehryar Mohri -
2020 Session: Orals & Spotlights Track 11: Learning Theory »
Dylan Foster · Nicolò Cesa-Bianchi -
2020 Poster: Independent Policy Gradient Methods for Competitive Reinforcement Learning »
Constantinos Daskalakis · Dylan Foster · Noah Golowich -
2020 : Real World RL with Vowpal Wabbit: Beyond Contextual Bandits »
John Langford · Marek Wydmuch · Maryam Majzoubi · Adith Swaminathan · · Dylan Foster · Paul Mineiro -
2019 : Mehryar Mohri, "Learning with Sample-Dependent Hypothesis Sets" »
Mehryar Mohri -
2019 : Poster Session »
Gergely Flamich · Shashanka Ubaru · Charles Zheng · Josip Djolonga · Kristoffer Wickstrøm · Diego Granziol · Konstantinos Pitas · Jun Li · Robert Williamson · Sangwoong Yoon · Kwot Sin Lee · Julian Zilly · Linda Petrini · Ian Fischer · Zhe Dong · Alexander Alemi · Bao-Ngoc Nguyen · Rob Brekelmans · Tailin Wu · Aditya Mahajan · Alexander Li · Kirankumar Shiragur · Yair Carmon · Linara Adilova · SHIYU LIU · Bang An · Sanjeeb Dash · Oktay Gunluk · Arya Mazumdar · Mehul Motani · Julia Rosenzweig · Michael Kamp · Marton Havasi · Leighton P Barnes · Zhengqing Zhou · Yi Hao · Dylan Foster · Yuval Benjamini · Nati Srebro · Michael Tschannen · Paul Rubenstein · Sylvain Gelly · John Duchi · Aaron Sidford · Robin Ru · Stefan Zohren · Murtaza Dalal · Michael A Osborne · Stephen J Roberts · Moses Charikar · Jayakumar Subramanian · Xiaodi Fan · Max Schwarzer · Nicholas Roberts · Simon Lacoste-Julien · Vinay Prabhu · Aram Galstyan · Greg Ver Steeg · Lalitha Sankar · Yung-Kyun Noh · Gautam Dasarathy · Frank Park · Ngai-Man (Man) Cheung · Ngoc-Trung Tran · Linxiao Yang · Ben Poole · Andrea Censi · Tristan Sylvain · R Devon Hjelm · Bangjie Liu · Jose Gallego-Posada · Tyler Sypherd · Kai Yang · Jan Nikolas Morshuis -
2019 Poster: Learning GANs and Ensembles Using Discrepancy »
Ben Adlam · Corinna Cortes · Mehryar Mohri · Ningshan Zhang -
2019 Poster: Bandits with Feedback Graphs and Switching Costs »
Raman Arora · Teodor Vanislavov Marinov · Mehryar Mohri -
2019 Poster: Model Selection for Contextual Bandits »
Dylan Foster · Akshay Krishnamurthy · Haipeng Luo -
2019 Spotlight: Model Selection for Contextual Bandits »
Dylan Foster · Akshay Krishnamurthy · Haipeng Luo -
2019 Poster: Regularized Gradient Boosting »
Corinna Cortes · Mehryar Mohri · Dmitry Storcheus -
2019 Poster: Hypothesis Set Stability and Generalization »
Dylan Foster · Spencer Greenberg · Satyen Kale · Haipeng Luo · Mehryar Mohri · Karthik Sridharan -
2018 Poster: Policy Regret in Repeated Games »
Raman Arora · Michael Dinitz · Teodor Vanislavov Marinov · Mehryar Mohri -
2018 Poster: Contextual bandits with surrogate losses: Margin bounds and efficient algorithms »
Dylan Foster · Akshay Krishnamurthy -
2018 Poster: Factored Bandits »
Julian Zimmert · Yevgeny Seldin -
2018 Poster: Efficient Gradient Computation for Structured Output Learning with Rational and Tropical Losses »
Corinna Cortes · Vitaly Kuznetsov · Mehryar Mohri · Dmitry Storcheus · Scott Yang -
2018 Poster: Algorithms and Theory for Multiple-Source Adaptation »
Judy Hoffman · Mehryar Mohri · Ningshan Zhang -
2018 Poster: Uniform Convergence of Gradients for Non-Convex Learning and Optimization »
Dylan Foster · Ayush Sekhari · Karthik Sridharan -
2017 : Mehryar Mohri (NYU) on Tight Learning Bounds for Multi-Class Classification »
Mehryar Mohri -
2017 : (Invited Talk) Mehryar Mohri: Regret minimization against strategic buyers. »
Mehryar Mohri -
2017 Poster: Discriminative State Space Models »
Vitaly Kuznetsov · Mehryar Mohri -
2017 Poster: Spectrally-normalized margin bounds for neural networks »
Peter Bartlett · Dylan J Foster · Matus Telgarsky -
2017 Spotlight: Spectrally-normalized margin bounds for neural networks »
Peter Bartlett · Dylan J Foster · Matus Telgarsky -
2017 Poster: Online Learning with Transductive Regret »
Scott Yang · Mehryar Mohri -
2017 Poster: Parameter-Free Online Learning via Model Selection »
Dylan J Foster · Satyen Kale · Mehryar Mohri · Karthik Sridharan -
2017 Spotlight: Parameter-Free Online Learning via Model Selection »
Dylan J Foster · Satyen Kale · Mehryar Mohri · Karthik Sridharan -
2017 Spotlight: Online Learning with Transductive Regret »
Scott Yang · Mehryar Mohri -
2016 Poster: Structured Prediction Theory Based on Factor Graph Complexity »
Corinna Cortes · Vitaly Kuznetsov · Mehryar Mohri · Scott Yang -
2016 Poster: Learning in Games: Robustness of Fast Convergence »
Dylan Foster · zhiyuan li · Thodoris Lykouris · Karthik Sridharan · Eva Tardos -
2016 Poster: Boosting with Abstention »
Corinna Cortes · Giulia DeSalvo · Mehryar Mohri -
2016 Poster: Optimistic Bandit Convex Optimization »
Scott Yang · Mehryar Mohri -
2016 Tutorial: Theory and Algorithms for Forecasting Non-Stationary Time Series »
Vitaly Kuznetsov · Mehryar Mohri -
2015 : A Theory of Multiple Source Adaptation »
Mehryar Mohri -
2015 : Discussion Panel »
Tim van Erven · Wouter Koolen · Peter Grünwald · Shai Ben-David · Dylan Foster · Satyen Kale · Gergely Neu -
2015 : Adaptive Online Learning »
Dylan Foster -
2015 : Learning Theory and Algorithms for Time Series »
Mehryar Mohri -
2015 Poster: Revenue Optimization against Strategic Buyers »
Mehryar Mohri · Andres Munoz -
2015 Poster: Learning Theory and Algorithms for Forecasting Non-stationary Time Series »
Vitaly Kuznetsov · Mehryar Mohri -
2015 Poster: Adaptive Online Learning »
Dylan Foster · Alexander Rakhlin · Karthik Sridharan -
2015 Spotlight: Adaptive Online Learning »
Dylan Foster · Alexander Rakhlin · Karthik Sridharan -
2015 Oral: Learning Theory and Algorithms for Forecasting Non-stationary Time Series »
Vitaly Kuznetsov · Mehryar Mohri -
2014 Workshop: Second Workshop on Transfer and Multi-Task Learning: Theory meets Practice »
Urun Dogan · Tatiana Tommasi · Yoshua Bengio · Francesco Orabona · Marius Kloft · Andres Munoz · Gunnar Rätsch · Hal Daumé III · Mehryar Mohri · Xuezhi Wang · Daniel Hernández-lobato · Song Liu · Thomas Unterthiner · Pascal Germain · Vinay P Namboodiri · Michael Goetz · Christopher Berlind · Sigurd Spieckermann · Marta Soare · Yujia Li · Vitaly Kuznetsov · Wenzhao Lian · Daniele Calandriello · Emilie Morvant -
2014 Workshop: NIPS Workshop on Transactional Machine Learning and E-Commerce »
David Parkes · David H Wolpert · Jennifer Wortman Vaughan · Jacob D Abernethy · Amos Storkey · Mark Reid · Ping Jin · Nihar Bhadresh Shah · Mehryar Mohri · Luis E Ortiz · Robin Hanson · Aaron Roth · Satyen Kale · Sebastien Lahaie -
2014 Poster: Optimal Regret Minimization in Posted-Price Auctions with Strategic Buyers »
Mehryar Mohri · Andres Munoz -
2014 Poster: Multi-Class Deep Boosting »
Vitaly Kuznetsov · Mehryar Mohri · Umar Syed -
2014 Spotlight: Optimal Regret Minimization in Posted-Price Auctions with Strategic Buyers »
Mehryar Mohri · Andres Munoz -
2014 Session: Oral Session 6 »
Mehryar Mohri -
2014 Poster: Conditional Swap Regret and Conditional Correlated Equilibrium »
Mehryar Mohri · Scott Yang -
2013 Poster: Learning Kernels Using Local Rademacher Complexity »
Corinna Cortes · Marius Kloft · Mehryar Mohri -
2013 Spotlight: Learning Kernels Using Local Rademacher Complexity »
Corinna Cortes · Marius Kloft · Mehryar Mohri -
2012 Poster: Accuracy at the Top »
Stephen Boyd · Corinna Cortes · Mehryar Mohri · Ana Radovanovic -
2012 Poster: Spectral Learning of General Weighted Automata via Constrained Matrix Completion »
Borja Balle · Mehryar Mohri -
2012 Oral: Spectral Learning of General Weighted Automata via Constrained Matrix Completion »
Borja Balle · Mehryar Mohri -
2011 Workshop: Sparse Representation and Low-rank Approximation »
Ameet S Talwalkar · Lester W Mackey · Mehryar Mohri · Michael W Mahoney · Francis Bach · Mike Davies · Remi Gribonval · Guillaume R Obozinski -
2010 Workshop: Low-rank Methods for Large-scale Machine Learning »
Arthur Gretton · Michael W Mahoney · Mehryar Mohri · Ameet S Talwalkar -
2010 Poster: Learning Bounds for Importance Weighting »
Corinna Cortes · Yishay Mansour · Mehryar Mohri -
2009 Poster: Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models »
Gideon S Mann · Ryan McDonald · Mehryar Mohri · Nathan Silberman · Dan Walker -
2009 Poster: Ensemble Nystrom Method »
Sanjiv Kumar · Mehryar Mohri · Ameet S Talwalkar -
2009 Spotlight: Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models »
Gideon S Mann · Ryan McDonald · Mehryar Mohri · Nathan Silberman · Dan Walker -
2009 Poster: Learning Non-Linear Combinations of Kernels »
Corinna Cortes · Mehryar Mohri · Afshin Rostamizadeh -
2009 Poster: Polynomial Semantic Indexing »
Bing Bai · Jason E Weston · David Grangier · Ronan Collobert · Kunihiko Sadamasa · Yanjun Qi · Corinna Cortes · Mehryar Mohri -
2008 Workshop: Kernel Learning: Automatic Selection of Optimal Kernels »
Corinna Cortes · Arthur Gretton · Gert Lanckriet · Mehryar Mohri · Afshin Rostamizadeh -
2008 Poster: Domain Adaptation with Multiple Sources »
Yishay Mansour · Mehryar Mohri · Afshin Rostamizadeh -
2008 Spotlight: Domain Adaptation with Multiple Sources »
Yishay Mansour · Mehryar Mohri · Afshin Rostamizadeh -
2008 Poster: Rademacher Complexity Bounds for Non-I.I.D. Processes »
Mehryar Mohri · Afshin Rostamizadeh -
2007 Poster: Stability Bounds for Non-i.i.d. Processes »
Mehryar Mohri · Afshin Rostamizadeh -
2006 Poster: On Transductive Regression »
Corinna Cortes · Mehryar Mohri