Timezone: »
Poster
Reinforcement Learning with Logarithmic Regret and Policy Switches
Grigoris Velegkas · Zhuoran Yang · Amin Karbasi
In this paper, we study the problem of regret minimization for episodic Reinforcement Learning (RL) both in the model-free and the model-based setting. We focus on learning with general function classes and general model classes, and we derive results that scale with the eluder dimension of these classes. In contrast to the existing body of work that mainly establishes instance-independent regret guarantees, we focus on the instance-dependent setting and show that the regret scales logarithmically with the horizon $T$, provided that there is a gap between the best and the second best action in every state. In addition, we show that such a logarithmic regret bound is realizable by algorithms with $O(\log T)$ switching cost (also known as adaptivity complexity). In other words, these algorithms rarely switch their policy during the course of their execution. Finally, we complement our results with lower bounds which show that even in the tabular setting, we cannot hope for regret guarantees lower than $O(\log T)$.
Author Information
Grigoris Velegkas (Yale University)
Zhuoran Yang (Yale University)
Amin Karbasi (Yale University)
More from the Same Authors
-
2022 : Exact Gradient Computation for Spiking Neural Networks »
Jane Lee · Saeid Haghighatshoar · Amin Karbasi -
2022 : Sparse Q-Learning: Offline Reinforcement Learning with Implicit Value Regularization »
Haoran Xu · Li Jiang · Li Jianxiong · Zhuoran Yang · Zhaoran Wang · Xianyuan Zhan -
2023 Poster: Optimal Learners for Realizable Regression: PAC Learning and Online Learning »
Idan Attias · Steve Hanneke · Alkis Kalavasis · Amin Karbasi · Grigoris Velegkas -
2023 Poster: Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization »
Liang Zhang · Junchi YANG · Amin Karbasi · Niao He -
2023 Poster: Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning »
Haoran He · Chenjia Bai · Kang Xu · Zhuoran Yang · Weinan Zhang · Dong Wang · Bin Zhao · Xuelong Li -
2023 Poster: Learning Regularized Monotone Graphon Mean-Field Games »
Fengzhuo Zhang · Vincent Tan · Zhaoran Wang · Zhuoran Yang -
2023 Poster: Posterior Sampling for Competitive RL: Function Approximation and Partial Observation »
Shuang Qiu · Ziyu Dai · Han Zhong · Zhaoran Wang · Zhuoran Yang · Tong Zhang -
2023 Poster: Online Performative Gradient Descent for Learning Nash Equilibria in Decision-Dependent Games »
Zihan Zhu · Ethan Fang · Zhuoran Yang -
2023 Poster: One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration »
Zhihan Liu · Miao Lu · WEI XIONG · Han Zhong · Hao Hu · Shenao Zhang · Sirui Zheng · Zhuoran Yang · Zhaoran Wang -
2023 Poster: Replicability in Reinforcement Learning »
Amin Karbasi · Grigoris Velegkas · Lin Yang · Felix Zhou -
2023 Poster: Replicable Clustering »
Hossein Esfandiari · Amin Karbasi · Vahab Mirrokni · Grigoris Velegkas · Felix Zhou -
2023 Oral: Optimal Learners for Realizable Regression: PAC Learning and Online Learning »
Idan Attias · Steve Hanneke · Alkis Kalavasis · Amin Karbasi · Grigoris Velegkas -
2022 Panel: Panel 1C-6: Debiased Self-Training for… & Universal Rates for… »
Grigoris Velegkas · Baixu Chen -
2022 Poster: Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence »
Boyi Liu · Jiayang Li · Zhuoran Yang · Hoi-To Wai · Mingyi Hong · Yu Nie · Zhaoran Wang -
2022 Poster: A Unifying Framework of Off-Policy General Value Function Evaluation »
Tengyu Xu · Zhuoran Yang · Zhaoran Wang · Yingbin Liang -
2022 Poster: Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL »
Fengzhuo Zhang · Boyi Liu · Kaixin Wang · Vincent Tan · Zhuoran Yang · Zhaoran Wang -
2022 Poster: Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets »
Yifei Min · Tianhao Wang · Ruitu Xu · Zhaoran Wang · Michael Jordan · Zhuoran Yang -
2022 Poster: Submodular Maximization in Clean Linear Time »
Wenxin Li · Moran Feldman · Ehsan Kazemi · Amin Karbasi -
2022 Poster: Universal Rates for Interactive Learning »
Steve Hanneke · Amin Karbasi · Shay Moran · Grigoris Velegkas -
2022 Poster: Black-Box Generalization: Stability of Zeroth-Order Learning »
Konstantinos Nikolakakis · Farzin Haddadpour · Dionysis Kalogerias · Amin Karbasi -
2022 Poster: Multiclass Learnability Beyond the PAC Framework: Universal Rates and Partial Concept Classes »
Alkis Kalavasis · Grigoris Velegkas · Amin Karbasi -
2022 Poster: Fast Neural Kernel Embeddings for General Activations »
Insu Han · Amir Zandieh · Jaehoon Lee · Roman Novak · Lechao Xiao · Amin Karbasi -
2022 Poster: On Optimal Learning Under Targeted Data Poisoning »
Steve Hanneke · Amin Karbasi · Mohammad Mahmoody · Idan Mehalel · Shay Moran -
2022 Poster: Exponential Family Model-Based Reinforcement Learning via Score Matching »
Gene Li · Junbo Li · Anmol Kabra · Nati Srebro · Zhaoran Wang · Zhuoran Yang -
2021 Poster: An Exponential Improvement on the Memorization Capacity of Deep Threshold Networks »
Shashank Rajput · Kartik Sreenivasan · Dimitris Papailiopoulos · Amin Karbasi -
2021 Poster: Multiple Descent: Design Your Own Generalization Curve »
Lin Chen · Yifei Min · Mikhail Belkin · Amin Karbasi -
2021 Poster: Parallelizing Thompson Sampling »
Amin Karbasi · Vahab Mirrokni · Mohammad Shadravan -
2021 Poster: Submodular + Concave »
Siddharth Mitra · Moran Feldman · Amin Karbasi -
2013 Poster: Noise-Enhanced Associative Memories »
Amin Karbasi · Amir Hesam Salavati · Amin Shokrollahi · Lav R Varshney -
2013 Poster: Distributed Submodular Maximization: Identifying Representative Elements in Massive Data »
Baharan Mirzasoleiman · Amin Karbasi · Rik Sarkar · Andreas Krause -
2013 Spotlight: Noise-Enhanced Associative Memories »
Amin Karbasi · Amir Hesam Salavati · Amin Shokrollahi · Lav R Varshney