Poster
Agnostic $Q$-learning with Function Approximation in Deterministic Systems: Near-Optimal Bounds on Approximation Error and Sample Complexity
Simon Du · Jason Lee · Gaurav Mahajan · Ruosong Wang
The current paper studies the problem of agnostic $Q$-learning with function approximation in deterministic systems where the optimal $Q$-function is approximable by a function in the class $\mathcal{F}$ with approximation error $\delta \ge 0$. We propose a novel recursion-based algorithm and show that if $\delta = O\left(\rho/\sqrt{\dim_E}\right)$, then one can find the optimal policy using $O(\dim_E)$ trajectories, where $\rho$ is the gap between the optimal $Q$-value of the best actions and that of the second-best actions and $\dim_E$ is the Eluder dimension of $\mathcal{F}$. Our result has two implications:
\begin{enumerate}
\item In conjunction with the lower bound in [Du et al., 2020], our upper bound suggests that the condition $\delta = \widetilde{\Theta}\left(\rho/\sqrt{\dim_E}\right)$ is necessary and sufficient for algorithms with polynomial sample complexity.
\item In conjunction with the obvious lower bound in the tabular case, our upper bound suggests that the sample complexity $\widetilde{\Theta}\left(\dim_E\right)$ is tight in the agnostic setting.
\end{enumerate}
Therefore, we help address the open problem on agnostic $Q$-learning proposed in [Wen and Van Roy, 2013]. We further extend our algorithm to the stochastic reward setting and obtain similar results.
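For orientation, the quantities in the abstract can be written out as follows. This is a sketch under assumptions, not the paper's exact statement: it assumes the approximation error is measured in the sup-norm over state-action pairs, that the gap is taken over suboptimal actions only, and that $c$, $C$, and $N$ are hypothetical names for the unspecified constants and the number of trajectories hidden by the $O(\cdot)$ notation.

```latex
% Assumed formalization: sup-norm approximation error, gap over suboptimal actions.
% c, C > 0 are unspecified absolute constants; N is the number of trajectories.
\[
\exists f \in \mathcal{F} : \quad
\sup_{(s,a)} \left| f(s,a) - Q^*(s,a) \right| \le \delta ,
\qquad
\rho = \min_{(s,a) \,:\, Q^*(s,a) < \max_{a'} Q^*(s,a')}
\Bigl( \max_{a'} Q^*(s,a') - Q^*(s,a) \Bigr) .
\]
\[
\delta \le c \cdot \frac{\rho}{\sqrt{\dim_E}}
\quad \Longrightarrow \quad
\text{an optimal policy can be found with } N \le C \cdot \dim_E \text{ trajectories.}
\]
```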
Author Information
Simon Du (Institute for Advanced Study)
Jason Lee (Princeton University)
Gaurav Mahajan (University of California, San Diego)
Ruosong Wang (Carnegie Mellon University)
More from the Same Authors
- 2021 : Invited talk 7
  Jason Lee
- 2021 Poster: An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap
  Yuanhao Wang · Ruosong Wang · Sham Kakade
- 2021 Oral: An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap
  Yuanhao Wang · Ruosong Wang · Sham Kakade
- 2020 : Contributed Talk 6: What are the Statistical Limits for Batch RL with Linear Function Approximation?
  Ruosong Wang
- 2020 Poster: Generalized Leverage Score Sampling for Neural Networks
  Jason Lee · Ruoqi Shen · Zhao Song · Mengdi Wang · Zheng Yu
- 2020 Poster: Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters
  Kaiyi Ji · Jason Lee · Yingbin Liang · H. Vincent Poor
- 2020 Poster: Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality
  Yi Zhang · Orestis Plevrakis · Simon Du · Xingguo Li · Zhao Song · Sanjeev Arora
- 2020 Poster: Beyond Lazy Training for Over-parameterized Tensor Decomposition
  Xiang Wang · Chenwei Wu · Jason Lee · Tengyu Ma · Rong Ge
- 2020 Poster: Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
  Edward Moroshko · Blake Woodworth · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry
- 2020 Spotlight: Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
  Edward Moroshko · Blake Woodworth · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry
- 2020 Poster: Planning with General Objective Functions: Going Beyond Total Rewards
  Ruosong Wang · Peilin Zhong · Simon Du · Russ Salakhutdinov · Lin Yang
- 2020 Poster: Is Long Horizon RL More Difficult Than Short Horizon RL?
  Ruosong Wang · Simon Du · Lin Yang · Sham Kakade
- 2020 Poster: Preference-based Reinforcement Learning with Finite-Time Guarantees
  Yichong Xu · Ruosong Wang · Lin Yang · Aarti Singh · Artur Dubrawski
- 2020 Spotlight: Preference-based Reinforcement Learning with Finite-Time Guarantees
  Yichong Xu · Ruosong Wang · Lin Yang · Aarti Singh · Artur Dubrawski
- 2020 Poster: Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
  Jingtong Su · Yihang Chen · Tianle Cai · Tianhao Wu · Ruiqi Gao · Liwei Wang · Jason Lee
- 2020 Poster: On Reward-Free Reinforcement Learning with Linear Function Approximation
  Ruosong Wang · Simon Du · Lin Yang · Russ Salakhutdinov
- 2020 Poster: Towards Understanding Hierarchical Learning: Benefits of Neural Representations
  Minshuo Chen · Yu Bai · Jason Lee · Tuo Zhao · Huan Wang · Caiming Xiong · Richard Socher
- 2020 Poster: Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning
  Fei Feng · Ruosong Wang · Wotao Yin · Simon Du · Lin Yang
- 2020 Poster: Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension
  Ruosong Wang · Russ Salakhutdinov · Lin Yang
- 2020 Poster: How to Characterize The Landscape of Overparameterized Convolutional Neural Networks
  Yihong Gu · Weizhong Zhang · Cong Fang · Jason Lee · Tong Zhang
- 2020 Spotlight: Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning
  Fei Feng · Ruosong Wang · Wotao Yin · Simon Du · Lin Yang
- 2019 : Poster and Coffee Break 2
  Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall
- 2019 : Late-Breaking Papers (Talks)
  David Silver · Simon Du · Matthias Plappert
- 2019 : Poster Spotlight 2
  Aaron Sidford · Mengdi Wang · Lin Yang · Yinyu Ye · Zuyue Fu · Zhuoran Yang · Yongxin Chen · Zhaoran Wang · Ofir Nachum · Bo Dai · Ilya Kostrikov · Dale Schuurmans · Ziyang Tang · Yihao Feng · Lihong Li · Denny Zhou · Qiang Liu · Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Simon Du · Sham Kakade · Ruosong Wang · Minshuo Chen · Tianyi Liu · Xingguo Li · Zhaoran Wang · Tuo Zhao · Philip Amortila · Doina Precup · Prakash Panangaden · Marc Bellemare
- 2019 Poster: Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
  Colin Wei · Jason Lee · Qiang Liu · Tengyu Ma
- 2019 Spotlight: Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
  Colin Wei · Jason Lee · Qiang Liu · Tengyu Ma
- 2019 Poster: Towards Understanding the Importance of Shortcut Connections in Residual Networks
  Tianyi Liu · Minshuo Chen · Mo Zhou · Simon Du · Enlu Zhou · Tuo Zhao
- 2019 Poster: Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods
  Maher Nouiehed · Maziar Sanjabi · Tianjian Huang · Jason Lee · Meisam Razaviyayn
- 2019 Poster: Convergence of Adversarial Training in Overparametrized Neural Networks
  Ruiqi Gao · Tianle Cai · Haochuan Li · Cho-Jui Hsieh · Liwei Wang · Jason Lee
- 2019 Spotlight: Convergence of Adversarial Training in Overparametrized Neural Networks
  Ruiqi Gao · Tianle Cai · Haochuan Li · Cho-Jui Hsieh · Liwei Wang · Jason Lee
- 2019 Poster: Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels
  Simon Du · Kangcheng Hou · Russ Salakhutdinov · Barnabas Poczos · Ruosong Wang · Keyulu Xu
- 2019 Poster: Neural Temporal-Difference Learning Converges to Global Optima
  Qi Cai · Zhuoran Yang · Jason Lee · Zhaoran Wang
- 2019 Poster: Efficient Symmetric Norm Regression via Linear Sketching
  Zhao Song · Ruosong Wang · Lin Yang · Hongyang Zhang · Peilin Zhong
- 2019 Poster: Acceleration via Symplectic Discretization of High-Resolution Differential Equations
  Bin Shi · Simon Du · Weijie Su · Michael Jordan
- 2019 Poster: Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle
  Simon Du · Yuping Luo · Ruosong Wang · Hanrui Zhang
- 2019 Poster: On Exact Computation with an Infinitely Wide Neural Net
  Sanjeev Arora · Simon Du · Wei Hu · Zhiyuan Li · Russ Salakhutdinov · Ruosong Wang
- 2019 Spotlight: On Exact Computation with an Infinitely Wide Neural Net
  Sanjeev Arora · Simon Du · Wei Hu · Zhiyuan Li · Russ Salakhutdinov · Ruosong Wang