Deep Reinforcement Learning (RL) powered by neural net approximation of the Q function has had enormous empirical success. While the theory of RL has traditionally focused on linear function approximation (or eluder dimension) approaches, little is known about nonlinear RL with neural net approximations of the Q function. This is the focus of this work, where we study function approximation with two-layer neural networks (considering both ReLU and polynomial activation functions). Our first result is a computationally and statistically efficient algorithm in the generative model setting under completeness for two-layer neural networks. Our second result considers the same setting but assumes only realizability of the neural net function class. Here, assuming deterministic dynamics, the sample complexity scales linearly in the algebraic dimension. In all cases, our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
Author Information
Baihe Huang (Peking University)
Kaixuan Huang (Princeton University)
Sham Kakade (Harvard University & Microsoft Research)
Jason Lee (University of Southern California)
Qi Lei (Princeton University)
Runzhe Wang (Princeton University)
Jiaqi Yang (Tsinghua University)
More from the Same Authors
-
2021 : Policy Mirror Descent for Regularized RL: A Generalized Framework with Linear Convergence »
Wenhao Zhan · Shicong Cen · Baihe Huang · Yuxin Chen · Jason Lee · Yuejie Chi -
2022 : Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability »
Alex Damian · Eshaan Nichani · Jason Lee -
2022 Spotlight: Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime »
Difan Zou · Jingfeng Wu · Vladimir Braverman · Quanquan Gu · Sham Kakade -
2022 Poster: Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials »
Eshaan Nichani · Yu Bai · Jason Lee -
2022 Poster: Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems »
Masatoshi Uehara · Ayush Sekhari · Jason Lee · Nathan Kallus · Wen Sun -
2022 Poster: Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent »
Zhiyuan Li · Tianhao Wang · Jason Lee · Sanjeev Arora -
2022 Poster: On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias »
Itay Safran · Gal Vardi · Jason Lee -
2022 Poster: From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent »
Christopher De Sa · Satyen Kale · Jason Lee · Ayush Sekhari · Karthik Sridharan -
2021 : Invited Speaker Panel »
Sham Kakade · Minmin Chen · Philip Thomas · Angela Schoellig · Barbara Engelhardt · Doina Precup · George Tucker -
2021 : Q&A for Sham Kakade »
Sham Kakade -
2021 : Generalization theory in Offline RL »
Sham Kakade -
2021 Poster: How Fine-Tuning Allows for Effective Meta-Learning »
Kurtland Chua · Qi Lei · Jason Lee -
2021 Poster: The Benefits of Implicit Regularization from SGD in Least Squares Problems »
Difan Zou · Jingfeng Wu · Vladimir Braverman · Quanquan Gu · Dean Foster · Sham Kakade -
2021 Poster: Robust and differentially private mean estimation »
Xiyang Liu · Weihao Kong · Sham Kakade · Sewoong Oh -
2021 Poster: An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap »
Yuanhao Wang · Ruosong Wang · Sham Kakade -
2021 Poster: Label Noise SGD Provably Prefers Flat Global Minimizers »
Alex Damian · Tengyu Ma · Jason Lee -
2021 Poster: Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP »
Zihan Zhang · Jiaqi Yang · Xiangyang Ji · Simon Du -
2021 Poster: LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes »
Aditya Kusupati · Matthew Wallingford · Vivek Ramanujan · Raghav Somani · Jae Sung Park · Krishna Pillutla · Prateek Jain · Sham Kakade · Ali Farhadi -
2021 Poster: Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias »
Kaifeng Lyu · Zhiyuan Li · Runzhe Wang · Sanjeev Arora -
2021 Poster: Gone Fishing: Neural Active Learning with Fisher Embeddings »
Jordan Ash · Surbhi Goel · Akshay Krishnamurthy · Sham Kakade -
2021 Poster: Fast Federated Learning in the Presence of Arbitrary Device Unavailability »
Xinran Gu · Kaixuan Huang · Jingzhao Zhang · Longbo Huang -
2021 Poster: Predicting What You Already Know Helps: Provable Self-Supervised Learning »
Jason Lee · Qi Lei · Nikunj Saunshi · JIACHENG ZHUO -
2021 Poster: Optimal Gradient-based Algorithms for Non-concave Bandit Optimization »
Baihe Huang · Kaixuan Huang · Sham Kakade · Jason Lee · Qi Lei · Runzhe Wang · Jiaqi Yang -
2021 Poster: Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature »
Kefan Dong · Jiaqi Yang · Tengyu Ma -
2021 Oral: An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap »
Yuanhao Wang · Ruosong Wang · Sham Kakade -
2020 Tutorial: (Track3) Policy Optimization in Reinforcement Learning Q&A »
Sham M Kakade · Martha White · Nicolas Le Roux -
2020 Poster: Robust Meta-learning for Mixed Linear Regression with Small Batches »
Weihao Kong · Raghav Somani · Sham Kakade · Sewoong Oh -
2020 Poster: Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? --- A Neural Tangent Kernel Perspective »
Kaixuan Huang · Yuqing Wang · Molei Tao · Tuo Zhao -
2020 Poster: Is Long Horizon RL More Difficult Than Short Horizon RL? »
Ruosong Wang · Simon Du · Lin Yang · Sham Kakade -
2020 Poster: FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs »
Alekh Agarwal · Sham Kakade · Akshay Krishnamurthy · Wen Sun -
2020 Poster: PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning »
Alekh Agarwal · Mikael Henaff · Sham Kakade · Wen Sun -
2020 Poster: Sample-Efficient Reinforcement Learning of Undercomplete POMDPs »
Chi Jin · Sham Kakade · Akshay Krishnamurthy · Qinghua Liu -
2020 Spotlight: Sample-Efficient Reinforcement Learning of Undercomplete POMDPs »
Chi Jin · Sham Kakade · Akshay Krishnamurthy · Qinghua Liu -
2020 Oral: FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs »
Alekh Agarwal · Sham Kakade · Akshay Krishnamurthy · Wen Sun -
2020 Poster: Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity »
Kaiqing Zhang · Sham Kakade · Tamer Basar · Lin Yang -
2020 Poster: Information Theoretic Regret Bounds for Online Nonlinear Control »
Sham Kakade · Akshay Krishnamurthy · Kendall Lowrey · Motoya Ohnishi · Wen Sun -
2020 Spotlight: Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity »
Kaiqing Zhang · Sham Kakade · Tamer Basar · Lin Yang -
2020 Tutorial: (Track3) Policy Optimization in Reinforcement Learning »
Sham M Kakade · Martha White · Nicolas Le Roux -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Spotlight 2 »
Aaron Sidford · Mengdi Wang · Lin Yang · Yinyu Ye · Zuyue Fu · Zhuoran Yang · Yongxin Chen · Zhaoran Wang · Ofir Nachum · Bo Dai · Ilya Kostrikov · Dale Schuurmans · Ziyang Tang · Yihao Feng · Lihong Li · Denny Zhou · Qiang Liu · Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Simon Du · Sham Kakade · Ruosong Wang · Minshuo Chen · Tianyi Liu · Xingguo Li · Zhaoran Wang · Tuo Zhao · Philip Amortila · Doina Precup · Prakash Panangaden · Marc Bellemare -
2019 : The Provable Effectiveness of Policy Gradient Methods in Reinforcement Learning »
Sham Kakade -
2019 Poster: Inverting Deep Generative models, One layer at a time »
Qi Lei · Ajil Jalal · Inderjit Dhillon · Alex Dimakis -
2019 Poster: The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares »
Rong Ge · Sham Kakade · Rahul Kidambi · Praneeth Netrapalli -
2019 Poster: Primal-Dual Block Generalized Frank-Wolfe »
Qi Lei · JIACHENG ZHUO · Constantine Caramanis · Inderjit Dhillon · Alex Dimakis -
2019 Poster: Meta-Learning with Implicit Gradients »
Aravind Rajeswaran · Chelsea Finn · Sham Kakade · Sergey Levine -
2018 : Contributed Talk 1 »
Jason Lee -
2018 Poster: A Smoother Way to Train Structured Prediction Models »
Krishna Pillutla · Vincent Roulet · Sham Kakade · Zaid Harchaoui -
2018 Poster: Implicit Bias of Gradient Descent on Linear Convolutional Networks »
Suriya Gunasekar · Jason Lee · Daniel Soudry · Nati Srebro -
2018 Poster: Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced »
Simon Du · Wei Hu · Jason Lee -
2018 Poster: Adding One Neuron Can Eliminate All Bad Local Minima »
SHIYU LIANG · Ruoyu Sun · Jason Lee · R. Srikant -
2018 Poster: Provably Correct Automatic Sub-Differentiation for Qualified Programs »
Sham Kakade · Jason Lee -
2018 Poster: Hessian-based Analysis of Large Batch Training and Robustness to Adversaries »
Zhewei Yao · Amir Gholami · Qi Lei · Kurt Keutzer · Michael Mahoney -
2018 Poster: On the Convergence and Robustness of Training GANs with Regularized Optimal Transport »
Maziar Sanjabi · Jimmy Ba · Meisam Razaviyayn · Jason Lee -
2017 Poster: A Greedy Approach for Budgeted Maximum Inner Product Search »
Hsiang-Fu Yu · Cho-Jui Hsieh · Qi Lei · Inderjit Dhillon -
2017 Poster: Learning Overcomplete HMMs »
Vatsal Sharan · Sham Kakade · Percy Liang · Gregory Valiant -
2017 Poster: Gradient Descent Can Take Exponential Time to Escape Saddle Points »
Simon Du · Chi Jin · Jason D Lee · Michael Jordan · Aarti Singh · Barnabas Poczos -
2017 Spotlight: Gradient Descent Can Take Exponential Time to Escape Saddle Points »
Simon Du · Chi Jin · Jason D Lee · Michael Jordan · Aarti Singh · Barnabas Poczos -
2017 Poster: Towards Generalization and Simplicity in Continuous Control »
Aravind Rajeswaran · Kendall Lowrey · Emanuel Todorov · Sham Kakade -
2016 Poster: Coordinate-wise Power Method »
Qi Lei · Kai Zhong · Inderjit Dhillon -
2016 Poster: Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent »
Chi Jin · Sham Kakade · Praneeth Netrapalli -
2016 Oral: Matrix Completion has No Spurious Local Minimum »
Rong Ge · Jason Lee · Tengyu Ma -
2016 Poster: Matrix Completion has No Spurious Local Minimum »
Rong Ge · Jason Lee · Tengyu Ma -
2015 Poster: Convergence Rates of Active Learning for Maximum Likelihood Estimation »
Kamalika Chaudhuri · Sham Kakade · Praneeth Netrapalli · Sujay Sanghavi -
2015 Poster: Evaluating the statistical significance of biclusters »
Jason D Lee · Yuekai Sun · Jonathan E Taylor -
2015 Poster: Super-Resolution Off the Grid »
Qingqing Huang · Sham Kakade -
2015 Spotlight: Super-Resolution Off the Grid »
Qingqing Huang · Sham Kakade -
2014 Poster: Scalable Methods for Nonnegative Matrix Factorizations of Near-separable Tall-and-skinny Matrices »
Austin Benson · Jason D Lee · Bartek Rajwa · David F Gleich -
2014 Spotlight: Scalable Methods for Nonnegative Matrix Factorizations of Near-separable Tall-and-skinny Matrices »
Austin Benson · Jason D Lee · Bartek Rajwa · David F Gleich -
2014 Poster: Exact Post Model Selection Inference for Marginal Screening »
Jason D Lee · Jonathan E Taylor -
2013 Poster: On model selection consistency of penalized M-estimators: a geometric theory »
Jason D Lee · Yuekai Sun · Jonathan E Taylor -
2013 Poster: Using multiple samples to learn mixture models »
Jason D Lee · Ran Gilad-Bachrach · Rich Caruana -
2013 Spotlight: Using multiple samples to learn mixture models »
Jason D Lee · Ran Gilad-Bachrach · Rich Caruana -
2013 Poster: When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity »
Anima Anandkumar · Daniel Hsu · Majid Janzamin · Sham M Kakade -
2012 Poster: Proximal Newton-type Methods for Minimizing Convex Objective Functions in Composite Form »
Jason D Lee · Yuekai Sun · Michael Saunders -
2012 Poster: Learning Mixtures of Tree Graphical Models »
Anima Anandkumar · Daniel Hsu · Furong Huang · Sham M Kakade -
2012 Poster: A Spectral Algorithm for Latent Dirichlet Allocation »
Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu -
2012 Poster: Identifiability and Unmixing of Latent Parse Trees »
Percy Liang · Sham M Kakade · Daniel Hsu -
2012 Spotlight: A Spectral Algorithm for Latent Dirichlet Allocation »
Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu -
2011 Poster: Stochastic convex optimization with bandit feedback »
Alekh Agarwal · Dean P Foster · Daniel Hsu · Sham M Kakade · Sasha Rakhlin -
2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure »
Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang -
2011 Poster: Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression »
Sham M Kakade · Adam Kalai · Varun Kanade · Ohad Shamir -
2010 Spotlight: Learning from Logged Implicit Exploration Data »
Alex Strehl · Lihong Li · John Langford · Sham M Kakade -
2010 Poster: Learning from Logged Implicit Exploration Data »
Alexander L Strehl · John Langford · Lihong Li · Sham M Kakade -
2010 Poster: Practical Large-Scale Optimization for Max-norm Regularization »
Jason D Lee · Benjamin Recht · Russ Salakhutdinov · Nati Srebro · Joel A Tropp -
2009 Poster: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2009 Oral: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2008 Poster: Mind the Duality Gap: Logarithmic regret algorithms for online optimization »
Shai Shalev-Shwartz · Sham M Kakade -
2008 Poster: On the Generalization Ability of Online Strongly Convex Programming Algorithms »
Sham M Kakade · Ambuj Tewari -
2008 Spotlight: On the Generalization Ability of Online Strongly Convex Programming Algorithms »
Sham M Kakade · Ambuj Tewari -
2008 Spotlight: Mind the Duality Gap: Logarithmic regret algorithms for online optimization »
Shai Shalev-Shwartz · Sham M Kakade -
2008 Poster: On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization »
Sham M Kakade · Karthik Sridharan · Ambuj Tewari -
2007 Oral: The Price of Bandit Information for Online Optimization »
Varsha Dani · Thomas P Hayes · Sham M Kakade -
2007 Poster: The Price of Bandit Information for Online Optimization »
Varsha Dani · Thomas P Hayes · Sham M Kakade