Timezone: »
Neural networks have many successful applications, while much less theoretical understanding has been gained. Towards bridging this gap, we study the problem of learning a two-layer overparameterized ReLU neural network for multi-class classification via stochastic gradient descent (SGD) from random initialization. In the overparameterized setting, when the data comes from mixtures of well-separated distributions, we prove that SGD learns a network with a small generalization error, albeit the network has enough capacity to fit arbitrary labels. Furthermore, the analysis provides interesting insights into several aspects of learning neural networks and can be verified based on empirical studies on synthetic data and on the MNIST dataset.
Author Information
Yuanzhi Li (Princeton)
Yingyu Liang (University of Wisconsin Madison)
Related Events (a corresponding poster, oral, or spotlight)
-
2018 Spotlight: Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data »
Tue. Dec 4th 03:20 -- 03:25 PM Room Room 220 CD
More from the Same Authors
-
2022 : Domain Generalization with Nuclear Norm Regularization »
Zhenmei Shi · Yifei Ming · Ying Fan · Frederic Sala · Yingyu Liang -
2022 : Best of Both Worlds: Towards Adversarial Robustness with Transduction and Rejection »
Nils Palumbo · Yang Guo · Xi Wu · Jiefeng Chen · Yingyu Liang · Somesh Jha -
2020 Poster: Functional Regularization for Representation Learning: A Unified Theoretical Perspective »
Siddhant Garg · Yingyu Liang -
2019 Poster: N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules »
Shengchao Liu · Mehmet Demirel · Yingyu Liang -
2019 Spotlight: N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules »
Shengchao Liu · Mehmet Demirel · Yingyu Liang -
2019 Poster: On the Convergence Rate of Training Recurrent Neural Networks »
Zeyuan Allen-Zhu · Yuanzhi Li · Zhao Song -
2019 Poster: Robust Attribution Regularization »
Jiefeng Chen · Xi Wu · Vaibhav Rastogi · Yingyu Liang · Somesh Jha -
2019 Poster: What Can ResNet Learn Efficiently, Going Beyond Kernels? »
Zeyuan Allen-Zhu · Yuanzhi Li -
2019 Poster: Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers »
Zeyuan Allen-Zhu · Yuanzhi Li · Yingyu Liang -
2019 Poster: Complexity of Highly Parallel Non-Smooth Convex Optimization »
Sebastien Bubeck · Qijia Jiang · Yin-Tat Lee · Yuanzhi Li · Aaron Sidford -
2019 Spotlight: Complexity of Highly Parallel Non-Smooth Convex Optimization »
Sebastien Bubeck · Qijia Jiang · Yin-Tat Lee · Yuanzhi Li · Aaron Sidford -
2019 Poster: Can SGD Learn Recurrent Neural Networks with Provable Generalization? »
Zeyuan Allen-Zhu · Yuanzhi Li -
2019 Poster: Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks »
Yuanzhi Li · Colin Wei · Tengyu Ma -
2019 Spotlight: Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks »
Yuanzhi Li · Colin Wei · Tengyu Ma -
2018 Poster: Online Improper Learning with an Approximation Oracle »
Elad Hazan · Wei Hu · Yuanzhi Li · Zhiyuan Li -
2018 Poster: NEON2: Finding Local Minima via First-Order Oracles »
Zeyuan Allen-Zhu · Yuanzhi Li