Poster
Adding One Neuron Can Eliminate All Bad Local Minima
SHIYU LIANG · Ruoyu Sun · Jason Lee · R. Srikant
One of the main difficulties in analyzing neural networks is the non-convexity of the loss function, which may have many bad local minima. In this paper, we study the landscape of neural networks for binary classification tasks. Under mild assumptions, we prove that after adding one special neuron with a skip connection to the output, or one special neuron per layer, every local minimum is a global minimum.
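To make the architectural change concrete, here is a minimal sketch of one extra neuron wired directly to the output of a binary classifier. The abstract does not say what form the special neuron takes. As assumptions for illustration only, the sketch uses an exponential neuron `a * exp(w·x + b)`, a quadratic penalty on its output weight `a`, a one-hidden-layer ReLU base network, and a squared-hinge loss. All function and parameter names are hypothetical.

```python
import numpy as np

def base_network(x, W1, w2):
    """Hypothetical base model: one hidden ReLU layer with a scalar output per sample."""
    return np.maximum(x @ W1, 0.0) @ w2

def augmented_network(x, W1, w2, a, w, b):
    """Base output plus one extra neuron added straight to the output (skip connection).

    The exponential form of the extra neuron is an assumption, not taken from the abstract.
    """
    return base_network(x, W1, w2) + a * np.exp(x @ w + b)

def regularized_loss(x, y, W1, w2, a, w, b, lam=0.1):
    """Squared-hinge loss on labels y in {-1, +1}, plus an assumed penalty lam/2 * a^2."""
    margins = y * augmented_network(x, W1, w2, a, w, b)
    hinge_sq = np.maximum(1.0 - margins, 0.0) ** 2
    return hinge_sq.mean() + 0.5 * lam * a ** 2
```

With `a = 0` the augmented model is the base network itself, so in this sketch the extra neuron changes the loss landscape without changing what the original network can represent.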
Author Information
SHIYU LIANG (UIUC)
Ruoyu Sun (University of Illinois at Urbana-Champaign)
Jason Lee (University of Southern California)
R. Srikant (University of Illinois at Urbana-Champaign)
More from the Same Authors
-
2022 : Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability »
Alex Damian · Eshaan Nichani · Jason Lee -
2023 Poster: PAC-Bayesian Spectrally-Normalized Bounds for Adversarially Robust Generalization »
Jiancong Xiao · Ruoyu Sun · Zhi-Quan Luo -
2023 Poster: Balanced Training for Sparse GANs »
Yite Wang · Jing Wu · NAIRA HOVAKIMYAN · Ruoyu Sun -
2023 Poster: Sample Complexity for Quadratic Bandits: Hessian Dependent Bounds and Optimal Algorithms »
Qian Yu · Yining Wang · Baihe Huang · Qi Lei · Jason Lee -
2023 Poster: Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage »
Masatoshi Uehara · Nathan Kallus · Jason Lee · Wen Sun -
2023 Poster: Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks »
Eshaan Nichani · Alex Damian · Jason Lee -
2023 Poster: Fine-Tuning Language Models with Just Forward Passes »
Sadhika Malladi · Tianyu Gao · Eshaan Nichani · Alex Damian · Jason Lee · Danqi Chen · Sanjeev Arora -
2023 Poster: Performance Bounds for Policy-Based Average Reward Reinforcement Learning Algorithms »
Yashaswini Murthy · Mehrdad Moharrami · R. Srikant -
2023 Poster: Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning »
Gen Li · Wenhao Zhan · Jason Lee · Yuejie Chi · Yuxin Chen -
2023 Poster: Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability »
Jingfeng Wu · Vladimir Braverman · Jason Lee -
2023 Poster: Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models »
Alex Damian · Eshaan Nichani · Rong Ge · Jason Lee -
2023 Oral: Fine-Tuning Language Models with Just Forward Passes »
Sadhika Malladi · Tianyu Gao · Eshaan Nichani · Alex Damian · Jason Lee · Danqi Chen · Sanjeev Arora -
2023 Oral: Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models »
Alex Damian · Eshaan Nichani · Rong Ge · Jason Lee -
2022 Spotlight: Stability Analysis and Generalization Bounds of Adversarial Training »
Jiancong Xiao · Yanbo Fan · Ruoyu Sun · Jue Wang · Zhi-Quan Luo -
2022 Spotlight: Adam Can Converge Without Any Modification On Update Rules »
Yushun Zhang · Congliang Chen · Naichen Shi · Ruoyu Sun · Zhi-Quan Luo -
2022 Spotlight: Lightning Talks 6B-1 »
Yushun Zhang · Duc Nguyen · Jiancong Xiao · Wei Jiang · Yaohua Wang · Yilun Xu · Zhen LI · Anderson Ye Zhang · Ziming Liu · Fangyi Zhang · Gilles Stoltz · Congliang Chen · Gang Li · Yanbo Fan · Ruoyu Sun · Naichen Shi · Yibo Wang · Ming Lin · Max Tegmark · Lijun Zhang · Jue Wang · Ruoyu Sun · Tommi Jaakkola · Senzhang Wang · Zhi-Quan Luo · Xiuyu Sun · Zhi-Quan Luo · Tianbao Yang · Rong Jin -
2022 Spotlight: Lightning Talks 4A-3 »
Zhihan Gao · Yabin Wang · Xingyu Qu · Luziwei Leng · Mingqing Xiao · Bohan Wang · Yu Shen · Zhiwu Huang · Xingjian Shi · Qi Meng · Yupeng Lu · Diyang Li · Qingyan Meng · Kaiwei Che · Yang Li · Hao Wang · Huishuai Zhang · Zongpeng Zhang · Kaixuan Zhang · Xiaopeng Hong · Xiaohan Zhao · Di He · Jianguo Zhang · Yaofeng Tu · Bin Gu · Yi Zhu · Ruoyu Sun · Yuyang (Bernie) Wang · Zhouchen Lin · Qinghu Meng · Wei Chen · Wentao Zhang · Bin CUI · Jie Cheng · Zhi-Ming Ma · Mu Li · Qinghai Guo · Dit-Yan Yeung · Tie-Yan Liu · Jianxing Liao -
2022 Spotlight: Does Momentum Change the Implicit Regularization on Separable Data? »
Bohan Wang · Qi Meng · Huishuai Zhang · Ruoyu Sun · Wei Chen · Zhi-Ming Ma · Tie-Yan Liu -
2022 Poster: Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials »
Eshaan Nichani · Yu Bai · Jason Lee -
2022 Poster: Adam Can Converge Without Any Modification On Update Rules »
Yushun Zhang · Congliang Chen · Naichen Shi · Ruoyu Sun · Zhi-Quan Luo -
2022 Poster: Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems »
Masatoshi Uehara · Ayush Sekhari · Jason Lee · Nathan Kallus · Wen Sun -
2022 Poster: Minimax Regret for Cascading Bandits »
Daniel Vial · Sujay Sanghavi · Sanjay Shakkottai · R. Srikant -
2022 Poster: Does Momentum Change the Implicit Regularization on Separable Data? »
Bohan Wang · Qi Meng · Huishuai Zhang · Ruoyu Sun · Wei Chen · Zhi-Ming Ma · Tie-Yan Liu -
2022 Poster: Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent »
Zhiyuan Li · Tianhao Wang · Jason Lee · Sanjeev Arora -
2022 Poster: Stability Analysis and Generalization Bounds of Adversarial Training »
Jiancong Xiao · Yanbo Fan · Ruoyu Sun · Jue Wang · Zhi-Quan Luo -
2022 Poster: DigGAN: Discriminator gradIent Gap Regularization for GAN Training with Limited Data »
Tiantian Fang · Ruoyu Sun · Alex Schwing -
2022 Poster: On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias »
Itay Safran · Gal Vardi · Jason Lee -
2022 Poster: From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent »
Christopher De Sa · Satyen Kale · Jason Lee · Ayush Sekhari · Karthik Sridharan -
2021 Poster: How Fine-Tuning Allows for Effective Meta-Learning »
Kurtland Chua · Qi Lei · Jason Lee -
2021 Poster: Label Noise SGD Provably Prefers Flat Global Minimizers »
Alex Damian · Tengyu Ma · Jason Lee -
2021 Poster: Going Beyond Linear RL: Sample Efficient Neural Function Approximation »
Baihe Huang · Kaixuan Huang · Sham Kakade · Jason Lee · Qi Lei · Runzhe Wang · Jiaqi Yang -
2021 Poster: Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data »
Dachao Lin · Ruoyu Sun · Zhihua Zhang -
2021 Poster: When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work »
Jiawei Zhang · Yushun Zhang · Mingyi Hong · Ruoyu Sun · Zhi-Quan Luo -
2021 Poster: Predicting What You Already Know Helps: Provable Self-Supervised Learning »
Jason Lee · Qi Lei · Nikunj Saunshi · JIACHENG ZHUO -
2021 Poster: Optimal Gradient-based Algorithms for Non-concave Bandit Optimization »
Baihe Huang · Kaixuan Huang · Sham Kakade · Jason Lee · Qi Lei · Runzhe Wang · Jiaqi Yang -
2020 Poster: Towards a Better Global Loss Landscape of GANs »
Ruoyu Sun · Tiantian Fang · Alex Schwing -
2020 Oral: Towards a Better Global Loss Landscape of GANs »
Ruoyu Sun · Tiantian Fang · Alex Schwing -
2020 Poster: A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems »
Jiawei Zhang · Peijun Xiao · Ruoyu Sun · Zhiquan Luo -
2020 Poster: The Mean-Squared Error of Double Q-Learning »
Wentao Weng · Harsh Gupta · Niao He · Lei Ying · R. Srikant -
2019 Poster: Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning »
Harsh Gupta · R. Srikant · Lei Ying -
2018 : Contributed Talk 1 »
Jason Lee -
2018 Poster: Implicit Bias of Gradient Descent on Linear Convolutional Networks »
Suriya Gunasekar · Jason Lee · Daniel Soudry · Nati Srebro -
2018 Poster: Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced »
Simon Du · Wei Hu · Jason Lee -
2018 Poster: Provably Correct Automatic Sub-Differentiation for Qualified Programs »
Sham Kakade · Jason Lee -
2018 Poster: On the Convergence and Robustness of Training GANs with Regularized Optimal Transport »
Maziar Sanjabi · Jimmy Ba · Meisam Razaviyayn · Jason Lee -
2017 Poster: Gradient Descent Can Take Exponential Time to Escape Saddle Points »
Simon Du · Chi Jin · Jason D Lee · Michael Jordan · Aarti Singh · Barnabas Poczos -
2017 Spotlight: Gradient Descent Can Take Exponential Time to Escape Saddle Points »
Simon Du · Chi Jin · Jason D Lee · Michael Jordan · Aarti Singh · Barnabas Poczos -
2016 Oral: Matrix Completion has No Spurious Local Minimum »
Rong Ge · Jason Lee · Tengyu Ma -
2016 Poster: Matrix Completion has No Spurious Local Minimum »
Rong Ge · Jason Lee · Tengyu Ma -
2015 Poster: Algorithms with Logarithmic or Sublinear Regret for Constrained Contextual Bandits »
Huasen Wu · R. Srikant · Xin Liu · Chong Jiang -
2015 Poster: Evaluating the statistical significance of biclusters »
Jason D Lee · Yuekai Sun · Jonathan E Taylor -
2014 Poster: Scalable Methods for Nonnegative Matrix Factorizations of Near-separable Tall-and-skinny Matrices »
Austin Benson · Jason D Lee · Bartek Rajwa · David F Gleich -
2014 Spotlight: Scalable Methods for Nonnegative Matrix Factorizations of Near-separable Tall-and-skinny Matrices »
Austin Benson · Jason D Lee · Bartek Rajwa · David F Gleich -
2014 Poster: Exact Post Model Selection Inference for Marginal Screening »
Jason D Lee · Jonathan E Taylor -
2013 Poster: On model selection consistency of penalized M-estimators: a geometric theory »
Jason D Lee · Yuekai Sun · Jonathan E Taylor -
2013 Poster: Using multiple samples to learn mixture models »
Jason D Lee · Ran Gilad-Bachrach · Rich Caruana -
2013 Spotlight: Using multiple samples to learn mixture models »
Jason D Lee · Ran Gilad-Bachrach · Rich Caruana -
2012 Poster: Proximal Newton-type Methods for Minimizing Convex Objective Functions in Composite Form »
Jason D Lee · Yuekai Sun · Michael Saunders -
2010 Poster: Practical Large-Scale Optimization for Max-norm Regularization »
Jason D Lee · Benjamin Recht · Russ Salakhutdinov · Nati Srebro · Joel A Tropp