Poster
Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data
Dachao Lin · Ruoyu Sun · Zhihua Zhang
In this paper, we study gradient methods for training deep linear neural networks with binary cross-entropy loss. In particular, we show global directional convergence guarantees from a polynomial rate to a linear rate for (deep) linear networks with a spherically symmetric data distribution, which can be viewed as a specific zero-margin dataset. Our results do not require the assumptions made in other works, such as small initial loss, presumed convergence of the weight direction, or overparameterization. We also characterize our findings in experiments.
Author Information
Dachao Lin (Peking University)
Ruoyu Sun (University of Illinois at Urbana-Champaign)
Zhihua Zhang (Shanghai Jiao Tong University)
More from the Same Authors
- 2023 Poster: PAC-Bayesian Spectrally-Normalized Bounds for Adversarially Robust Generalization
  Jiancong Xiao · Ruoyu Sun · Zhi-Quan Luo
- 2023 Poster: Stochastic Distributed Optimization under Average Second-order Similarity: Algorithms and Analysis
  Dachao Lin · Yuze Han · Haishan Ye · Zhihua Zhang
- 2023 Poster: Balanced Training for Sparse GANs
  Yite Wang · Jing Wu · NAIRA HOVAKIMYAN · Ruoyu Sun
- 2022 Spotlight: Stability Analysis and Generalization Bounds of Adversarial Training
  Jiancong Xiao · Yanbo Fan · Ruoyu Sun · Jue Wang · Zhi-Quan Luo
- 2022 Spotlight: Adam Can Converge Without Any Modification On Update Rules
  Yushun Zhang · Congliang Chen · Naichen Shi · Ruoyu Sun · Zhi-Quan Luo
- 2022 Spotlight: Lightning Talks 6B-1
  Yushun Zhang · Duc Nguyen · Jiancong Xiao · Wei Jiang · Yaohua Wang · Yilun Xu · Zhen LI · Anderson Ye Zhang · Ziming Liu · Fangyi Zhang · Gilles Stoltz · Congliang Chen · Gang Li · Yanbo Fan · Ruoyu Sun · Naichen Shi · Yibo Wang · Ming Lin · Max Tegmark · Lijun Zhang · Jue Wang · Ruoyu Sun · Tommi Jaakkola · Senzhang Wang · Zhi-Quan Luo · Xiuyu Sun · Zhi-Quan Luo · Tianbao Yang · Rong Jin
- 2022 Spotlight: Lightning Talks 4A-3
  Zhihan Gao · Yabin Wang · Xingyu Qu · Luziwei Leng · Mingqing Xiao · Bohan Wang · Yu Shen · Zhiwu Huang · Xingjian Shi · Qi Meng · Yupeng Lu · Diyang Li · Qingyan Meng · Kaiwei Che · Yang Li · Hao Wang · Huishuai Zhang · Zongpeng Zhang · Kaixuan Zhang · Xiaopeng Hong · Xiaohan Zhao · Di He · Jianguo Zhang · Yaofeng Tu · Bin Gu · Yi Zhu · Ruoyu Sun · Yuyang (Bernie) Wang · Zhouchen Lin · Qinghu Meng · Wei Chen · Wentao Zhang · Bin CUI · Jie Cheng · Zhi-Ming Ma · Mu Li · Qinghai Guo · Dit-Yan Yeung · Tie-Yan Liu · Jianxing Liao
- 2022 Spotlight: Does Momentum Change the Implicit Regularization on Separable Data?
  Bohan Wang · Qi Meng · Huishuai Zhang · Ruoyu Sun · Wei Chen · Zhi-Ming Ma · Tie-Yan Liu
- 2022 Poster: Adam Can Converge Without Any Modification On Update Rules
  Yushun Zhang · Congliang Chen · Naichen Shi · Ruoyu Sun · Zhi-Quan Luo
- 2022 Poster: Does Momentum Change the Implicit Regularization on Separable Data?
  Bohan Wang · Qi Meng · Huishuai Zhang · Ruoyu Sun · Wei Chen · Zhi-Ming Ma · Tie-Yan Liu
- 2022 Poster: Stability Analysis and Generalization Bounds of Adversarial Training
  Jiancong Xiao · Yanbo Fan · Ruoyu Sun · Jue Wang · Zhi-Quan Luo
- 2022 Poster: DigGAN: Discriminator gradIent Gap Regularization for GAN Training with Limited Data
  Tiantian Fang · Ruoyu Sun · Alex Schwing
- 2021 Poster: Greedy and Random Quasi-Newton Methods with Faster Explicit Superlinear Convergence
  Dachao Lin · Haishan Ye · Zhihua Zhang
- 2021 Poster: When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
  Jiawei Zhang · Yushun Zhang · Mingyi Hong · Ruoyu Sun · Zhi-Quan Luo
- 2020 Poster: Towards a Better Global Loss Landscape of GANs
  Ruoyu Sun · Tiantian Fang · Alex Schwing
- 2020 Oral: Towards a Better Global Loss Landscape of GANs
  Ruoyu Sun · Tiantian Fang · Alex Schwing
- 2020 Poster: A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems
  Jiawei Zhang · Peijun Xiao · Ruoyu Sun · Zhiquan Luo
- 2018 Poster: Adding One Neuron Can Eliminate All Bad Local Minima
  SHIYU LIANG · Ruoyu Sun · Jason Lee · R. Srikant
- 2014 Poster: Distributed Power-law Graph Computing: Theoretical and Empirical Analysis
  Cong Xie · Ling Yan · Wu-Jun Li · Zhihua Zhang
- 2012 Poster: Nonconvex Penalization, Levy Processes and Concave Conjugates
  Zhihua Zhang · Bojun Tu
- 2012 Poster: A Scalable CUR Matrix Decomposition Algorithm: Lower Time Complexity and Tighter Bound
  Shusen Wang · Zhihua Zhang
- 2009 Poster: Probabilistic Relational PCA
  Wu-Jun Li · Dit-Yan Yeung · Zhihua Zhang
- 2009 Spotlight: Probabilistic Relational PCA
  Wu-Jun Li · Dit-Yan Yeung · Zhihua Zhang
- 2009 Poster: Optimal Scoring for Unsupervised Learning
  Zhihua Zhang · guang dai
- 2008 Poster: Posterior Consistency of the Silverman g-prior in Bayesian Model Choice
  Zhihua Zhang · Michael Jordan · Dit-Yan Yeung
- 2008 Spotlight: Posterior Consistency of the Silverman g-prior in Bayesian Model Choice
  Zhihua Zhang · Michael Jordan · Dit-Yan Yeung