Poster
Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data
Dachao Lin · Ruoyu Sun · Zhihua Zhang
In this paper, we study gradient methods for training deep linear neural networks with binary cross-entropy loss. In particular, we show global directional convergence guarantees, with rates ranging from polynomial to linear, for (deep) linear networks under a spherically symmetric data distribution, which can be viewed as a specific zero-margin dataset. Our results do not require the assumptions made in other works, such as small initial loss, presumed convergence of the weight direction, or overparameterization. We also illustrate our findings with experiments.
Author Information
Dachao Lin (Peking University)
Ruoyu Sun (University of Illinois at Urbana-Champaign)
Zhihua Zhang (Shanghai Jiao Tong University)
More from the Same Authors
- 2021 Poster: Greedy and Random Quasi-Newton Methods with Faster Explicit Superlinear Convergence
  Dachao Lin · Haishan Ye · Zhihua Zhang
- 2021 Poster: When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
  Jiawei Zhang · Yushun Zhang · Mingyi Hong · Ruoyu Sun · Zhi-Quan Luo
- 2020 Poster: Towards a Better Global Loss Landscape of GANs
  Ruoyu Sun · Tiantian Fang · Alex Schwing
- 2020 Oral: Towards a Better Global Loss Landscape of GANs
  Ruoyu Sun · Tiantian Fang · Alex Schwing
- 2020 Poster: A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems
  Jiawei Zhang · Peijun Xiao · Ruoyu Sun · Zhiquan Luo
- 2018 Poster: Adding One Neuron Can Eliminate All Bad Local Minima
  Shiyu Liang · Ruoyu Sun · Jason Lee · R. Srikant
- 2014 Poster: Distributed Power-law Graph Computing: Theoretical and Empirical Analysis
  Cong Xie · Ling Yan · Wu-Jun Li · Zhihua Zhang
- 2012 Poster: Nonconvex Penalization, Levy Processes and Concave Conjugates
  Zhihua Zhang · Bojun Tu
- 2012 Poster: A Scalable CUR Matrix Decomposition Algorithm: Lower Time Complexity and Tighter Bound
  Shusen Wang · Zhihua Zhang
- 2009 Poster: Probabilistic Relational PCA
  Wu-Jun Li · Dit-Yan Yeung · Zhihua Zhang
- 2009 Spotlight: Probabilistic Relational PCA
  Wu-Jun Li · Dit-Yan Yeung · Zhihua Zhang
- 2009 Poster: Optimal Scoring for Unsupervised Learning
  Zhihua Zhang · Guang Dai
- 2008 Poster: Posterior Consistency of the Silverman g-prior in Bayesian Model Choice
  Zhihua Zhang · Michael Jordan · Dit-Yan Yeung
- 2008 Spotlight: Posterior Consistency of the Silverman g-prior in Bayesian Model Choice
  Zhihua Zhang · Michael Jordan · Dit-Yan Yeung