Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data
Dachao Lin · Ruoyu Sun · Zhihua Zhang

Wed Dec 08 12:30 AM -- 02:00 AM (PST)

In this paper, we study gradient methods for training deep linear neural networks with binary cross-entropy loss. In particular, we show global directional convergence guarantees, with rates ranging from polynomial to linear, for (deep) linear networks under a spherically symmetric data distribution, which can be viewed as a specific zero-margin dataset. Our results do not require the assumptions made in other works, such as small initial loss, presumed convergence of the weight direction, or overparameterization. We also illustrate our findings with experiments.
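
As a rough illustration of the setting described in the abstract, the following minimal sketch runs plain gradient descent on a two-layer linear network f(x) = a * <u, x> with logistic (binary cross-entropy) loss and tracks the cosine between the effective weight a * u and a reference direction. It is not the authors' code or experimental setup. The function name `train_two_layer_linear`, the Gaussian inputs, the labels y = sign(x_1), the initialization, and all hyperparameters are assumptions chosen only for illustration.

```python
import math
import random


def train_two_layer_linear(n=500, d=3, steps=300, lr=0.1, seed=0):
    """Gradient descent on f(x) = a * <u, x> with logistic loss.

    Returns the cosine between the effective weight w = a * u and the
    reference direction e_1, recorded before each step and after the last.
    """
    rng = random.Random(seed)
    # Standard Gaussian inputs are one example of a spherically symmetric law.
    xs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    # ASSUMPTION (not stated in the abstract): labels come from e_1.
    ys = [1.0 if x[0] > 0 else -1.0 for x in xs]

    # Hypothetical initialization: effective weight orthogonal to e_1.
    a = 1.0
    u = [0.0] * d
    u[1] = 0.1

    def cosine_to_e1():
        w = [a * ui for ui in u]
        return w[0] / math.sqrt(sum(wi * wi for wi in w))

    cosines = []
    for _ in range(steps):
        cosines.append(cosine_to_e1())
        w = [a * ui for ui in u]
        # Gradient of the mean logistic loss with respect to w.
        g = [0.0] * d
        for x, y in zip(xs, ys):
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            coeff = -y / (1.0 + math.exp(margin)) / n
            for j in range(d):
                g[j] += coeff * x[j]
        # Chain rule through w = a * u, then a simultaneous update.
        grad_a = sum(ui * gi for ui, gi in zip(u, g))
        u = [ui - lr * a * gi for ui, gi in zip(u, g)]
        a -= lr * grad_a
    cosines.append(cosine_to_e1())
    return cosines


if __name__ == "__main__":
    history = train_two_layer_linear()
    print("initial cosine:", history[0])
    print("final cosine:", history[-1])
```

Under these assumptions, the recorded cosine starts at zero and moves toward one as training proceeds. The sketch only shows what "directional convergence" refers to; it says nothing about the rates established in the paper.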

Author Information

Dachao Lin (Peking University)
Ruoyu Sun (University of Illinois at Urbana-Champaign)
Zhihua Zhang (Shanghai Jiao Tong University)

More from the Same Authors