Poster

Exact Solutions of a Deep Linear Network

Liu Ziyin · Botao Li · Xiangming Meng

Hall J #515

Keywords: [ Exact Solution ] [ Collapse ] [ deep linear network ]

Thu 1 Dec 9 a.m. PST — 11 a.m. PST

Abstract: This work finds the analytical expression of the global minima of a deep linear network with weight decay and stochastic neurons, a fundamental model for understanding the landscape of neural networks. Our result implies that zero is a special point in deep neural network architecture. We show that weight decay strongly interacts with the model architecture and can create bad minima at zero in a network with more than $1$ hidden layer, qualitatively different from a network with only $1$ hidden layer. Practically, our result implies that common deep learning initialization methods are insufficient to ease the optimization of neural networks in general.
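The contrast between one and more than one hidden layer can be illustrated with a toy calculation. The sketch below is not the paper's model: it assumes a scalar network whose prediction is the product of its weights times the input, it omits the stochastic neurons, and the values of x, y, and lam are arbitrary illustrative choices. It estimates the Hessian of the weight-decayed squared loss at zero by finite differences. With two weights (one hidden layer) the data term contributes an off-diagonal curvature of -2xy, so zero is a saddle whenever lam < |xy|. With three weights (two hidden layers) the data term has no second-order contribution at zero, so the Hessian reduces to 2 * lam * I and zero is a local minimum for any positive weight decay.

```python
import numpy as np

def loss(w, x=1.0, y=1.0, lam=0.1):
    # Toy scalar deep linear network (an assumption for illustration):
    # the prediction is the product of all weights times the input x.
    pred = np.prod(w) * x
    # Squared error plus L2 weight decay with strength lam.
    return (pred - y) ** 2 + lam * np.sum(w ** 2)

def hessian_at_zero(n_weights, eps=1e-3, **kw):
    # Central finite-difference estimate of the Hessian of `loss` at w = 0.
    w0 = np.zeros(n_weights)
    H = np.zeros((n_weights, n_weights))
    for i in range(n_weights):
        for j in range(n_weights):
            ei = np.zeros(n_weights)
            ei[i] = eps
            ej = np.zeros(n_weights)
            ej[j] = eps
            H[i, j] = (
                loss(w0 + ei + ej, **kw)
                - loss(w0 + ei - ej, **kw)
                - loss(w0 - ei + ej, **kw)
                + loss(w0 - ei - ej, **kw)
            ) / (4 * eps ** 2)
    return H

# Two weights = one hidden layer; three weights = two hidden layers.
eigs_one_hidden = np.linalg.eigvalsh(hessian_at_zero(2))
eigs_two_hidden = np.linalg.eigvalsh(hessian_at_zero(3))

print("one hidden layer, Hessian eigenvalues at zero:", eigs_one_hidden)
print("two hidden layers, Hessian eigenvalues at zero:", eigs_two_hidden)
```

A negative eigenvalue in the first case means gradient descent can escape zero along that direction. In the second case all eigenvalues are positive, so zero is a local minimum even when a nonzero solution with lower loss exists.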
