
How to Characterize The Landscape of Overparameterized Convolutional Neural Networks
Yihong Gu · Weizhong Zhang · Cong Fang · Jason Lee · Tong Zhang

Mon Dec 07 09:00 PM -- 11:00 PM (PST) @ Poster Session 0 #135

For many initialization schemes, the parameters of two randomly initialized deep neural networks (DNNs) can be quite different, yet the feature distributions of the hidden nodes are similar at each layer. With the help of a new technique called "neural network grafting", we demonstrate that the feature distributions of differently initialized networks remain similar at each layer throughout the entire training process. In this paper, we present an explanation of this phenomenon. Specifically, we consider the loss landscape of an overparameterized convolutional neural network (CNN) in the continuous limit, where the numbers of channels/hidden nodes in the hidden layers go to infinity. Although the landscape of the overparameterized CNN is still non-convex with respect to the trainable parameters, we show that, surprisingly, it can be reformulated as a convex function with respect to the feature distributions in the hidden layers. Therefore, by reparameterizing neural networks in terms of feature distributions, we obtain a much simpler characterization of the landscape of overparameterized CNNs. We further argue that training with respect to the network parameters leads to a fixed trajectory in the feature distributions.
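The observation that differently initialized networks produce similar per-layer feature distributions can be probed with a simple similarity statistic. The sketch below compares hidden-layer features of two independently initialized wide ReLU networks using linear Centered Kernel Alignment (CKA); this is an illustrative check, not the paper's grafting technique, and the network widths, depth, and CKA metric are assumptions for the example.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward_features(x, weights):
    """Return the activations of every hidden layer for input batch x."""
    feats, h = [], x
    for W in weights:
        h = relu(h @ W)
        feats.append(h)
    return feats

def linear_cka(X, Y):
    """Linear CKA between two feature matrices of shape (n_samples, n_features).

    CKA compares the Gram matrices X X^T and Y Y^T, so it is invariant to
    orthogonal transformations and isotropic scaling of the features --
    exactly the kind of comparison that is meaningful when two networks
    have unrelated parameters but possibly similar feature distributions.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    num = np.linalg.norm(Xc.T @ Yc, "fro") ** 2
    den = np.linalg.norm(Xc.T @ Xc, "fro") * np.linalg.norm(Yc.T @ Yc, "fro")
    return num / den

rng = np.random.default_rng(0)
d, widths, n = 32, [512, 512, 512], 256        # illustrative sizes
x = rng.standard_normal((n, d))                 # shared input batch

def init_weights(rng):
    # He-style scaling so activations stay at a stable magnitude with depth.
    dims = [d] + widths
    return [rng.standard_normal((a, b)) / np.sqrt(a)
            for a, b in zip(dims[:-1], dims[1:])]

feats_a = forward_features(x, init_weights(rng))  # network A
feats_b = forward_features(x, init_weights(rng))  # network B, fresh init

for layer, (fa, fb) in enumerate(zip(feats_a, feats_b), start=1):
    print(f"layer {layer}: CKA = {linear_cka(fa, fb):.3f}")
```

At large width the Gram matrices of both networks concentrate around the same limiting kernel, so the per-layer CKA values come out high even though the two weight sets are completely unrelated, matching the feature-distribution similarity described above.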

Author Information

Yihong Gu (Princeton University)
Weizhong Zhang (HKUST)
Cong Fang (Peking University)
Jason Lee (Princeton University)
Tong Zhang (The Hong Kong University of Science and Technology)
