NeurIPS 2020 : Is normalization indispensable for training deep neural network?



Is normalization indispensable for training deep neural network?

Jie Shao, Kai Hu, Changhu Wang, Xiangyang Xue, Bhiksha Raj

Oral presentation: Orals & Spotlights Track 34: Deep Learning
on 2020-12-10T18:00:00-08:00 - 2020-12-10T18:15:00-08:00

Poster Session 7
on 2020-12-10T21:00:00-08:00 - 2020-12-10T23:00:00-08:00

Abstract: Normalization operations are widely used to train deep neural networks, and they can improve both convergence and generalization in most tasks. Theories explaining normalization's effectiveness, along with new forms of normalization, have long been active research topics. To better understand normalization, one can ask whether it is indispensable for training deep neural networks. In this paper, we study what happens when normalization layers are removed from a network, and show how to train deep neural networks without normalization layers and without performance degradation. Our proposed method achieves the same or even slightly better performance on a variety of tasks: image classification on ImageNet, object detection and segmentation on MS-COCO, video classification on Kinetics, and machine translation on WMT English-German. Our study may help to better understand the role of normalization layers, and the proposed method can be a competitive alternative to them. Code is available.
