Normalization techniques play an important role in supporting efficient and often more effective training of deep neural networks. While conventional methods explicitly normalize the activations, we propose adding a loss term instead. This new loss term encourages the variance of the activations to remain stable rather than vary from one random mini-batch to the next. As we prove, this encourages the activations to be distributed around a few distinct modes. We also show that if the inputs come from a mixture of two Gaussians, the new loss either joins the two together or separates them optimally in the LDA sense, depending on the prior probabilities. Finally, we link the new regularization term to the batchnorm method, which provides it with a regularization perspective. Our experiments demonstrate an improvement in accuracy over the batchnorm technique for both CNNs and fully connected networks.
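The abstract describes the loss term only at a high level. As a rough illustration of the idea of penalizing mini-batch-to-mini-batch instability of the activations' sample variance, here is a minimal PyTorch-style sketch. The function name, the two-half variance estimator, and the loss weight are assumptions made for illustration, not the paper's exact formulation.

    import torch

    def variance_stability_loss(activations: torch.Tensor) -> torch.Tensor:
        """Penalize instability of the per-unit sample variance of activations.

        Illustrative proxy (an assumption, not the paper's exact loss): split
        the mini-batch of shape (batch, features) into two random halves,
        compute each half's per-unit sample variance, and penalize their
        squared difference. Assumes batch size >= 4.
        """
        n = activations.shape[0]
        perm = torch.randperm(n, device=activations.device)
        half_a = activations[perm[: n // 2]]
        half_b = activations[perm[n // 2 :]]
        var_a = half_a.var(dim=0, unbiased=True)  # per-unit variance, half A
        var_b = half_b.var(dim=0, unbiased=True)  # per-unit variance, half B
        return ((var_a - var_b) ** 2).mean()

    # Usage sketch: add the term to the task loss with a small weight
    # (the 0.01 coefficient is a hypothetical value).
    # loss = task_loss + 0.01 * variance_stability_loss(hidden_activations)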
Author Information
Etai Littwin (Apple)
Lior Wolf (Facebook AI Research)
More from the Same Authors
- 2022: The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon
  Vimal Thilak · Etai Littwin · Shuangfei Zhai · Omid Saremi · Roni Paiss · Joshua Susskind
- 2022 Poster: What is Where by Looking: Weakly-Supervised Open-World Phrase-Grounding without Text Inputs
  Tal Shaharabany · Yoad Tewel · Lior Wolf
- 2022 Poster: Error Correction Code Transformer
  Yoni Choukroun · Lior Wolf
- 2022 Poster: Optimizing Relevance Maps of Vision Transformers Improves Robustness
  Hila Chefer · Idan Schwartz · Lior Wolf
- 2021 Poster: Meta Internal Learning
  Raphael Bensadoun · Shir Gur · Tomer Galanti · Lior Wolf
- 2020 Poster: Generating Correct Answers for Progressive Matrices Intelligence Tests
  Niv Pekar · Yaniv Benny · Lior Wolf
- 2020 Poster: Hierarchical Patch VAE-GAN: Generating Diverse Videos from a Single Sample
  Shir Gur · Sagie Benaim · Lior Wolf
- 2020 Poster: On the Modularity of Hypernetworks
  Tomer Galanti · Lior Wolf
- 2020 Poster: On Infinite-Width Hypernetworks
  Etai Littwin · Tomer Galanti · Lior Wolf · Greg Yang
- 2020 Oral: On the Modularity of Hypernetworks
  Tomer Galanti · Lior Wolf
- 2019 Poster: Hyper-Graph-Network Decoders for Block Codes
  Eliya Nachmani · Lior Wolf
- 2018 Poster: Automatic Program Synthesis of Long Programs with a Learned Garbage Collector
  Amit Zohar · Lior Wolf
- 2018 Poster: One-Shot Unsupervised Cross Domain Translation
  Sagie Benaim · Lior Wolf
- 2017 Poster: One-Sided Unsupervised Domain Mapping
  Sagie Benaim · Lior Wolf
- 2017 Spotlight: One-Sided Unsupervised Domain Mapping
  Sagie Benaim · Lior Wolf