Timezone: »
Modern deep neural networks for classification usually jointly learn a backbone for representation and a linear classifier to output the logit of each class. A recent study has shown a phenomenon called neural collapse that the within-class means of features and the classifier vectors converge to the vertices of a simplex equiangular tight frame (ETF) at the terminal phase of training on a balanced dataset. Since the ETF geometric structure maximally separates the pair-wise angles of all classes in the classifier, it is natural to raise the question, why do we spend an effort to learn a classifier when we know its optimal geometric structure? In this paper, we study the potential of learning a neural network for classification with the classifier randomly initialized as an ETF and fixed during training. Our analytical work based on the layer-peeled model indicates that the feature learning with a fixed ETF classifier naturally leads to the neural collapse state even when the dataset is imbalanced among classes. We further show that in this case the cross entropy (CE) loss is not necessary and can be replaced by a simple squared loss that shares the same global optimality but enjoys a better convergence property. Our experimental results show that our method is able to bring significant improvements with faster convergence on multiple imbalanced datasets.
Author Information
Yibo Yang (Looking for research position. Email me!)
https://iboing.github.io/
Shixiang Chen (Texas A&M)
Xiangtai Li (Peking University)
Liang Xie (Zhejiang University)
Zhouchen Lin (Peking University)
Dacheng Tao (University of Technology, Sydney)
More from the Same Authors
-
2021 Spotlight: Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State »
Mingqing Xiao · Qingyan Meng · Zongpeng Zhang · Yisen Wang · Zhouchen Lin -
2021 : AP-10K: A Benchmark for Animal Pose Estimation in the Wild »
Hang Yu · Yufei Xu · Jing Zhang · Wei Zhao · Ziyu Guan · Dacheng Tao -
2022 Poster: Rethinking Knowledge Graph Evaluation Under the Open-World Assumption »
Haotong Yang · Zhouchen Lin · Muhan Zhang -
2022 Poster: Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach »
Peng Mi · Li Shen · Tianhe Ren · Yiyi Zhou · Xiaoshuai Sun · Rongrong Ji · Dacheng Tao -
2022 Poster: ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation »
Yufei Xu · Jing Zhang · Qiming ZHANG · Dacheng Tao -
2022 Poster: APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking »
Yuxiang Yang · Junjie Yang · Yufei Xu · Jing Zhang · Long Lan · Dacheng Tao -
2022 : Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models »
Xingyu Xie · Pan Zhou · Huan Li · Zhouchen Lin · Shuicheng Yan -
2022 Spotlight: Escaping from the Barren Plateau via Gaussian Initializations in Deep Variational Quantum Circuits »
Kaining Zhang · Liu Liu · Min-Hsiu Hsieh · Dacheng Tao -
2022 Spotlight: Lightning Talks 4B-4 »
Ziyue Jiang · Zeeshan Khan · Yuxiang Yang · Chenze Shao · Yichong Leng · Zehao Yu · Wenguan Wang · Xian Liu · Zehua Chen · Yang Feng · Qianyi Wu · James Liang · C.V. Jawahar · Junjie Yang · Zhe Su · Songyou Peng · Yufei Xu · Junliang Guo · Michael Niemeyer · Hang Zhou · Zhou Zhao · Makarand Tapaswi · Dongfang Liu · Qian Yang · Torsten Sattler · Yuanqi Du · Haohe Liu · Jing Zhang · Andreas Geiger · Yi Ren · Long Lan · Jiawei Chen · Wayne Wu · Dahua Lin · Dacheng Tao · Xu Tan · Jinglin Liu · Ziwei Liu · 振辉 叶 · Danilo Mandic · Lei He · Xiangyang Li · Tao Qin · sheng zhao · Tie-Yan Liu -
2022 Spotlight: Lightning Talks 4A-3 »
Zhihan Gao · Yabin Wang · Xingyu Qu · Luziwei Leng · Mingqing Xiao · Bohan Wang · Yu Shen · Zhiwu Huang · Xingjian Shi · Qi Meng · Yupeng Lu · Diyang Li · Qingyan Meng · Kaiwei Che · Yang Li · Hao Wang · Huishuai Zhang · Zongpeng Zhang · Kaixuan Zhang · Xiaopeng Hong · Xiaohan Zhao · Di He · Jianguo Zhang · Yaofeng Tu · Bin Gu · Yi Zhu · Ruoyu Sun · Yuyang (Bernie) Wang · Zhouchen Lin · Qinghu Meng · Wei Chen · Wentao Zhang · Bin CUI · Jie Cheng · Zhi-Ming Ma · Mu Li · Qinghai Guo · Dit-Yan Yeung · Tie-Yan Liu · Jianxing Liao -
2022 Spotlight: Online Training Through Time for Spiking Neural Networks »
Mingqing Xiao · Qingyan Meng · Zongpeng Zhang · Di He · Zhouchen Lin -
2022 Spotlight: APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking »
Yuxiang Yang · Junjie Yang · Yufei Xu · Jing Zhang · Long Lan · Dacheng Tao -
2022 Spotlight: Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach »
Kaiwen Yang · Yanchao Sun · Jiahao Su · Fengxiang He · Xinmei Tian · Furong Huang · Tianyi Zhou · Dacheng Tao -
2022 Poster: CGLB: Benchmark Tasks for Continual Graph Learning »
Xikun Zhang · Dongjin Song · Dacheng Tao -
2022 Poster: Escaping from the Barren Plateau via Gaussian Initializations in Deep Variational Quantum Circuits »
Kaining Zhang · Liu Liu · Min-Hsiu Hsieh · Dacheng Tao -
2022 Poster: Benefits of Permutation-Equivariance in Auction Mechanisms »
Tian Qin · Fengxiang He · Dingfeng Shi · Wenbing Huang · Dacheng Tao -
2022 Poster: Towards Theoretically Inspired Neural Initialization Optimization »
Yibo Yang · Hong Wang · Haobo Yuan · Zhouchen Lin -
2022 Poster: Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach »
Kaiwen Yang · Yanchao Sun · Jiahao Su · Fengxiang He · Xinmei Tian · Furong Huang · Tianyi Zhou · Dacheng Tao -
2022 Poster: Online Training Through Time for Spiking Neural Networks »
Mingqing Xiao · Qingyan Meng · Zongpeng Zhang · Di He · Zhouchen Lin -
2021 Poster: On Training Implicit Models »
Zhengyang Geng · Xin-Yu Zhang · Shaojie Bai · Yisen Wang · Zhouchen Lin -
2021 Poster: Dissecting the Diffusion Process in Linear Graph Convolutional Networks »
Yifei Wang · Yisen Wang · Jiansheng Yang · Zhouchen Lin -
2021 Poster: Class-Disentanglement and Applications in Adversarial Detection and Defense »
Kaiwen Yang · Tianyi Zhou · Yonggang Zhang · Xinmei Tian · Dacheng Tao -
2021 Poster: Gauge Equivariant Transformer »
Lingshen He · Yiming Dong · Yisen Wang · Dacheng Tao · Zhouchen Lin -
2021 Poster: ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias »
Yufei Xu · Qiming ZHANG · Jing Zhang · Dacheng Tao -
2021 Poster: Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State »
Mingqing Xiao · Qingyan Meng · Zongpeng Zhang · Yisen Wang · Zhouchen Lin -
2021 Poster: Efficient Equivariant Network »
Lingshen He · Yuxuan Chen · zhengyang shen · Yiming Dong · Yisen Wang · Zhouchen Lin -
2021 Poster: Residual Relaxation for Multi-view Representation Learning »
Yifei Wang · Zhengyang Geng · Feng Jiang · Chuming Li · Yisen Wang · Jiansheng Yang · Zhouchen Lin -
2020 Poster: ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding »
Yibo Yang · Hongyang Li · Shan You · Fei Wang · Chen Qian · Zhouchen Lin -
2018 Workshop: NIPS 2018 workshop on Compact Deep Neural Networks with industrial applications »
Lixin Fan · Zhouchen Lin · Max Welling · Yurong Chen · Werner Bailer -
2018 Poster: SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator »
Cong Fang · Chris Junchi Li · Zhouchen Lin · Tong Zhang -
2018 Spotlight: SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator »
Cong Fang · Chris Junchi Li · Zhouchen Lin · Tong Zhang -
2018 Poster: Joint Sub-bands Learning with Clique Structures for Wavelet Domain Super-Resolution »
Zhisheng Zhong · Tiancheng Shen · Yibo Yang · Zhouchen Lin · Chao Zhang -
2018 Poster: Learning Versatile Filters for Efficient Convolutional Neural Networks »
Yunhe Wang · Chang Xu · Chunjing XU · Chao Xu · Dacheng Tao -
2017 Poster: Faster and Non-ergodic O(1/K) Stochastic Alternating Direction Method of Multipliers »
Cong Fang · Feng Cheng · Zhouchen Lin -
2015 Poster: Accelerated Proximal Gradient Methods for Nonconvex Programming »
Huan Li · Zhouchen Lin