Before sufficient training data is available, fine-tuning neural networks pre-trained on large-scale datasets substantially outperforms training from random initialization. However, fine-tuning methods face a dilemma between catastrophic forgetting and negative transfer. While several methods have been proposed that explicitly attempt to overcome catastrophic forgetting, negative transfer has rarely been explored. In this paper, we conduct an in-depth empirical investigation into negative transfer in fine-tuning and find that, for both the weight parameters and the feature representations, the transferability of their spectral components varies considerably. For safe transfer learning, we present Batch Spectral Shrinkage (BSS), a novel regularization approach that penalizes the smallest singular values so that untransferable spectral components are suppressed. BSS is orthogonal to existing fine-tuning methods and is readily pluggable into them. Experimental results show that BSS significantly enhances the performance of state-of-the-art methods, especially in the regime of few training data.
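The abstract describes BSS as a penalty on the smallest singular values of the batch feature matrix. The sketch below illustrates one way such a penalty can be computed during fine-tuning, assuming a PyTorch setup; the function name `bss_penalty`, the default `k = 1`, and the trade-off weight `eta` are illustrative assumptions, not the authors' reference implementation.

```python
import torch

def bss_penalty(features: torch.Tensor, k: int = 1) -> torch.Tensor:
    """Batch Spectral Shrinkage (sketch): penalize the k smallest
    singular values of the batch feature matrix so that the least
    transferable spectral components are suppressed."""
    # features: (batch_size, feature_dim) activations of the current batch
    singular_values = torch.linalg.svdvals(features)  # descending order
    # Sum of squares of the k smallest singular values.
    return (singular_values[-k:] ** 2).sum()

# Usage: add the penalty to the task loss during fine-tuning;
# `eta` is a hypothetical trade-off hyperparameter.
# loss = task_loss + eta * bss_penalty(features, k=1)
```

Because the penalty only depends on the batch feature matrix, it can be added on top of any existing fine-tuning objective, which is what makes BSS pluggable into other methods.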
Author Information
Xinyang Chen (Tsinghua University)
Sinan Wang (Tsinghua University)
Bo Fu (Tsinghua University)
Mingsheng Long (Tsinghua University)
Jianmin Wang (Tsinghua University)
More from the Same Authors
- 2022 Poster: Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models
  Yang Shu · Zhangjie Cao · Ziyang Zhang · Jianmin Wang · Mingsheng Long
- 2022 Poster: Supported Policy Optimization for Offline Reinforcement Learning
  Jialong Wu · Haixu Wu · Zihan Qiu · Jianmin Wang · Mingsheng Long
- 2022 Poster: Non-stationary Transformers: Exploring the Stationarity in Time Series Forecasting
  Yong Liu · Haixu Wu · Jianmin Wang · Mingsheng Long
- 2022: Domain Adaptation: Theory, Algorithms, and Open Library
  Mingsheng Long
- 2022 Poster: Debiased Self-Training for Semi-Supervised Learning
  Baixu Chen · Junguang Jiang · Ximei Wang · Pengfei Wan · Jianmin Wang · Mingsheng Long
- 2021 Poster: Cycle Self-Training for Domain Adaptation
  Hong Liu · Jianmin Wang · Mingsheng Long
- 2021 Poster: Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting
  Haixu Wu · Jiehui Xu · Jianmin Wang · Mingsheng Long
- 2020 Poster: Co-Tuning for Transfer Learning
  Kaichao You · Zhi Kou · Mingsheng Long · Jianmin Wang
- 2020 Poster: Transferable Calibration with Lower Bias and Variance in Domain Adaptation
  Ximei Wang · Mingsheng Long · Jianmin Wang · Michael Jordan
- 2020 Poster: Stochastic Normalization
  Zhi Kou · Kaichao You · Mingsheng Long · Jianmin Wang
- 2020 Poster: Learning to Adapt to Evolving Domains
  Hong Liu · Mingsheng Long · Jianmin Wang · Yu Wang
- 2019 Poster: Transferable Normalization: Towards Improving Transferability of Deep Neural Networks
  Ximei Wang · Ying Jin · Mingsheng Long · Jianmin Wang · Michael Jordan
- 2018 Poster: Conditional Adversarial Domain Adaptation
  Mingsheng Long · Zhangjie Cao · Jianmin Wang · Michael Jordan
- 2018 Poster: Generalized Zero-Shot Learning with Deep Calibration Network
  Shichen Liu · Mingsheng Long · Jianmin Wang · Michael Jordan
- 2017 Poster: PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs
  Yunbo Wang · Mingsheng Long · Jianmin Wang · Zhifeng Gao · Philip S Yu
- 2017 Poster: Learning Multiple Tasks with Multilinear Relationship Networks
  Mingsheng Long · Zhangjie Cao · Jianmin Wang · Philip S Yu
- 2016 Poster: Unsupervised Domain Adaptation with Residual Transfer Networks
  Mingsheng Long · Han Zhu · Jianmin Wang · Michael Jordan
- 2015 Workshop: Transfer and Multi-Task Learning: Trends and New Perspectives
  Anastasia Pentina · Christoph Lampert · Sinno Jialin Pan · Mingsheng Long · Judy Hoffman · Baochen Sun · Kate Saenko