Poster
Back Razor: Memory-Efficient Transfer Learning by Self-Sparsified Backpropagation
Ziyu Jiang · Xuxi Chen · Xueqin Huang · Xianzhi Du · Denny Zhou · Zhangyang Wang

Thu Dec 01 09:00 AM -- 11:00 AM (PST) @ Hall J #440

Transfer learning from models trained on large datasets to customized downstream tasks is widely used because pre-trained models can greatly improve generalization. However, the growing size of pre-trained models also leads to prohibitively large memory footprints for downstream transfer, making it unaffordable on personal devices. Previous work identifies activations as the bottleneck of the footprint and hence proposes various solutions, such as injecting specific lite modules. In this work, we present a novel memory-efficient transfer framework called Back Razor, which can be applied in a plug-and-play manner to any pre-trained network without changing its architecture. The key idea of Back Razor is asymmetric sparsifying: pruning the activation stored for backpropagation while keeping the forward activation dense. It is based on the observation that the stored activation, which dominates the memory footprint, is needed only for backpropagation. Such asymmetric pruning avoids affecting the precision of the forward computation, thus making more aggressive pruning possible. Furthermore, we provide a theoretical analysis of the convergence rate of Back Razor, showing that under mild conditions our method retains a convergence rate similar to that of vanilla SGD. Extensive transfer learning experiments on both Convolutional Neural Networks and Vision Transformers show that Back Razor can yield up to 97% sparsity, saving 9.2x memory usage, without losing accuracy. The code is available at: https://github.com/VITA-Group/BackRazor_Neurips22.
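
The asymmetric sparsifying idea described above can be illustrated with a toy linear layer. This is a minimal sketch and not the authors' implementation: the class `SparseBackwardLinear`, the helper `sparsify_top_k`, and the choice of top-k magnitude pruning are all assumptions made for illustration, since the abstract does not specify the pruning criterion. The forward pass uses the dense input, while only a pruned copy of the input is stored for the weight gradient.

```python
def sparsify_top_k(x, keep):
    """Keep the `keep` largest-magnitude entries of x as {index: value}.

    Hypothetical pruning rule chosen for illustration only.
    """
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    return {i: x[i] for i in order[:keep]}


class SparseBackwardLinear:
    """Toy layer y = W x: dense forward, pruned activation saved for backward."""

    def __init__(self, weight, keep):
        self.weight = weight  # list of rows, shape (n_out, n_in)
        self.keep = keep      # number of input entries stored for backward
        self.saved = None

    def forward(self, x):
        # The output is computed from the full, dense input.
        y = [sum(w * xi for w, xi in zip(row, x)) for row in self.weight]
        # Only a sparse copy of the input is kept in memory.
        self.saved = sparsify_top_k(x, self.keep)
        return y

    def backward(self, grad_y):
        n_in = len(self.weight[0])
        # Weight gradient is outer(grad_y, x), here with the pruned x.
        grad_w = [[0.0] * n_in for _ in grad_y]
        for r, g in enumerate(grad_y):
            for i, v in self.saved.items():
                grad_w[r][i] = g * v
        # Input gradient needs only the weights, so it is unaffected.
        grad_x = [
            sum(self.weight[r][c] * grad_y[r] for r in range(len(grad_y)))
            for c in range(n_in)
        ]
        return grad_w, grad_x
```

In this sketch the forward output and the gradient with respect to the input are exact, and only the weight gradient is approximated, which is the sense in which the pruning is asymmetric.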

Author Information

Ziyu Jiang (Texas A&M University)
Xuxi Chen (University of Texas at Austin)
Xueqin Huang (Texas A&M University - College Station)
Xianzhi Du (Google)
Denny Zhou (Google)
Zhangyang Wang (University of Texas at Austin)

More from the Same Authors