The representation gap between teacher and student is an emerging topic in knowledge distillation (KD). To reduce the gap and improve performance, current methods often resort to complicated training schemes, loss functions, and feature alignments, which are task-specific and feature-specific. In this paper, we argue that the essence of these methods is to discard the noisy information and distill the valuable information in the feature, and propose a novel KD method dubbed DiffKD to explicitly denoise and match features using diffusion models. Our approach is based on the observation that student features typically contain more noise than teacher features, owing to the smaller capacity of the student model. To address this, we propose to denoise student features using a diffusion model trained on teacher features. This allows us to perform better distillation between the refined clean feature and the teacher feature. Additionally, we introduce a lightweight diffusion model with a linear autoencoder to reduce the computational cost, and an adaptive noise matching module to improve the denoising performance. Extensive experiments demonstrate that DiffKD is effective across various types of features and achieves state-of-the-art performance consistently on image classification, object detection, and semantic segmentation tasks. Code is available at https://github.com/hunto/DiffKD.
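The pipeline the abstract describes (encode the student feature with a linear autoencoder, treat it as a noisy latent, run reverse diffusion toward the teacher's feature distribution, then distill) can be sketched minimally in NumPy. Everything below is an illustrative stand-in, not the released implementation: the dimensions, the single linear map playing the role of the noise-prediction network, and the fixed noise level `t_s` (which the paper's adaptive noise matching would choose) are all assumptions; see the GitHub link above for the actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- the real dims depend on the backbone networks.
FEAT_DIM, LATENT_DIM, T = 64, 16, 10

# Linear autoencoder standing in for the paper's lightweight design
# (random weights here; the real ones are learned).
W_enc = rng.normal(0.0, 0.1, (FEAT_DIM, LATENT_DIM))
W_dec = rng.normal(0.0, 0.1, (LATENT_DIM, FEAT_DIM))
# Placeholder noise-prediction network eps_theta (a single linear map).
W_eps = rng.normal(0.0, 0.1, (LATENT_DIM, LATENT_DIM))

# Standard DDPM noise schedule.
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def ddim_step(z_t, t):
    """One deterministic (DDIM, eta=0) reverse step: z_t -> z_{t-1}."""
    eps_hat = z_t @ W_eps
    z0_hat = (z_t - np.sqrt(1.0 - alphas_bar[t]) * eps_hat) / np.sqrt(alphas_bar[t])
    if t == 0:
        return z0_hat
    return np.sqrt(alphas_bar[t - 1]) * z0_hat + np.sqrt(1.0 - alphas_bar[t - 1]) * eps_hat

# DiffKD-style pipeline (sketch): encode the student feature, treat it as a
# noisy latent at step t_s, run reverse diffusion, decode, then distill
# against the teacher feature with a simple MSE loss.
f_student = rng.normal(size=(4, FEAT_DIM))  # batch of 4 student features
f_teacher = rng.normal(size=(4, FEAT_DIM))  # matching teacher features

z = f_student @ W_enc
t_s = T // 2  # adaptive noise matching would pick this level per input
for t in range(t_s, -1, -1):
    z = ddim_step(z, t)
f_refined = z @ W_dec                       # denoised student feature
kd_loss = np.mean((f_refined - f_teacher) ** 2)
print(f"refined shape: {f_refined.shape}, KD loss: {kd_loss:.4f}")
```

In training, the diffusion model (here the placeholder `W_eps`) would be fit on teacher features, so the reverse process pulls noisy student latents toward the teacher's feature distribution before the distillation loss is applied.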
Author Information
Tao Huang (The University of Sydney)
I am a Ph.D. student at The University of Sydney, advised by Dr. Chang Xu. Before that, I was a Researcher at SenseTime. I am actively doing research in the field of efficient deep learning, including knowledge distillation, neural architecture search, structural pruning, etc.
Yuan Zhang (Peking University)
Mingkai Zheng (University of Sydney)
Shan You (SenseTime Research)
Fei Wang (SenseTime)
Chen Qian (SenseTime)
Chang Xu (University of Sydney)
More from the Same Authors
-
2020 Meetup: Sydney, Australia »
Chang Xu -
2021 Meetup: Sydney, Australia »
Chang Xu -
2022 Poster: Weak-shot Semantic Segmentation via Dual Similarity Transfer »
Junjie Chen · Li Niu · Siyuan Zhou · Jianlou Si · Chen Qian · Liqing Zhang -
2023 Poster: Stable Diffusion is Unstable »
Chengbin Du · Yanxi Li · Zhongwei Qiu · Chang Xu -
2023 Poster: Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models »
Yichao Cao · Qingfei Tang · Xiu Su · Song Chen · Shan You · Xiaobo Lu · Chang Xu -
2023 Poster: Beyond Pretrained Features: Noisy Image Modeling Provides Adversarial Defense »
Zunzhi You · Daochang Liu · Bohyung Han · Chang Xu -
2023 Poster: Contrastive Sampling Chains in Diffusion Models »
Junyu Zhang · Daochang Liu · Shichao Zhang · Chang Xu -
2023 Poster: Adversarial Robustness through Random Weight Sampling »
Yanxiang Ma · Minjing Dong · Chang Xu -
2023 Poster: One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation »
Zhiwei Hao · Jianyuan Guo · Kai Han · Yehui Tang · Han Hu · Yunhe Wang · Chang Xu -
2023 Poster: Revisit the Power of Vanilla Knowledge Distillation: from Small Scale to Large Scale »
Zhiwei Hao · Jianyuan Guo · Kai Han · Han Hu · Chang Xu · Yunhe Wang -
2023 Poster: Rethinking Conditional Diffusion Sampling with Progressive Guidance »
Anh-Dung Dinh · Daochang Liu · Chang Xu -
2023 Poster: RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars »
Dongwei Pan · Long Zhuo · Jingtan Piao · Huiwen Luo · Wei Cheng · Yuxin Wang · Siming Fan · Shengqi Liu · Lei Yang · Bo Dai · Ziwei Liu · Chen Change Loy · Chen Qian · Wayne Wu · Dahua Lin · Kwan-Yee Lin -
2022 Spotlight: Lightning Talks 6B-4 »
Junjie Chen · Chuanxia Zheng · Jinlong Li · Yu Shi · Shichao Kan · Yu Wang · Fermín Travi · Ninh Pham · Lei Chai · Guobing Gan · Tung-Long Vuong · Gonzalo Ruarte · Tao Liu · Li Niu · Jingjing Zou · Zequn Jie · Peng Zhang · Ming Li · Yixiong Liang · Guolin Ke · Jianfei Cai · Gaston Bujia · Sunzhu Li · Siyuan Zhou · Jingyang Lin · Xu Wang · Min Li · Zhuoming Chen · Qing Ling · Xiaolin Wei · Xiuqing Lu · Shuxin Zheng · Dinh Phung · Yigang Cen · Jianlou Si · Juan Esteban Kamienkowski · Jianxin Wang · Chen Qian · Lin Ma · Benyou Wang · Yingwei Pan · Tie-Yan Liu · Liqing Zhang · Zhihai He · Ting Yao · Tao Mei -
2022 Spotlight: Weak-shot Semantic Segmentation via Dual Similarity Transfer »
Junjie Chen · Li Niu · Siyuan Zhou · Jianlou Si · Chen Qian · Liqing Zhang -
2022 Spotlight: GhostNetV2: Enhance Cheap Operation with Long-Range Attention »
Yehui Tang · Kai Han · Jianyuan Guo · Chang Xu · Chao Xu · Yunhe Wang -
2022 Spotlight: Lightning Talks 2B-1 »
Yehui Tang · Jian Wang · Zheng Chen · Man Zhou · Peng Gao · Chenyang Si · Shangkun Sun · Yixing Xu · Weihao Yu · Xinghao Chen · Kai Han · Hu Yu · Yulun Zhang · Chenhui Gou · Teli Ma · Yuanqi Chen · Yunhe Wang · Hongsheng Li · Jinjin Gu · Jianyuan Guo · Qiman Wu · Pan Zhou · Yu Zhu · Jie Huang · Chang Xu · Yichen Zhou · Haocheng Feng · Guodong Guo · Yongbing Zhang · Ziyi Lin · Feng Zhao · Ge Li · Junyu Han · Jinwei Gu · Jifeng Dai · Chao Xu · Xinchao Wang · Linghe Kong · Shuicheng Yan · Yu Qiao · Chen Change Loy · Xin Yuan · Errui Ding · Yunhe Wang · Deyu Meng · Jingdong Wang · Chongyi Li -
2022 Poster: Knowledge Distillation from A Stronger Teacher »
Tao Huang · Shan You · Fei Wang · Chen Qian · Chang Xu -
2022 Poster: GhostNetV2: Enhance Cheap Operation with Long-Range Attention »
Yehui Tang · Kai Han · Jianyuan Guo · Chang Xu · Chao Xu · Yunhe Wang -
2022 Poster: Green Hierarchical Vision Transformer for Masked Image Modeling »
Lang Huang · Shan You · Mingkai Zheng · Fei Wang · Chen Qian · Toshihiko Yamasaki -
2022 Poster: Searching for Better Spatio-temporal Alignment in Few-Shot Action Recognition »
Yichao Cao · Xiu Su · Qingfei Tang · Shan You · Xiaobo Lu · Chang Xu -
2022 Poster: Random Normalization Aggregation for Adversarial Defense »
Minjing Dong · Xinghao Chen · Yunhe Wang · Chang Xu -
2021 Poster: ReSSL: Relational Self-Supervised Learning with Weak Augmentation »
Mingkai Zheng · Shan You · Fei Wang · Chen Qian · Changshui Zhang · Xiaogang Wang · Chang Xu -
2020 Poster: SCOP: Scientific Control for Reliable Neural Network Pruning »
Yehui Tang · Yunhe Wang · Yixing Xu · Dacheng Tao · Chunjing XU · Chao Xu · Chang Xu -
2020 Poster: Kernel Based Progressive Distillation for Adder Neural Networks »
Yixing Xu · Chang Xu · Xinghao Chen · Wei Zhang · Chunjing XU · Yunhe Wang -
2020 Poster: Adapting Neural Architectures Between Domains »
Yanxi Li · Zhaohui Yang · Yunhe Wang · Chang Xu -
2020 Poster: Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space »
Shangchen Du · Shan You · Xiaojie Li · Jianlong Wu · Fei Wang · Chen Qian · Changshui Zhang -
2020 Spotlight: Kernel Based Progressive Distillation for Adder Neural Networks »
Yixing Xu · Chang Xu · Xinghao Chen · Wei Zhang · Chunjing XU · Yunhe Wang -
2020 Poster: AOT: Appearance Optimal Transport Based Identity Swapping for Forgery Detection »
Hao Zhu · Chaoyou Fu · Qianyi Wu · Wayne Wu · Chen Qian · Ran He -
2020 Poster: ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding »
Yibo Yang · Hongyang Li · Shan You · Fei Wang · Chen Qian · Zhouchen Lin -
2020 Poster: UnModNet: Learning to Unwrap a Modulo Image for High Dynamic Range Imaging »
Chu Zhou · Hang Zhao · Jin Han · Chang Xu · Chao Xu · Tiejun Huang · Boxin Shi -
2020 Poster: Searching for Low-Bit Weights in Quantized Neural Networks »
Zhaohui Yang · Yunhe Wang · Kai Han · Chunjing XU · Chao Xu · Dacheng Tao · Chang Xu -
2019 Poster: Positive-Unlabeled Compression on the Cloud »
Yixing Xu · Yunhe Wang · Hanting Chen · Kai Han · Chunjing XU · Dacheng Tao · Chang Xu -
2019 Poster: Learning from Bad Data via Generation »
Tianyu Guo · Chang Xu · Boxin Shi · Chao Xu · Dacheng Tao