Unlike existing knowledge distillation methods, which focus on baseline settings where the teacher models and training strategies are not as strong and competitive as state-of-the-art approaches, this paper presents a method, dubbed DIST, to distill better from a stronger teacher. We empirically find that the discrepancy between the predictions of the student and a stronger teacher tends to be fairly severe. As a result, the exact match of predictions enforced by KL divergence would disturb the training and make existing methods perform poorly. In this paper, we show that simply preserving the relations between the predictions of teacher and student suffices, and propose a correlation-based loss to capture the intrinsic inter-class relations from the teacher explicitly. Moreover, considering that different instances have different semantic similarities to each class, we extend this relational match to the intra-class level. Our method is simple yet practical, and extensive experiments demonstrate that it adapts well to various architectures, model sizes, and training strategies, and consistently achieves state-of-the-art performance on image classification, object detection, and semantic segmentation tasks. Code is available at: https://github.com/hunto/DIST_KD.
Author Information
Tao Huang (The University of Sydney)
Shan You (SenseTime Research)
Fei Wang (SenseTime)
Chen Qian (SenseTime)
Chang Xu (University of Sydney)
More from the Same Authors
-
2020 Meetup: MeetUp: Sydney Australia »
Chang Xu -
2021 Meetup: Sydney, Australia »
Chang Xu -
2022 Poster: Weak-shot Semantic Segmentation via Dual Similarity Transfer »
Junjie Chen · Li Niu · Siyuan Zhou · Jianlou Si · Chen Qian · Liqing Zhang -
2023 Poster: Stable Diffusion is Unstable »
Chengbin Du · Yanxi Li · Zhongwei Qiu · Chang Xu -
2023 Poster: Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models »
Yichao Cao · Qingfei Tang · Xiu Su · Song Chen · Shan You · Chang Xu · Xiaobo Lu -
2023 Poster: Beyond Pretrained Features: Noisy Image Modeling Provides Adversarial Defense »
Zunzhi You · Daochang Liu · Bohyung Han · Chang Xu -
2023 Poster: Rethinking Conditional Diffusion Sampling with Progressive Guidance »
Anh-Dung Dinh · Daochang Liu · Chang Xu -
2023 Poster: Contrastive Sampling Chains in Diffusion Models »
Junyu Zhang · Daochang Liu · Shichao Zhang · Chang Xu -
2023 Poster: Adversarial Robustness through Random Weight Sampling »
Yanxiang Ma · Minjing Dong · Chang Xu -
2023 Poster: One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation »
Zhiwei Hao · Jianyuan Guo · Kai Han · Yehui Tang · Han Hu · Yunhe Wang · Chang Xu -
2023 Poster: Revisit the Power of Vanilla Knowledge Distillation: from Small Scale to Large Scale »
Zhiwei Hao · Jianyuan Guo · Kai Han · Han Hu · Chang Xu · Yunhe Wang -
2023 Poster: Knowledge Diffusion for Distillation »
Tao Huang · Yuan Zhang · Mingkai Zheng · Shan You · Fei Wang · Chen Qian · Chang Xu -
2023 Poster: RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars »
Dongwei Pan · Long Zhuo · Jingtan Piao · Huiwen Luo · Wei Cheng · Yuxin WANG · Siming Fan · Shengqi Liu · Lei Yang · Bo Dai · Ziwei Liu · Chen Change Loy · Chen Qian · Wayne Wu · Dahua Lin · Kwan-Yee Lin -
2022 Spotlight: Lightning Talks 6B-4 »
Junjie Chen · Chuanxia Zheng · JINLONG LI · Yu Shi · Shichao Kan · Yu Wang · Fermín Travi · Ninh Pham · Lei Chai · Guobing Gan · Tung-Long Vuong · Gonzalo Ruarte · Tao Liu · Li Niu · Jingjing Zou · Zequn Jie · Peng Zhang · Ming LI · Yixiong Liang · Guolin Ke · Jianfei Cai · Gaston Bujia · Sunzhu Li · Siyuan Zhou · Jingyang Lin · Xu Wang · Min Li · Zhuoming Chen · Qing Ling · Xiaolin Wei · Xiuqing Lu · Shuxin Zheng · Dinh Phung · Yigang Cen · Jianlou Si · Juan Esteban Kamienkowski · Jianxin Wang · Chen Qian · Lin Ma · Benyou Wang · Yingwei Pan · Tie-Yan Liu · Liqing Zhang · Zhihai He · Ting Yao · Tao Mei -
2022 Spotlight: Weak-shot Semantic Segmentation via Dual Similarity Transfer »
Junjie Chen · Li Niu · Siyuan Zhou · Jianlou Si · Chen Qian · Liqing Zhang -
2022 Spotlight: GhostNetV2: Enhance Cheap Operation with Long-Range Attention »
Yehui Tang · Kai Han · Jianyuan Guo · Chang Xu · Chao Xu · Yunhe Wang -
2022 Spotlight: Lightning Talks 2B-1 »
Yehui Tang · Jian Wang · Zheng Chen · man zhou · Peng Gao · Chenyang Si · SHANGKUN SUN · Yixing Xu · Weihao Yu · Xinghao Chen · Kai Han · Hu Yu · Yulun Zhang · Chenhui Gou · Teli Ma · Yuanqi Chen · Yunhe Wang · Hongsheng Li · Jinjin Gu · Jianyuan Guo · Qiman Wu · Pan Zhou · Yu Zhu · Jie Huang · Chang Xu · Yichen Zhou · Haocheng Feng · Guodong Guo · yongbing zhang · Ziyi Lin · Feng Zhao · Ge Li · Junyu Han · Jinwei Gu · Jifeng Dai · Chao Xu · Xinchao Wang · Linghe Kong · Shuicheng Yan · Yu Qiao · Chen Change Loy · Xin Yuan · Errui Ding · Yunhe Wang · Deyu Meng · Jingdong Wang · Chongyi Li -
2022 Poster: GhostNetV2: Enhance Cheap Operation with Long-Range Attention »
Yehui Tang · Kai Han · Jianyuan Guo · Chang Xu · Chao Xu · Yunhe Wang -
2022 Poster: Green Hierarchical Vision Transformer for Masked Image Modeling »
Lang Huang · Shan You · Mingkai Zheng · Fei Wang · Chen Qian · Toshihiko Yamasaki -
2022 Poster: Searching for Better Spatio-temporal Alignment in Few-Shot Action Recognition »
Yichao Cao · Xiu Su · Qingfei Tang · Shan You · Xiaobo Lu · Chang Xu -
2022 Poster: Random Normalization Aggregation for Adversarial Defense »
Minjing Dong · Xinghao Chen · Yunhe Wang · Chang Xu -
2021 Poster: ReSSL: Relational Self-Supervised Learning with Weak Augmentation »
Mingkai Zheng · Shan You · Fei Wang · Chen Qian · Changshui Zhang · Xiaogang Wang · Chang Xu -
2020 Poster: SCOP: Scientific Control for Reliable Neural Network Pruning »
Yehui Tang · Yunhe Wang · Yixing Xu · Dacheng Tao · Chunjing XU · Chao Xu · Chang Xu -
2020 Poster: Kernel Based Progressive Distillation for Adder Neural Networks »
Yixing Xu · Chang Xu · Xinghao Chen · Wei Zhang · Chunjing XU · Yunhe Wang -
2020 Poster: Adapting Neural Architectures Between Domains »
Yanxi Li · Zhaohui Yang · Yunhe Wang · Chang Xu -
2020 Poster: Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space »
Shangchen Du · Shan You · Xiaojie Li · Jianlong Wu · Fei Wang · Chen Qian · Changshui Zhang -
2020 Spotlight: Kernel Based Progressive Distillation for Adder Neural Networks »
Yixing Xu · Chang Xu · Xinghao Chen · Wei Zhang · Chunjing XU · Yunhe Wang -
2020 Poster: AOT: Appearance Optimal Transport Based Identity Swapping for Forgery Detection »
Hao Zhu · Chaoyou Fu · Qianyi Wu · Wayne Wu · Chen Qian · Ran He -
2020 Poster: ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding »
Yibo Yang · Hongyang Li · Shan You · Fei Wang · Chen Qian · Zhouchen Lin -
2020 Poster: UnModNet: Learning to Unwrap a Modulo Image for High Dynamic Range Imaging »
Chu Zhou · Hang Zhao · Jin Han · Chang Xu · Chao Xu · Tiejun Huang · Boxin Shi -
2020 Poster: Searching for Low-Bit Weights in Quantized Neural Networks »
Zhaohui Yang · Yunhe Wang · Kai Han · Chunjing XU · Chao Xu · Dacheng Tao · Chang Xu -
2019 Poster: Positive-Unlabeled Compression on the Cloud »
Yixing Xu · Yunhe Wang · Hanting Chen · Kai Han · Chunjing XU · Dacheng Tao · Chang Xu -
2019 Poster: Learning from Bad Data via Generation »
Tianyu Guo · Chang Xu · Boxin Shi · Chao Xu · Dacheng Tao