Data-free Knowledge Distillation (DFKD) has attracted attention recently thanks to its ability to transfer knowledge from a teacher network to a student network without using training data. The main idea is to use a generator to synthesize data for training the student. As the generator is updated, the distribution of synthetic data changes. This distribution shift can be large if the generator and the student are trained adversarially, causing the student to forget the knowledge it acquired in previous steps. To alleviate this problem, we propose a simple yet effective method called Momentum Adversarial Distillation (MAD), which maintains an exponential moving average (EMA) copy of the generator and uses synthetic samples from both the generator and the EMA generator to train the student. Since the EMA generator can be considered an ensemble of the generator's old versions and often undergoes a smaller change per update than the generator, training on its synthetic samples can help the student recall past knowledge and prevent it from adapting too quickly to new updates of the generator. Our experiments on six benchmark datasets, including large datasets such as ImageNet and Places365, demonstrate the superior performance of MAD over competing methods for handling the large distribution shift problem. Our method also compares favorably to existing DFKD methods and even achieves state-of-the-art results in some cases.
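The EMA mechanism described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the decay value of 0.99, the two-parameter `toy_generator`, and the equal split of samples between the two generators are assumptions made purely for illustration, and the abstract does not specify any of them. The sketch shows only that an EMA copy of the parameters changes more slowly than the parameters themselves, and that a student batch can draw samples from both copies.

```python
import random


def ema_update(ema_params, params, decay=0.99):
    """One EMA step: decay * ema + (1 - decay) * current, per parameter.

    The decay value is an assumption; the abstract does not state one.
    """
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]


def toy_generator(params, noise):
    """Hypothetical stand-in for a generator: an affine map of the noise."""
    scale, shift = params
    return [scale * z + shift for z in noise]


def student_batch(gen_params, ema_params, batch_size, rng):
    """Build a student batch from both the generator and its EMA copy.

    The 50/50 split is an assumption made for this sketch.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
    ema_noise = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
    return toy_generator(gen_params, noise) + toy_generator(ema_params, ema_noise)


rng = random.Random(0)
gen_params = [1.0, 0.0]
ema_params = list(gen_params)

# Simulate an abrupt generator update (a large distribution shift).
gen_params = [1.0, 10.0]
ema_params = ema_update(ema_params, gen_params)

# The EMA copy has moved only a small fraction of the way to the new value.
print("generator shift:", gen_params[1])
print("EMA shift:", ema_params[1])

batch = student_batch(gen_params, ema_params, batch_size=4, rng=rng)
print("student batch size:", len(batch))
```

In this toy setting, half of each student batch still comes from a distribution close to the one seen before the update, which is the intuition the abstract gives for why the student forgets less.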
Author Information
Kien Do (Deakin University)
Thai Hung Le (Deakin University)
Dung Nguyen (Deakin University)
Dang Nguyen (Deakin University)
HARIPRIYA HARIKUMAR (Deakin University)
Truyen Tran (Deakin University)
Santu Rana (Deakin University)
Svetha Venkatesh (Deakin University)
More from the Same Authors
-
2021 : Offline neural contextual bandits: Pessimism, Optimization and Generalization »
Thanh Nguyen-Tang · Sunil Gupta · A. Tuan Nguyen · Svetha Venkatesh -
2022 Poster: Learning to Constrain Policy Optimization with Virtual Trust Region »
Thai Hung Le · Thommen Karimpanal George · Majid Abdolshah · Dung Nguyen · Kien Do · Sunil Gupta · Svetha Venkatesh -
2022 Poster: Functional Indirection Neural Estimator for Better Out-of-distribution Generalization »
Kha Pham · Thai Hung Le · Man Ngo · Truyen Tran -
2022 : Time-Evolving Conditional Character-centric Graphs for Movie Understanding »
Long Dang · Thao Le · Vuong Le · Tu Minh Phuong · Truyen Tran -
2022 : Improving Domain Generalization with Interpolation Robustness »
Ragja Palakkadavath · Thanh Nguyen-Tang · Sunil Gupta · Svetha Venkatesh -
2023 Workshop: Backdoors in Deep Learning: The Good, the Bad, and the Ugly »
Khoa D Doan · Aniruddha Saha · Anh Tran · Yingjie Lao · Kok-Seng Wong · Ang Li · HARIPRIYA HARIKUMAR · Eugene Bagdasaryan · Micah Goldblum · Tom Goldstein -
2022 Spotlight: Lightning Talks 5A-2 »
Qiang LI · Zhiwei Xu · Jia-Qi Yang · Thai Hung Le · Haoxuan Qu · Yang Li · Artyom Sorokin · Peirong Zhang · Mira Finkelstein · Nitsan levy · Chung-Yiu Yau · dapeng li · Thommen Karimpanal George · De-Chuan Zhan · Nazar Buzun · Jiajia Jiang · Li Xu · Yichuan Mo · Yujun Cai · Yuliang Liu · Leonid Pugachev · Bin Zhang · Lucy Liu · Hoi-To Wai · Liangliang Shi · Majid Abdolshah · Yoav Kolumbus · Lin Geng Foo · Junchi Yan · Mikhail Burtsev · Lianwen Jin · Yuan Zhan · Dung Nguyen · David Parkes · Yunpeng Baiia · Jun Liu · Kien Do · Guoliang Fan · Jeffrey S Rosenschein · Sunil Gupta · Sarah Keren · Svetha Venkatesh -
2022 Spotlight: Learning to Constrain Policy Optimization with Virtual Trust Region »
Thai Hung Le · Thommen Karimpanal George · Majid Abdolshah · Dung Nguyen · Kien Do · Sunil Gupta · Svetha Venkatesh -
2022 : Spotlight: Time-Evolving Conditional Character-centric Graphs for Movie Understanding »
Long Dang · Thao Le · Vuong Le · Tu Minh Phuong · Truyen Tran -
2022 Poster: Human-AI Collaborative Bayesian Optimisation »
Arun Kumar A V · Santu Rana · Alistair Shilton · Svetha Venkatesh -
2022 Poster: Expected Improvement for Contextual Bandits »
Hung The Tran · Sunil Gupta · Santu Rana · Tuan Truong · Long Tran-Thanh · Svetha Venkatesh -
2021 Poster: Model-Based Episodic Memory Induces Dynamic Hybrid Controls »
Hung Le · Thommen Karimpanal George · Majid Abdolshah · Truyen Tran · Svetha Venkatesh -
2021 Poster: Kernel Functional Optimisation »
Arun Kumar Anjanapura Venkatesh · Alistair Shilton · Santu Rana · Sunil Gupta · Svetha Venkatesh -
2020 : GEFA: Early Fusion Approach in Drug-Target Affinity Prediction »
Tri Nguyen Minh · Thin Nguyen · Thao M Le · Truyen Tran -
2020 Poster: Sub-linear Regret Bounds for Bayesian Optimisation in Unknown Search Spaces »
Hung The Tran · Sunil Gupta · Santu Rana · Huong Ha · Svetha Venkatesh -
2019 Poster: Bayesian Optimization with Unknown Search Space »
Huong Ha · Santu Rana · Sunil Gupta · Thanh Nguyen-Tang · Hung The Tran · Svetha Venkatesh -
2019 Poster: Multi-objective Bayesian optimisation with preferences over objectives »
Majid Abdolshah · Alistair Shilton · Santu Rana · Sunil Gupta · Svetha Venkatesh -
2018 Poster: Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation »
Shivapratap Gopakumar · Sunil Gupta · Santu Rana · Vu Nguyen · Svetha Venkatesh -
2018 Poster: Variational Memory Encoder-Decoder »
Hung Le · Truyen Tran · Thin Nguyen · Svetha Venkatesh -
2017 Poster: Process-constrained batch Bayesian optimisation »
Pratibha Vellanki · Santu Rana · Sunil Gupta · David Rubin · Alessandra Sutti · Thomas Dorin · Murray Height · Paul Sanders · Svetha Venkatesh -
2017 Spotlight: Process-constrained batch Bayesian optimisation »
Pratibha Vellanki · Santu Rana · Sunil Gupta · David Rubin · Alessandra Sutti · Thomas Dorin · Murray Height · Paul Sanders · Svetha Venkatesh