Fine-tuning pre-trained language models has achieved great success in many NLP fields. Yet it is strikingly vulnerable to adversarial examples: for instance, word substitution attacks using only synonyms can easily fool a BERT-based sentiment analysis model. In this paper, we demonstrate that adversarial training, the prevalent defense technique, does not directly fit a conventional fine-tuning scenario, because it suffers severely from catastrophic forgetting: it fails to retain the generic and robust linguistic features already captured by the pre-trained model. In this light, we propose Robust Informative Fine-Tuning (RIFT), a novel adversarial fine-tuning method from an information-theoretical perspective. In particular, RIFT encourages an objective model to retain the features learned from the pre-trained model throughout the entire fine-tuning process, whereas conventional fine-tuning uses the pre-trained weights only for initialization. Experimental results show that RIFT consistently outperforms state-of-the-art methods on two popular NLP tasks, sentiment analysis and natural language inference, under different attacks and across various pre-trained language models.
Author Information
Xinshuai Dong (Nanyang Technological University)
Anh Tuan Luu (Nanyang Technological University, Singapore)
Min Lin (MILA)
Shuicheng Yan (National University of Singapore)
Hanwang Zhang (NTU)
More from the Same Authors
- 2021 Spotlight: Self-Supervised Learning Disentangled Group Representation as Feature
  Tan Wang · Zhongqi Yue · Jianqiang Huang · Qianru Sun · Hanwang Zhang
- 2022 Poster: Respecting Transfer Gap in Knowledge Distillation
  Yulei Niu · Long Chen · Chang Zhou · Hanwang Zhang
- 2022 : Mutual Information Regularized Offline Reinforcement Learning
  Xiao Ma · Bingyi Kang · Zhongwen Xu · Min Lin · Shuicheng Yan
- 2022 : HloEnv: A Graph Rewrite Environment for Deep Learning Compiler Optimization Research
  Chin Yang Oh · Kunhao Zheng · Bingyi Kang · Xinyi Wan · Zhongwen Xu · Shuicheng Yan · Min Lin · Yangzihao Wang
- 2022 Poster: Long Range Graph Benchmark
  Vijay Prakash Dwivedi · Ladislav Rampášek · Michael Galkin · Ali Parviz · Guy Wolf · Anh Tuan Luu · Dominique Beaini
- 2022 Poster: Recipe for a General, Powerful, Scalable Graph Transformer
  Ladislav Rampášek · Michael Galkin · Vijay Prakash Dwivedi · Anh Tuan Luu · Guy Wolf · Dominique Beaini
- 2022 Poster: EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine
  Jiayi Weng · Min Lin · Shengyi Huang · Bo Liu · Denys Makoviichuk · Viktor Makoviychuk · Zichen Liu · Yufan Song · Ting Luo · Yukun Jiang · Zhongwen Xu · Shuicheng Yan
- 2021 Poster: Contrastive Learning for Neural Topic Model
  Thong Nguyen · Anh Tuan Luu
- 2021 Poster: Self-Supervised Learning Disentangled Group Representation as Feature
  Tan Wang · Zhongqi Yue · Jianqiang Huang · Qianru Sun · Hanwang Zhang
- 2021 Poster: Towards Understanding Why Lookahead Generalizes Better Than SGD and Beyond
  Pan Zhou · Hanshu Yan · Xiaotong Yuan · Jiashi Feng · Shuicheng Yan
- 2021 Poster: Direct Multi-view Multi-person 3D Pose Estimation
  tao wang · Jianfeng Zhang · Yujun Cai · Shuicheng Yan · Jiashi Feng
- 2021 Poster: Introspective Distillation for Robust Question Answering
  Yulei Niu · Hanwang Zhang
- 2020 Poster: Online Fast Adaptation and Knowledge Accumulation (OSAKA): a New Approach to Continual Learning
  Massimo Caccia · Pau Rodriguez · Oleksiy Ostapenko · Fabrice Normandin · Min Lin · Lucas Page-Caccia · Issam Hadj Laradji · Irina Rish · Alexandre Lacoste · David Vázquez · Laurent Charlin
- 2020 Poster: Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect
  Kaihua Tang · Jianqiang Huang · Hanwang Zhang
- 2020 Poster: Causal Intervention for Weakly-Supervised Semantic Segmentation
  Dong Zhang · Hanwang Zhang · Jinhui Tang · Xian-Sheng Hua · Qianru Sun
- 2020 Oral: Causal Intervention for Weakly-Supervised Semantic Segmentation
  Dong Zhang · Hanwang Zhang · Jinhui Tang · Xian-Sheng Hua · Qianru Sun
- 2020 Poster: ConvBERT: Improving BERT with Span-based Dynamic Convolution
  Zi-Hang Jiang · Weihao Yu · Daquan Zhou · Yunpeng Chen · Jiashi Feng · Shuicheng Yan
- 2020 Poster: Interventional Few-Shot Learning
  Zhongqi Yue · Hanwang Zhang · Qianru Sun · Xian-Sheng Hua
- 2020 Spotlight: ConvBERT: Improving BERT with Span-based Dynamic Convolution
  Zi-Hang Jiang · Weihao Yu · Daquan Zhou · Yunpeng Chen · Jiashi Feng · Shuicheng Yan
- 2019 Poster: Online Continual Learning with Maximal Interfered Retrieval
  Rahaf Aljundi · Eugene Belilovsky · Tinne Tuytelaars · Laurent Charlin · Massimo Caccia · Min Lin · Lucas Page-Caccia
- 2019 Poster: Compositional De-Attention Networks
  Yi Tay · Anh Tuan Luu · Aston Zhang · Shuohang Wang · Siu Cheung Hui
- 2019 Poster: Gradient based sample selection for online continual learning
  Rahaf Aljundi · Min Lin · Baptiste Goujaud · Yoshua Bengio
- 2019 Poster: Efficient Meta Learning via Minibatch Proximal Update
  Pan Zhou · Xiaotong Yuan · Huan Xu · Shuicheng Yan · Jiashi Feng
- 2019 Spotlight: Efficient Meta Learning via Minibatch Proximal Update
  Pan Zhou · Xiaotong Yuan · Huan Xu · Shuicheng Yan · Jiashi Feng
- 2018 Poster: A^2-Nets: Double Attention Networks
  Yunpeng Chen · Yannis Kalantidis · Jianshu Li · Shuicheng Yan · Jiashi Feng
- 2018 Poster: Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks
  Hang Gao · Zheng Shou · Alireza Zareian · Hanwang Zhang · Shih-Fu Chang
- 2017 Poster: Dual-Agent GANs for Photorealistic and Identity Preserving Profile Face Synthesis
  Jian Zhao · Lin Xiong · Panasonic Karlekar Jayashree · Jianshu Li · Fang Zhao · Zhecan Wang · Panasonic Sugiri Pranata · Panasonic Shengmei Shen · Shuicheng Yan · Jiashi Feng
- 2016 Poster: Tree-Structured Reinforcement Learning for Sequential Object Localization
  Zequn Jie · Xiaodan Liang · Jiashi Feng · Xiaojie Jin · Wen Lu · Shuicheng Yan
- 2014 Poster: Robust Logistic Regression and Classification
  Jiashi Feng · Huan Xu · Shie Mannor · Shuicheng Yan
- 2014 Poster: Convex Optimization Procedure for Clustering: Theoretical Revisit
  Changbo Zhu · Huan Xu · Chenlei Leng · Shuicheng Yan
- 2014 Poster: On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification
  Yingzhen Yang · Feng Liang · Shuicheng Yan · Zhangyang Wang · Thomas S Huang
- 2013 Poster: Online Robust PCA via Stochastic Optimization
  Jiashi Feng · Huan Xu · Shuicheng Yan
- 2013 Poster: Online PCA for Contaminated Data
  Jiashi Feng · Huan Xu · Shie Mannor · Shuicheng Yan