Pretrained language models (PLMs) have demonstrated remarkable performance on a variety of natural language processing tasks: unidirectional PLMs (e.g., GPT) are well known for their superior text generation capabilities, while bidirectional PLMs (e.g., BERT) have been the predominant choice for natural language understanding (NLU) tasks. Although both types of models have achieved promising few-shot learning performance, their potential for zero-shot learning has been underexplored. In this paper, we present a simple approach that uses both types of PLMs for fully zero-shot learning of NLU tasks without requiring any task-specific data: a unidirectional PLM generates class-conditioned texts guided by prompts, which then serve as the training data for fine-tuning a bidirectional PLM. With quality training data selected based on generation probability, and with regularization techniques (label smoothing and temporal ensembling) applied during fine-tuning for better generalization and stability, our approach demonstrates strong performance across seven classification tasks of the GLUE benchmark (e.g., 72.3/73.8 on MNLI-m/mm and 92.8 on SST-2), significantly outperforming zero-shot prompting methods and even achieving results comparable to strong few-shot approaches that use 32 training samples per class.
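To make the pipeline concrete, the following is a minimal sketch using Hugging Face Transformers, assuming GPT-2 as the unidirectional generator and BERT as the bidirectional classifier, with illustrative SST-2-style sentiment prompts, a keep-the-top-half selection rule based on average token log-probability, and label smoothing during fine-tuning. The model sizes, prompt wording, selection threshold, and the omission of temporal ensembling are all assumptions made for illustration, not the paper's exact configuration.

```python
# Minimal sketch of the zero-shot pipeline: generate class-conditioned
# texts with a unidirectional PLM, select by generation probability, and
# fine-tune a bidirectional PLM on the synthetic data. Prompts, models,
# and hyperparameters below are illustrative assumptions.
import torch
import torch.nn.functional as F
from transformers import (GPT2LMHeadModel, GPT2Tokenizer,
                          BertForSequenceClassification, BertTokenizer)

# --- Step 1: generate class-conditioned texts with a unidirectional PLM ---
gen_tok = GPT2Tokenizer.from_pretrained("gpt2")
gen_lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

prompts = {0: 'The movie review in negative sentiment is: "',
           1: 'The movie review in positive sentiment is: "'}

def generate(label, n=8, max_new_tokens=40):
    ids = gen_tok(prompts[label], return_tensors="pt").input_ids
    out = gen_lm.generate(ids, do_sample=True, top_p=0.9,
                          max_new_tokens=max_new_tokens,
                          num_return_sequences=n,
                          pad_token_id=gen_tok.eos_token_id)
    # Strip the prompt tokens; keep only the generated continuation.
    return [gen_tok.decode(o[ids.shape[1]:], skip_special_tokens=True)
            for o in out]

# --- Step 2: rank candidates by average token log-probability ---
@torch.no_grad()
def avg_logprob(text):
    ids = gen_tok(text, return_tensors="pt").input_ids
    logits = gen_lm(ids).logits
    logp = F.log_softmax(logits[0, :-1], dim=-1)          # predict next token
    token_lp = logp.gather(1, ids[0, 1:].unsqueeze(1)).squeeze(1)
    return token_lp.mean().item()

data = []
for label in prompts:
    cands = generate(label)
    cands.sort(key=avg_logprob, reverse=True)
    data += [(t, label) for t in cands[:4]]  # keep the most probable half

# --- Step 3: fine-tune a bidirectional PLM on the synthetic data ---
# (label smoothing shown; temporal ensembling omitted for brevity)
clf_tok = BertTokenizer.from_pretrained("bert-base-uncased")
clf = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                    num_labels=2)
opt = torch.optim.AdamW(clf.parameters(), lr=2e-5)
texts, labels = zip(*data)
batch = clf_tok(list(texts), padding=True, truncation=True,
                return_tensors="pt")
labels = torch.tensor(labels)
clf.train()
for _ in range(3):  # a few full-batch epochs over the tiny synthetic set
    logits = clf(**batch).logits
    loss = F.cross_entropy(logits, labels, label_smoothing=0.1)
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"loss = {loss.item():.4f}")
```

In this sketch, noisy generations are filtered before fine-tuning because low-probability texts under the generator are more likely to be off-class or disfluent; label smoothing then softens the remaining label noise in the synthetic data.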
Author Information
Yu Meng (University of Illinois Urbana-Champaign)
Jiaxin Huang (University of Illinois Urbana-Champaign)
Yu Zhang (University of Illinois Urbana-Champaign)
Jiawei Han (University of Illinois Urbana-Champaign)