
A Contrastive Framework for Neural Text Generation
Yixuan Su · Tian Lan · Yan Wang · Dani Yogatama · Lingpeng Kong · Nigel Collier

Thu Dec 01 02:00 PM -- 04:00 PM (PST) @ Hall J #119

Text generation is of great importance to many natural language processing applications. However, maximization-based decoding methods (e.g., beam search) of neural language models often lead to degenerate solutions: the generated text is unnatural and contains undesirable repetitions. Existing approaches either introduce stochasticity via sampling or modify the training objective to decrease the probabilities of certain tokens (e.g., unlikelihood training), but they often produce text that lacks coherence. In this work, we show that an underlying reason for model degeneration is the anisotropic distribution of token representations. We present a contrastive solution: (i) SimCTG, a contrastive training objective that calibrates the model's representation space, and (ii) contrastive search, a decoding method that encourages diversity while maintaining coherence in the generated text. Extensive experiments and analyses on three benchmarks from two languages demonstrate that our proposed approach outperforms state-of-the-art text generation methods as evaluated by both human and automatic metrics.
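The abstract does not spell out the decoding rule, so the following is only a minimal sketch of one contrastive search step as we understand it from the paper: each top-k candidate is scored by its model probability minus a penalty equal to its maximum cosine similarity to the hidden states of the tokens generated so far. The function names, the toy two-dimensional vectors, the probabilities, and the value alpha = 0.6 are all illustrative assumptions, not the authors' code or settings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_search_step(candidates, context_states, alpha=0.6):
    """Pick one token from a (hypothetical) top-k candidate list.

    candidates:     list of (token, probability, hidden_state) tuples,
                    assumed to be the k most probable next tokens.
    context_states: hidden states of the tokens generated so far.
    alpha:          weight of the degeneration penalty; alpha = 0
                    reduces to greedy selection among the candidates.
    """
    best_token, best_score = None, -math.inf
    for token, prob, hidden in candidates:
        # Penalty: similarity to the most similar previous token.
        penalty = max(cosine(hidden, h) for h in context_states)
        score = (1 - alpha) * prob - alpha * penalty
        if score > best_score:
            best_token, best_score = token, score
    return best_token, best_score


# Toy data (made up for illustration only).
context_states = [[1.0, 0.0], [0.8, 0.6]]
candidates = [
    ("the", 0.50, [1.0, 0.0]),  # most probable, but duplicates a context state
    ("cat", 0.30, [0.0, 1.0]),  # less probable, least similar to the context
    ("sat", 0.20, [0.6, 0.8]),  # close to the second context state
]

greedy_token, _ = contrastive_search_step(candidates, context_states, alpha=0.0)
contrastive_token, contrastive_score = contrastive_search_step(
    candidates, context_states, alpha=0.6
)
```

With the penalty switched off, the most probable candidate is chosen; with the penalty on, the choice shifts to a candidate whose representation is less similar to the context, which is the intended trade-off between coherence (probability) and diversity (dissimilarity). In a real model the hidden states would come from the language model itself, which is why the paper pairs the decoding method with a training objective that makes the representation space more discriminative.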

Author Information

Yixuan Su (University of Cambridge)
Tian Lan (Tencent Technology (Shenzhen) Co., Ltd.)
Yan Wang (Tencent AI Lab)

Yan Wang is a senior researcher at the Natural Language Processing Center, Tencent AI Lab. His research interests include dialogue systems, text generation, and question answering. He has published over 30 research papers in leading conferences and journals, such as ACL, EMNLP, NAACL, AAAI, and TASLP. He received an outstanding paper award at ACL 2021 for one of his works on retrieval-augmented text generation. He has served on the program committees of conferences including ACL, EMNLP, NAACL, and AAAI, and was selected as a session chair at ACL 2021 and as a senior program committee member for AAAI 2022.

Dani Yogatama (Google DeepMind)
Lingpeng Kong (Department of Computer Science, The University of Hong Kong)
Nigel Collier (University of Cambridge)
