Improving the Robustness of Conditional Language Models by Detecting and Removing Input Noise
Kundan Krishna · Yao Zhao · Jie Ren · Balaji Lakshminarayanan · Jiaming Luo · Mohammad Saleh · Peter Liu

The evaluation of conditional language modeling tasks such as abstractive summarization typically uses test data drawn from the same distribution as the training data. In real-world practice, documents to be summarized may contain input noise caused by text extraction artifacts or data pipeline bugs. The robustness of model performance under the distribution shift caused by such noise is relatively under-studied. We present a large empirical study quantifying the sometimes severe loss in performance (up to 12 ROUGE-1 points) from different types of input noise for a range of datasets and model sizes. We then propose a lightweight method for detecting and removing such noise in the input during model inference without requiring any extra training or auxiliary models. The method effectively mitigates the loss in performance, recovering up to 11 ROUGE-1 points.

Author Information

Kundan Krishna (Carnegie Mellon University)
Yao Zhao (Google)
Jie Ren (Google Inc.)
Balaji Lakshminarayanan (Google Brain)

Balaji Lakshminarayanan is a research scientist at Google Brain. Prior to that, he was a research scientist at DeepMind. He received his PhD from the Gatsby Unit, University College London, where he worked with Yee Whye Teh. His recent research has focused on probabilistic deep learning, specifically uncertainty estimation, out-of-distribution robustness, and deep generative models. Notable contributions relevant to the tutorial include developing state-of-the-art methods for calibration under dataset shift (such as deep ensembles and AugMix) and showing that deep generative models do not always know what they don't know. He has co-organized several workshops on "Uncertainty and Robustness in deep learning" and served as Area Chair for NeurIPS, ICML, ICLR, and AISTATS.

Jiaming Luo (Google)
Mohammad Saleh (Google)
Peter Liu (Google Research, Brain)
