
Perturbation Augmentation for Fairer NLP
Rebecca Qian · Candace Ross · Jude Fernandes · Eric Michael Smith · Douwe Kiela · Adina Williams
Event URL: https://openreview.net/forum?id=8KRDmIlRF6

Unwanted and often harmful social biases are becoming ever more salient in NLP research, affecting both models and datasets. In this work, we ask whether training on demographically perturbed data leads to fairer language models. We collect a large dataset of human-annotated text perturbations and train a neural perturbation model, which we show outperforms heuristic alternatives. We find that (i) language models (LMs) pre-trained on demographically perturbed corpora are typically fairer, (ii) LMs fine-tuned on perturbed GLUE datasets exhibit less demographic bias on downstream tasks, and (iii) fairness improvements do not come at the expense of performance on downstream tasks. Lastly, we discuss outstanding questions about how best to evaluate the (un)fairness of large language models. We hope that this exploration of neural demographic perturbation will help drive more improvement towards fairer NLP.

Author Information

Rebecca Qian (Facebook)
Candace Ross (Facebook AI)
Jude Fernandes (FAIR)
Eric Michael Smith (Meta AI)

Eric is a research engineer at Meta AI, focusing on algorithmic bias in language models and chatbot evaluation. Prior to Meta AI, Eric was a machine learning engineer at Blue Apron, creating and maintaining demand forecast models. Eric was a fellow of Insight Data Science and holds a doctorate in physics from Princeton University for biophysics research in precision measurements of gene expression in the fruit fly embryo.

Douwe Kiela (Hugging Face)
Adina Williams (Facebook AI Research)
