Unwanted and often harmful social biases are becoming ever more salient in NLP research, affecting both models and datasets. In this work, we ask whether training on demographically perturbed data leads to fairer language models. We collect a large dataset of human-annotated text perturbations and train a neural perturbation model, which we show outperforms heuristic alternatives. We find that (i) language models (LMs) pre-trained on demographically perturbed corpora are typically fairer, (ii) LMs finetuned on perturbed GLUE datasets exhibit less demographic bias on downstream tasks, and (iii) fairness improvements do not come at the expense of performance on downstream tasks. Lastly, we discuss outstanding questions about how best to evaluate the (un)fairness of large language models. We hope that this exploration of neural demographic perturbation will help drive further progress toward fairer NLP.
Author Information
Rebecca Qian (Facebook)
Candace Ross (Facebook AI)
Jude Fernandes (FAIR)
Eric Michael Smith (Meta AI)
Eric is a research engineer at Meta AI, focusing on algorithmic bias in language models and chatbot evaluation. Prior to Meta AI, Eric was a machine learning engineer at Blue Apron, creating and maintaining demand forecast models. Eric was a fellow of Insight Data Science and holds a doctorate in physics from Princeton University for biophysics research in precision measurements of gene expression in the fruit fly embryo.
Douwe Kiela (Hugging Face)
Adina Williams (Facebook AI Research)
More from the Same Authors
- 2022 Workshop: Human Evaluation of Generative Models
  Divyansh Kaushik · Jennifer Hsia · Jessica Huynh · Yonadav Shavit · Samuel Bowman · Ting-Hao Huang · Douwe Kiela · Zachary Lipton · Eric Michael Smith
- 2021 : Facebook - Data Centric Infrastructure
  Douwe Kiela
- 2021 Demonstration: Demonstrations 4
  Douwe Kiela · Barbara Caputo · Marco Ciccone
- 2021 Competition: Competition Track Day 4: Overviews + Breakout Sessions
  Douwe Kiela · Marco Ciccone · Barbara Caputo
- 2021 Poster: True Few-Shot Learning with Language Models
  Ethan Perez · Douwe Kiela · Kyunghyun Cho
- 2021 : Invited talk - Douwe Kiela
  Douwe Kiela
- 2021 Competition: Competition Track Day 3: Overviews + Breakout Sessions
  Douwe Kiela · Marco Ciccone · Barbara Caputo
- 2021 Demonstration: Demonstrations 3
  Douwe Kiela · Barbara Caputo · Marco Ciccone
- 2021 Poster: Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking
  Zhiyi Ma · Kawin Ethayarajh · Tristan Thrush · Somya Jain · Ledell Wu · Robin Jia · Christopher Potts · Adina Williams · Douwe Kiela
- 2021 Demonstration: Demonstrations 2
  Douwe Kiela · Barbara Caputo · Marco Ciccone
- 2021 : Intro
  Douwe Kiela
- 2021 Competition: Competition Track Day 2: Overviews + Breakout Sessions
  Douwe Kiela · Marco Ciccone · Barbara Caputo
- 2021 Competition: Competition Track Day 1: Overviews + Breakout Sessions
  Douwe Kiela · Marco Ciccone · Barbara Caputo
- 2021 : Introduction Competition Day 1
  Douwe Kiela
- 2021 Poster: Human-Adversarial Visual Question Answering
  Sasha Sheng · Amanpreet Singh · Vedanuj Goswami · Jose Magana · Tristan Thrush · Wojciech Galuba · Devi Parikh · Douwe Kiela
- 2021 Demonstration: Demonstrations 1
  Douwe Kiela · Barbara Caputo · Marco Ciccone
- 2021 : Introduction
  Douwe Kiela
- 2020 : Q & A and Panel Session with Dan Weld, Kristen Grauman, Scott Yih, Emma Brunskill, and Alex Ratner
  Kristen Grauman · Wen-tau Yih · Alexander Ratner · Emma Brunskill · Douwe Kiela · Daniel S. Weld
- 2020 Workshop: HAMLETS: Human And Model in the Loop Evaluation and Training Strategies
  Divyansh Kaushik · Bhargavi Paranjape · Forough Arabshahi · Yanai Elazar · Yixin Nie · Max Bartolo · Polina Kirichenko · Pontus Lars Erik Saito Stenetorp · Mohit Bansal · Zachary Lipton · Douwe Kiela
- 2020 : Opening Remarks
  Divyansh Kaushik · Bhargavi Paranjape · Douwe Kiela
- 2020 : The Hateful Memes Challenge: Live award ceremony and winner presentations
  Douwe Kiela
- 2020 : The Hateful Memes Challenge: Competition Overview
  Douwe Kiela
- 2020 Poster: The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
  Douwe Kiela · Hamed Firooz · Aravind Mohan · Vedanuj Goswami · Amanpreet Singh · Pratik Ringshia · Davide Testuggine
- 2020 Poster: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  Patrick Lewis · Ethan Perez · Aleksandra Piktus · Fabio Petroni · Vladimir Karpukhin · Naman Goyal · Heinrich Küttler · Mike Lewis · Wen-tau Yih · Tim Rocktäschel · Sebastian Riedel · Douwe Kiela
- 2020 Poster: Learning Optimal Representations with the Decodable Information Bottleneck
  Yann Dubois · Douwe Kiela · David Schwab · Ramakrishna Vedantam
- 2020 Spotlight: Learning Optimal Representations with the Decodable Information Bottleneck
  Yann Dubois · Douwe Kiela · David Schwab · Ramakrishna Vedantam
- 2019 : Audrey Durand, Douwe Kiela, Kamalika Chaudhuri moderated by Yann Dauphin
  Audrey Durand · Kamalika Chaudhuri · Yann Dauphin · Orhan Firat · Dilan Gorur · Douwe Kiela
- 2019 : Douwe Kiela - Benchmarking Progress in AI: A New Benchmark for Natural Language Understanding
  Douwe Kiela
- 2019 Workshop: Emergent Communication: Towards Natural Language
  Abhinav Gupta · Michael Noukhovitch · Cinjon Resnick · Natasha Jaques · Angelos Filos · Marie Ossenkopf · Angeliki Lazaridou · Jakob Foerster · Ryan Lowe · Douwe Kiela · Kyunghyun Cho
- 2019 : Poster session
  Candace Ross · Yassine Mrabet · Sanjay Subramanian · Geoffrey Cideron · Jesse Mu · Suvrat Bhooshan · Eda Okur Kavil · Jean-Benoit Delbrouck · Yen-Ling Kuo · Nicolas Lair · Gabriel Ilharco · T.S. Jayram · Alba María Herrera Palacio · Chihiro Fujiyama · Olivier Tieleman · Anna Potapenko · Guan-Lin Chao · Thomas Sutter · Olga Kovaleva · Farley Lai · Xin Wang · Vasu Sharma · Catalina Cangea · Nikhil Krishnaswamy · Yuta Tsuboi · Alexander Kuhnle · Khanh Nguyen · Dian Yu · Homagni Saha · Jiannan Xiang · Vijay Venkataraman · Ankita Kalra · Ning Xie · Derek Doran · Travis Goodwin · Asim Kadav · Shabnam Daghaghi · Jason Baldridge · Jialin Wu · Jingxiang Lin · Unnat Jain
- 2019 Poster: Hyperbolic Graph Neural Networks
  Qi Liu · Maximilian Nickel · Douwe Kiela
- 2018 Workshop: Emergent Communication Workshop
  Jakob Foerster · Angeliki Lazaridou · Ryan Lowe · Igor Mordatch · Douwe Kiela · Kyunghyun Cho
- 2018 : Panel Discussion
  Antonio Torralba · Douwe Kiela · Barbara Landau · Angeliki Lazaridou · Joyce Chai · Christopher Manning · Stevan Harnad · Roozbeh Mottaghi
- 2018 : Douwe Kiela - Learning Multimodal Embeddings
  Douwe Kiela
- 2017 Workshop: Emergent Communication Workshop
  Jakob Foerster · Igor Mordatch · Angeliki Lazaridou · Kyunghyun Cho · Douwe Kiela · Pieter Abbeel
- 2017 Poster: Poincaré Embeddings for Learning Hierarchical Representations
  Maximilian Nickel · Douwe Kiela
- 2017 Spotlight: Poincaré Embeddings for Learning Hierarchical Representations
  Maximilian Nickel · Douwe Kiela