Deep learning (DL) systems have been gaining popularity in critical tasks such as credit evaluation and crime prediction, where fairness is essential. Recent work shows that DL software implementations introduce variance: identical DL training runs (i.e., identical network, data, configuration, software, and hardware) with a fixed seed produce different models. Such variance could cause DL models to violate fairness compliance laws, resulting in negative social impact. In this paper, we conduct the first empirical study to quantify the impact of software implementation on the fairness of DL systems and its variance. Our study of 22 mitigation techniques and five baselines reveals up to 12.6% fairness variance across identical training runs with identical seeds. In addition, most debiasing algorithms negatively affect the model in other ways, such as reducing model accuracy, increasing fairness variance, or increasing accuracy variance. Our literature survey shows that while fairness is gaining popularity in artificial intelligence (AI) related conferences, only 34.4% of the surveyed papers use multiple identical training runs to evaluate their approach, raising concerns about the validity of their results. We call for better fairness evaluation and testing protocols to improve the fairness and reduce the fairness variance of DL systems, as well as the validity and reproducibility of DL research at large.
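The evaluation protocol the abstract advocates is concrete enough to sketch. Below is a minimal, hypothetical Python sketch (not the paper's artifact): fix every seed, retrain the identical job several times, and report the spread of a fairness metric across runs. `train_model` is an illustrative stand-in for one full training run, and demographic parity difference is just one of many possible group fairness metrics.

```python
# A minimal sketch (not the paper's artifact) of the multi-run evaluation
# protocol: repeat the identical fixed-seed training job and report the
# spread of a fairness metric across runs.
import numpy as np
import torch

def fix_seeds(seed: int) -> None:
    # Common seed-fixing recipe; note that fixed seeds alone do not
    # guarantee identical models, since some GPU kernels are nondeterministic.
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

def demographic_parity_diff(y_pred: np.ndarray, group: np.ndarray) -> float:
    """|P(y_hat = 1 | group = 0) - P(y_hat = 1 | group = 1)|."""
    return abs(float(y_pred[group == 0].mean()) -
               float(y_pred[group == 1].mean()))

def fairness_spread(train_model, x, y, group, runs: int = 16) -> float:
    # Train the *same* job `runs` times; any spread in the metric comes
    # only from implementation-level nondeterminism.
    scores = []
    for _ in range(runs):
        fix_seeds(42)                # identical seed every run
        y_pred = train_model(x, y)   # identical network, data, and config
        scores.append(demographic_parity_diff(y_pred, group))
    return max(scores) - min(scores)
```

Even with the seed-fixing recipe above, nondeterministic GPU kernels (e.g., cuDNN algorithm selection and floating-point atomics) can make otherwise identical runs diverge, which is exactly the implementation-level variance the study measures.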
Author Information
Shangshu Qian (Purdue University)
Viet Hung Pham (University of Waterloo)
Thibaud Lutellier (University of Waterloo)
Zeou Hu (University of Waterloo)
Jungwon Kim (Purdue University)
Lin Tan (Purdue University)
Yaoliang Yu (University of Waterloo)
Jiahao Chen (J.P. Morgan AI Research)
Sameena Shah (J.P. Morgan Chase)
More from the Same Authors
- 2022: Can Querying for Bias Leak Protected Attributes? Achieving Privacy With Smooth Sensitivity
  Faisal Hamman · Jiahao Chen · Sanghamitra Dutta
- 2021 Poster: Demystifying and Generalizing BinaryConnect
  Tim Dockhorn · Yaoliang Yu · Eyyüb Sari · Mahdi Zolnouri · Vahid Partovi Nia
- 2021 Poster: Quantifying and Improving Transferability in Domain Generalization
  Guojun Zhang · Han Zhao · Yaoliang Yu · Pascal Poupart
- 2021 Poster: S³: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks
  Xinlin Li · Bang Liu · Yaoliang Yu · Wulong Liu · Chunjing Xu · Vahid Partovi Nia
- 2020: Invited Talk 9: Building Compliant Models: Fair Feature Selection with Multiobjective Monte Carlo Tree Search
  Jiahao Chen
- 2019 Workshop: Robust AI in Financial Services: Data, Fairness, Explainability, Trustworthiness, and Privacy
  Alina Oprea · Avigdor Gal · Eren K. · Isabelle Moulinier · Jiahao Chen · Manuela Veloso · Senthil Kumar · Tanveer Faruquie
- 2019: Opening Remarks
  Jiahao Chen · Manuela Veloso · Senthil Kumar · Isabelle Moulinier · Avigdor Gal · Alina Oprea · Tanveer Faruquie · Eren K.
- 2018: Panel: Explainability, Fairness and Human Aspects in Financial Services
  Madeleine Udell · Jiahao Chen · Nitzan Mekel-Bobrov · Manuela Veloso · Jon Kleinberg · Andrea Freeman · Samik Chandarana · Jacob Sisk · Michael McBurnett
- 2018 Workshop: Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy, and Privacy
  Manuela Veloso · Nathan Kallus · Sameena Shah · Senthil Kumar · Isabelle Moulinier · Jiahao Chen · John Paisley
- 2016 Poster: Convex Two-Layer Modeling with Latent Structure
  Vignesh Ganapathiraman · Xinhua Zhang · Yaoliang Yu · Junfeng Wen
- 2014 Poster: Efficient Structured Matrix Rank Minimization
  Adams Wei Yu · Wanli Ma · Yaoliang Yu · Jaime Carbonell · Suvrit Sra