Poster
Do Counterfactually Fair Image Classifiers Satisfy Group Fairness? -- A Theoretical and Empirical Study
Sangwon Jung · Sumin Yu · Sanghyuk Chun · Taesup Moon
West Ballroom A-D #5506
Wed 11 Dec 4:30 p.m. to 7:30 p.m. PST
Abstract:
Algorithmic fairness has been actively studied under several notions, such as counterfactual fairness (CF) and group fairness (GF). The relationship between CF and GF remains an open problem, especially in image classification tasks, because we often cannot collect counterfactual samples for existing images (e.g., a photo of the same person but with a different gender). In this paper, we construct new image datasets for evaluating CF using a high-quality image editing method and careful labeling by human annotators. Our datasets, CelebA-CF and LFW-CF, build upon popular image GF benchmarks; hence, we can evaluate CF and GF simultaneously. We empirically observe that CF does not imply GF in image classification, whereas studies on tabular datasets observed the opposite. We theoretically show that this can happen when there exists a latent attribute $G$ that is correlated with, but not caused by, the sensitive attribute (e.g., males usually have shorter hair than females). Based on this observation, we propose a simple baseline, Counterfactual Knowledge Distillation (CKD), to mitigate the problem. Extensive experimental results on CelebA-CF and LFW-CF demonstrate that CF-achieving models satisfy GF if we successfully reduce the reliance on $G$ (e.g., using CKD). Code and datasets will be publicly available upon acceptance.
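The gap between CF and GF described above can be illustrated with a small toy sketch. It is not the paper's evaluation code: the two metrics (a counterfactual flip rate and a demographic parity gap), the function names, and the toy population are all assumptions chosen for illustration. The classifier relies only on a latent attribute `g` that is correlated with, but not caused by, the sensitive attribute `a`, so flipping `a` leaves `g` and the prediction unchanged.

```python
def counterfactual_flip_rate(preds, cf_preds):
    """Fraction of samples whose prediction changes under the counterfactual.

    0.0 means the classifier is counterfactually fair on this sample set.
    """
    return sum(p != q for p, q in zip(preds, cf_preds)) / len(preds)


def demographic_parity_gap(preds, groups):
    """Absolute difference in positive-prediction rate between groups 0 and 1.

    This is one common group-fairness measure, used here as an assumption.
    """
    rate = {}
    for group in (0, 1):
        members = [p for p, a in zip(preds, groups) if a == group]
        rate[group] = sum(members) / len(members)
    return abs(rate[1] - rate[0])


# Hypothetical toy population: sensitive attribute a, latent attribute g.
# g is correlated with a (20% of group 0 and 80% of group 1 have g = 1),
# but g is not caused by a, so a counterfactual flip of a keeps g fixed.
a = [0] * 10 + [1] * 10
g = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2


def classifier(a_i, g_i):
    """Toy classifier that ignores a and predicts from g alone."""
    return g_i


preds = [classifier(a_i, g_i) for a_i, g_i in zip(a, g)]
cf_preds = [classifier(1 - a_i, g_i) for a_i, g_i in zip(a, g)]

cf_rate = counterfactual_flip_rate(preds, cf_preds)
dp_gap = demographic_parity_gap(preds, a)

print(f"counterfactual flip rate: {cf_rate:.2f}")
print(f"demographic parity gap:   {dp_gap:.2f}")
```

In this toy setting the flip rate is zero while the parity gap is large, which matches the qualitative claim that a CF-achieving model can still violate GF when it relies on $G$. Reducing that reliance, for example by making the prediction independent of `g`, would close the gap here.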