Workshop: Attributing Model Behavior at Scale (ATTRIB)

Better than Balancing: Debiasing through Data Attribution

Saachi Jain · Kimia Hamidieh · Kristian Georgiev · Marzyeh Ghassemi · Aleksander Madry


Spurious correlations in the training data can cause serious problems for machine learning deployment. However, common debiasing approaches which intervene on the training procedure (e.g., by adjusting the loss) can be especially sensitive to regularization and hyperparameter selection. In this paper, we advocate for a data-based perspective on model debiasing by directly targeting the root causes of the bias within the training data itself. Specifically, we leverage data attribution techniques to isolate specific examples that disproportionally drive reliance on the spurious correlation. We find that removing these training examples can efficiently debias the final classifier. Moreover, our method requires no additional hyperparameters, and does not require group annotations for the training data.
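
As a rough illustration of the idea of removing training examples flagged by data attribution, the sketch below ranks training examples by their estimated effect on held-out examples the model currently gets wrong, then drops the most harmful ones. It is not the authors' implementation. The attribution matrix, the function name `select_examples_to_drop`, and the `num_drop` argument are all assumptions made for illustration; the abstract states that the method needs no additional hyperparameters, so `num_drop` is only a stand-in for whatever selection rule the paper uses.

```python
import numpy as np


def select_examples_to_drop(attributions, val_correct, num_drop):
    """Rank training examples by how much they hurt misclassified held-out examples.

    attributions[i, j] is a (hypothetical) estimate of how much training
    example i raises the model's margin on held-out example j, as produced
    by some data attribution method. val_correct[j] says whether held-out
    example j is currently classified correctly. num_drop is an illustrative
    cutoff, not something taken from the paper.
    """
    attributions = np.asarray(attributions, dtype=float)
    val_correct = np.asarray(val_correct, dtype=bool)

    # Total estimated effect of each training example on the failing examples.
    effect_on_errors = attributions[:, ~val_correct].sum(axis=1)

    # The most negative effects come first: these examples push the model
    # toward the wrong answer on the held-out examples it already fails on.
    return np.argsort(effect_on_errors)[:num_drop]


# Toy data: 4 training examples, 3 held-out examples (the last is misclassified).
attributions = [
    [0.5, 0.4, -0.9],
    [0.1, 0.2, 0.3],
    [0.3, 0.1, -0.6],
    [0.0, 0.1, 0.2],
]
val_correct = [True, True, False]

drop = select_examples_to_drop(attributions, val_correct, num_drop=2)

# Boolean mask over the training set; retrain on the examples where it is True.
keep = np.ones(len(attributions), dtype=bool)
keep[drop] = False
```

In this toy setup the two training examples with negative attribution to the misclassified held-out example are dropped, and the classifier would then be retrained on the remaining ones. Note that the sketch uses only per-example correctness on held-out data, not group labels on the training set.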
