
Fair Clustering Using Antidote Data
Anshuman Chhabra · Adish Singla · Prasant Mohapatra

Clustering algorithms are widely used in modern data science applications. This motivates the need to make the outputs of clustering algorithms fair. Traditionally, new fair algorithmic variants of clustering algorithms are developed for specific notions of fairness. However, depending on the application context, different definitions of fairness might need to be employed. As a result, new algorithms and analyses need to be proposed for each combination of clustering algorithm and fairness definition. Additionally, each new algorithm would need to be reimplemented for deployment in a real-world system. Hence, we propose an alternate approach to group-level fairness in center-based clustering inspired by research on data poisoning attacks. We seek to augment the original dataset with a small number of data points, called antidote data. When clustering is undertaken on this new dataset, the output is fair for the chosen clustering algorithm and fairness definition. We formulate this as a general bi-level optimization problem which can accommodate any center-based clustering algorithm and fairness notion. We then categorize approaches for solving this bi-level optimization problem in two different problem settings. Extensive experiments on different clustering algorithms and fairness notions show that our algorithms can achieve desired levels of fairness on many real-world datasets with a very small percentage of antidote data added. We also find that our algorithms achieve lower fairness costs and competitive clustering performance compared to other state-of-the-art fair clustering algorithms.
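The bi-level structure can be made concrete with a minimal sketch: the lower level runs an (unmodified) clustering algorithm on the augmented dataset, and the upper level searches over the coordinates of a few antidote points to reduce a group-fairness cost. Everything below is illustrative, not the paper's method: the `balance_cost` fairness measure, the assumption that fairness is scored on the original points only, the first-k center initialization, and the derivative-free Nelder-Mead search standing in for the paper's solvers are all choices made for this sketch.

```python
import numpy as np
from scipy.optimize import minimize

def kmeans(X, k, iters=25):
    """Deterministic Lloyd's k-means; centers are initialized from the
    first k points (a simplification for reproducibility)."""
    centers = X[:k].astype(float).copy()
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def balance_cost(labels, groups, k):
    """Illustrative group-fairness cost: total deviation of each cluster's
    protected-group fraction from the dataset-wide fraction (0 = balanced)."""
    overall = groups.mean()
    return sum(abs(groups[labels == j].mean() - overall)
               for j in range(k) if (labels == j).any())

def antidote_objective(flat, X, groups, m, k):
    """Upper level: place m antidote points, rerun clustering on the
    augmented data, and score fairness on the original points only
    (an assumption of this sketch)."""
    augmented = np.vstack([X, flat.reshape(m, X.shape[1])])
    labels = kmeans(augmented, k)
    return balance_cost(labels[: len(X)], groups, k)

# Toy data: two well-separated blobs with skewed group composition.
X = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1],
              [5, 5], [5.1, 5], [5, 5.1], [5.1, 5.1]], dtype=float)
groups = np.array([1, 1, 1, 0, 0, 0, 0, 1])  # protected-group membership
m, k = 2, 2
x0 = np.array([2.5, 2.5, 2.6, 2.6])  # initial antidote coordinates, flattened
init_cost = antidote_objective(x0, X, groups, m, k)
# Derivative-free search over antidote placements; the paper's bi-level
# solvers are more sophisticated than this black-box substitute.
res = minimize(antidote_objective, x0, args=(X, groups, m, k),
               method="Nelder-Mead", options={"maxiter": 200})
print(f"fairness cost: {init_cost:.3f} -> {res.fun:.3f}")
```

Because the original clustering algorithm is treated as a black box inside the objective, swapping in a different center-based algorithm or fairness cost requires no change to the optimization loop, which is the portability the abstract argues for.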

Author Information

Anshuman Chhabra (University of California, Davis)

Anshuman Chhabra is a Ph.D. candidate at the University of California, Davis, advised by Prof. Prasant Mohapatra. Prior to that, he completed his B.Eng. in Electronics and Communication Engineering at the University of Delhi, India. His research seeks to improve Machine Learning (ML) models and facilitate their adoption into society by analyzing model robustness along two dimensions: adversarial robustness (adversarial attacks/defenses against models) and social robustness (fair machine learning). His other research interests include designing Machine Learning- and Reinforcement Learning-based debiasing interventions for social media platforms such as YouTube and Twitter. He received the UC Davis Graduate Student Fellowship in 2018, and has held research positions at ESnet, Lawrence Berkeley National Laboratory, USA (2017), the Max Planck Institute for Software Systems, Germany (2020), and the University of Amsterdam, Netherlands (2022).

Adish Singla (MPI-SWS)
Prasant Mohapatra (University of California, Davis)
