Timezone: »

Fair Clustering Under a Bounded Cost
Seyed Esmaeili · Brian Brubach · Aravind Srinivasan · John Dickerson

Thu Dec 09 08:30 AM -- 10:00 AM (PST) @

Clustering is a fundamental unsupervised learning problem where a dataset is partitioned into clusters that consist of nearby points in a metric space. A recent variant, fair clustering, associates a color with each point representing its group membership and requires that each color has (approximately) equal representation in each cluster to satisfy group fairness. In this model, the cost of the clustering objective increases due to enforcing fairness in the algorithm. The relative increase in the cost, the ```````''price of fairness,'' can indeed be unbounded. Therefore, in this paper we propose to treat an upper bound on the clustering objective as a constraint on the clustering problem, and to maximize equality of representation subject to it. We consider two fairness objectives: the group utilitarian objective and the group egalitarian objective, as well as the group leximin objective which generalizes the group egalitarian objective. We derive fundamental lower bounds on the approximation of the utilitarian and egalitarian objectives and introduce algorithms with provable guarantees for them. For the leximin objective we introduce an effective heuristic algorithm. We further derive impossibility results for other natural fairness objectives. We conclude with experimental results on real-world datasets that demonstrate the validity of our algorithms.

Author Information

Seyed Esmaeili (University of Maryland, College Park)
Brian Brubach (Wellesley College)
Aravind Srinivasan (University of Maryland College Park)

Aravind Srinivasan is a Professor with the Department of Computer Science and UMIACS at UMD, College Park. He received his undergraduate degree from the Indian Institute of Technology, Madras, and his Ph.D. from Cornell University. He was a postdoctoral researcher at the Institute for Advanced Study in Princeton and at DIMACS. Srinivasan's research interests are in randomized algorithms, E-commerce, public health, networking, social networks, and combinatorial optimization, as well as in the growing confluence of algorithms, networks, and randomness, in fields including the social Web, machine learning, biology, and energy. He has published more than 115 papers in these areas, in fora including Nature, Journal of the ACM, IEEE/ACM Transactions on Networking, and the SIAM Journal on Computing. He is Editor-in-Chief of the ACM Transactions on Algorithms, Managing Editor for Theory of Computing, and Associate Editor of Networks. Srinivasan is a Fellow of four professional societies: ACM, AAAS, IEEE and EATCS. He received a Distinguished Alumnus Award from his alma mater IIT Madras. He also received the Distinguished Faculty Award from the Board of Visitors of the College of Computing, Mathematical, and Natural Sciences (University of Maryland) in 2016. He received a Data Science Research Award from Adobe, Inc. in 2017.

John Dickerson (University of Maryland)

More from the Same Authors