Distribution Preserving Bayesian Coresets using Set Constraints
Shovik Guha · Rajiv Khanna · Sanmi Koyejo
Event URL: https://openreview.net/forum?id=dQscXLzPcbR

Bayesian coresets have attracted increasing interest as a theoretically sound, scalable approach to Bayesian inference. In brief, a coreset is a (weighted) subsample of a dataset that approximates the original dataset under some metric. Bayesian coresets specifically seek subsamples that approximate the posterior distribution. Unfortunately, existing Bayesian coreset approaches can significantly undersample minority subpopulations, leading to a lack of distributional robustness. As a remedy, this work extends existing Bayesian coresets from enforcing sparsity constraints to enforcing group-wise sparsity constraints. We explore how this approach helps to mitigate distributional vulnerability. We further generalize the group constraints to Bayesian coresets with matroid constraints, which may be of independent interest. We present an optimization analysis of the proposed approach, along with an empirical evaluation on benchmark datasets that supports our claims.
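
To make the contrast between a plain sparsity constraint and a group-wise sparsity constraint concrete, the following is a minimal illustrative sketch. It is not the authors' algorithm: the function names, the toy weights, and the hard-thresholding projection are all assumptions chosen for illustration. It only shows how a single global budget can leave a minority group with no coreset points, while a per-group budget guarantees each group some representation.

```python
import numpy as np

def project_global_sparsity(weights, k):
    """Keep the k largest weights overall and zero the rest (hypothetical helper)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    out = np.zeros_like(weights)
    keep = np.argsort(weights)[-k:]
    out[keep] = weights[keep]
    return out

def project_groupwise_sparsity(weights, groups, k_per_group):
    """Keep the k_per_group largest weights within each group (hypothetical helper)."""
    if k_per_group < 1:
        raise ValueError("k_per_group must be at least 1")
    out = np.zeros_like(weights)
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)            # indices of points in group g
        keep = idx[np.argsort(weights[idx])[-k_per_group:]]
        out[keep] = weights[keep]
    return out

# Toy data (assumed): four majority points with large weights, two minority points.
weights = np.array([0.9, 0.8, 0.7, 0.6, 0.2, 0.1])
groups = np.array([0, 0, 0, 0, 1, 1])

global_w = project_global_sparsity(weights, k=2)
group_w = project_groupwise_sparsity(weights, groups, k_per_group=1)

print("global budget of 2:   ", global_w)
print("one point per group:  ", group_w)
```

Both projections retain two points in this toy example, but only the group-wise version keeps a point from the minority group (group 1).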

Author Information

Shovik Guha (University of Illinois at Urbana-Champaign)
Rajiv Khanna (University of California, Berkeley)
Sanmi Koyejo (University of Illinois at Urbana-Champaign & Google Research)
Sanmi Koyejo

Sanmi Koyejo is an Assistant Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign and a research scientist at Google AI in Accra. Koyejo's research interests are in developing the principles and practice of adaptive and robust machine learning. Additionally, Koyejo focuses on applications to biomedical imaging and neuroscience. Koyejo co-founded the Black in AI organization and currently serves on its board.
