Timezone: »

Finding Exemplars from Pairwise Dissimilarities via Simultaneous Sparse Recovery
Ehsan Elhamifar · Guillermo Sapiro · René Vidal

Tue Dec 04 07:00 PM -- 12:00 AM (PST) @ Harrah’s Special Events Center 2nd Floor #None

Given pairwise dissimilarities between data points, we consider the problem of finding a subset of data points called representatives or exemplars that can efficiently describe the data collection. We formulate the problem as a row-sparsity regularized trace minimization problem which can be solved efficiently using convex programming. The solution of the proposed optimization program finds the representatives and the probability that each data point is associated to each one of the representatives. We obtain the range of the regularization parameter for which the solution of the proposed optimization program changes from selecting one representative to selecting all data points as the representatives. When data points are distributed around multiple clusters according to the dissimilarities, we show that the data in each cluster select only representatives from that cluster. Unlike metric-based methods, our algorithm does not require that the pairwise dissimilarities be metrics and can be applied to dissimilarities that are asymmetric or violate the triangle inequality. We demonstrate the effectiveness of the proposed algorithm on synthetic data as well as real-world datasets of images and text.

Author Information

Ehsan Elhamifar (UC Berkeley)
Guillermo Sapiro (Duke University)
René Vidal (Mathematical Institute for Data Science Johns Hopkins University)

More from the Same Authors