NeurIPS Poster Coresets for Archetypal Analysis

Poster

Coresets for Archetypal Analysis

Sebastian Mair · Ulf Brefeld

East Exhibition Hall B + C #55

Keywords: [ Matrix and Tensor Factorization ] [ Algorithms -> Large Scale Learning; Applications ] [ Unsupervised Learning ] [ Algorithms ]


Abstract:

Archetypal analysis represents instances as linear mixtures of prototypes (the archetypes) that lie on the boundary of the convex hull of the data. Archetypes are thus often more interpretable than factors computed by other matrix factorization techniques. However, this interpretability comes at a high computational cost due to additional convexity-preserving constraints. In this paper, we propose efficient coresets for archetypal analysis. Theoretical guarantees are derived by showing that the quantization error of k-means upper bounds the error of archetypal analysis; the computation of a provable absolute-coreset can be performed in only two passes over the data. Empirically, we show that the coresets lead to improved performance on several data sets.
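To make the two-pass idea concrete, the following is a minimal sketch of how a weighted coreset might be sampled. It is not taken from the paper. It assumes that points are drawn with probability proportional to their squared distance to the data mean, which the abstract does not state. The function name `two_pass_coreset` and its parameters are hypothetical.

```python
import random


def two_pass_coreset(points, m, seed=0):
    """Sample a weighted subset of `points` using two passes over the data.

    Assumption (not stated in the abstract): a point is drawn with
    probability proportional to its squared distance to the data mean.
    """
    n = len(points)
    dim = len(points[0])

    # Pass 1: accumulate the mean of the data.
    mean = [0.0] * dim
    for p in points:
        for j in range(dim):
            mean[j] += p[j] / n

    # Pass 2: squared Euclidean distance of every point to the mean.
    sq_dists = [sum((p[j] - mean[j]) ** 2 for j in range(dim)) for p in points]
    total = sum(sq_dists)
    if total == 0.0:
        raise ValueError("all points coincide, so no sampling distribution exists")
    probs = [d / total for d in sq_dists]

    # Draw m indices with replacement. The weight 1 / (m * q) makes
    # weighted sums over the sample unbiased estimates of full-data sums.
    rng = random.Random(seed)
    indices = rng.choices(range(n), weights=probs, k=m)
    weights = [1.0 / (m * probs[i]) for i in indices]
    return [points[i] for i in indices], weights


# Example: reduce 1000 two-dimensional points to 50 weighted points.
_rng = random.Random(42)
data = [[_rng.gauss(0.0, 1.0), _rng.gauss(0.0, 1.0)] for _ in range(1000)]
subset, subset_weights = two_pass_coreset(data, m=50)
```

Under this assumed distribution, points far from the mean are sampled more often. Such points are also more likely to lie near the boundary of the convex hull, where the archetypes are located. A weighted archetypal analysis solver would then be run on `subset` and `subset_weights` instead of the full data.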
