Skip to yearly menu bar Skip to main content


Poster

Unsupervised Feature Selection for the k-means Clustering Problem

Christos Boutsidis · Michael W Mahoney · Petros Drineas


Abstract: We present a novel feature selection algorithm for the k-means clustering problem. Our algorithm is randomized and, assuming an accuracy parameter ϵ(0,1), selects and appropriately rescales in an unsupervised manner Θ(klog(k/ϵ)/ϵ2) features from a dataset of arbitrary dimensions. We prove that, if we run any γ-approximate k-means algorithm (γ1) on the features selected using our method, we can find a (1+(1+ϵ)γ)-approximate partition with high probability.

Live content is unavailable. Log in and register to view live content