Timezone: »

Randomized Sketches for Clustering: Fast and Optimal Kernel $k$-Means
Rong Yin · Yong Liu · Weiping Wang · Dan Meng

Wed Nov 30 09:00 AM -- 11:00 AM (PST) @ Hall J #538
Kernel $k$-means is arguably one of the most common approaches to clustering. In this paper, we investigate the efficiency of kernel $k$-means combined with randomized sketches in terms of both statistical analysis and computational requirements. More precisely, we propose a unified randomized sketches framework to kernel $k$-means and investigate its excess risk bounds, obtaining the state-of-the-art risk bound with only a fraction of computations. Indeed, we prove that it suffices to choose the sketch dimension $\Omega(\sqrt{n})$ to obtain the same accuracy of exact kernel $k$-means with greatly reducing the computational costs, for sub-Gaussian sketches, the randomized orthogonal system (ROS) sketches, and Nystr\"{o}m kernel $k$-means, where $n$ is the number of samples. To the best of our knowledge, this is the first result of this kind for unsupervised learning. Finally, the numerical experiments on simulated data and real-world datasets validate our theoretical analysis.

Author Information

Rong Yin (Institute of Information Engineering, Chinese Academy of Sciences)
Yong Liu (Renmin University of China)
Weiping Wang (Institute of Information Engineering, CAS, China)
Dan Meng (Institute of information engineering, Chinese Academy of Sciences)

More from the Same Authors