We present a new framework for semi-supervised learning with sparse eigenfunction bases of kernel matrices. It turns out that when the \emph{cluster assumption} holds, that is, when the high-density regions are sufficiently separated by low-density valleys, each high-density area corresponds to a unique representative eigenvector. Linear combinations of such eigenvectors (or, more precisely, of their Nyström extensions) provide good candidates for classification functions. By first choosing an appropriate basis of these eigenvectors from unlabeled data and then using labeled data with Lasso to select a classifier in the span of these eigenvectors, we obtain a classifier that has a very sparse representation in this basis. Importantly, the sparsity arises naturally from the cluster assumption. Experimental results on a number of real-world datasets show that our method is competitive with state-of-the-art semi-supervised learning algorithms and outperforms the natural baseline algorithm (Lasso in the Kernel PCA basis).
Author Information
Kaushik Sinha (The Ohio State University)
Mikhail Belkin (The Ohio State University)
More from the Same Authors
-
2021 Poster: Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures »
Yuan Cao · Quanquan Gu · Mikhail Belkin -
2021 Poster: Multiple Descent: Design Your Own Generalization Curve »
Lin Chen · Yifei Min · Mikhail Belkin · Amin Karbasi -
2018 Poster: Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate »
Mikhail Belkin · Daniel Hsu · Partha P Mitra -
2017 Poster: Diving into the shallows: a computational perspective on large-scale shallow learning »
Siyuan Ma · Mikhail Belkin -
2017 Spotlight: Diving into the shallows: a computational perspective on large-scale shallow learning »
Siyuan Ma · Mikhail Belkin -
2016 Poster: Graphons, mergeons, and so on! »
Justin Eldridge · Mikhail Belkin · Yusu Wang -
2016 Oral: Graphons, mergeons, and so on! »
Justin Eldridge · Mikhail Belkin · Yusu Wang -
2016 Poster: Clustering with Bregman Divergences: an Asymptotic Analysis »
Chaoyue Liu · Mikhail Belkin -
2015 Poster: A Pseudo-Euclidean Iteration for Optimal Recovery in Noisy ICA »
James R Voss · Mikhail Belkin · Luis Rademacher -
2014 Poster: Learning with Fredholm Kernels »
Qichao Que · Mikhail Belkin · Yusu Wang -
2013 Workshop: Modern Nonparametric Methods in Machine Learning »
Arthur Gretton · Mladen Kolar · Samory Kpotufe · John Lafferty · Han Liu · Bernhard Schölkopf · Alexander Smola · Rob Nowak · Mikhail Belkin · Lorenzo Rosasco · Peter Bickel · Yue Zhao -
2013 Poster: Inverse Density as an Inverse Problem: the Fredholm Equation Approach »
Qichao Que · Mikhail Belkin -
2013 Poster: Fast Algorithms for Gaussian Noise Invariant Independent Component Analysis »
James R Voss · Luis Rademacher · Mikhail Belkin -
2013 Spotlight: Inverse Density as an Inverse Problem: the Fredholm Equation Approach »
Qichao Que · Mikhail Belkin -
2012 Poster: Near-optimal Differentially Private Principal Components »
Kamalika Chaudhuri · Anand D Sarwate · Kaushik Sinha -
2011 Poster: Data Skeletonization via Reeb Graphs »
Xiaoyin Ge · Issam I Safa · Mikhail Belkin · Yusu Wang -
2007 Spotlight: The Value of Labeled and Unlabeled Examples when the Model is Imperfect »
Kaushik Sinha · Mikhail Belkin -
2007 Poster: The Value of Labeled and Unlabeled Examples when the Model is Imperfect »
Kaushik Sinha · Mikhail Belkin -
2006 Poster: On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts »
Hariharan Narayanan · Mikhail Belkin · Partha Niyogi -
2006 Poster: Convergence of Laplacian Eigenmaps »
Mikhail Belkin · Partha Niyogi