Timezone: »
Sketching is a powerful dimensionality reduction tool for accelerating statistical learning algorithms. However, its applicability has been limited to a certain extent since the crucial ingredient, the so-called oblivious subspace embedding, can only be applied to data spaces with an explicit representation as the column span or row span of a matrix, while in many settings learning is done in a high-dimensional space implicitly defined by the data matrix via a kernel transformation. We propose the first {\em fast} oblivious subspace embeddings that are able to embed a space induced by a non-linear kernel {\em without} explicitly mapping the data to the high-dimensional space. In particular, we propose an embedding for mappings induced by the polynomial kernel. Using the subspace embeddings, we obtain the fastest known algorithms for computing an implicit low rank approximation of the higher-dimension mapping of the data matrix, and for computing an approximate kernel PCA of the data, as well as doing approximate kernel principal component regression.
Author Information
Haim Avron (Tel Aviv University)
Huy Nguyen (Northeastern University)
David Woodruff (IBM Research)
More from the Same Authors
-
2023 Poster: Near Optimal Reconstruction of Spherical Harmonic Expansions »
Amir Zandieh · Insu Han · Haim Avron -
2017 Poster: Decomposable Submodular Function Minimization: Discrete and Continuous »
Alina Ene · Huy Nguyen · László A. Végh -
2017 Spotlight: Decomposable Submodular Function Minimization: Discrete and Continuous »
Alina Ene · Huy Nguyen · László A. Végh -
2016 Poster: Sublinear Time Orthogonal Tensor Decomposition »
Zhao Song · David Woodruff · Huan Zhang -
2016 Poster: Communication-Optimal Distributed Clustering »
Jiecao Chen · He Sun · David Woodruff · Qin Zhang -
2015 : Sketching as a tool for numerical linear algebra »
David Woodruff -
2014 Poster: Improved Distributed Principal Component Analysis »
Yingyu Liang · Maria-Florina F Balcan · Vandana Kanchanapally · David Woodruff -
2014 Poster: On Communication Cost of Distributed Statistical Estimation and Dimensionality »
Ankit Garg · Tengyu Ma · Huy Nguyen -
2014 Poster: Low Rank Approximation Lower Bounds in Row-Update Streams »
David Woodruff -
2014 Oral: On Communication Cost of Distributed Statistical Estimation and Dimensionality »
Ankit Garg · Tengyu Ma · Huy Nguyen -
2013 Workshop: Large Scale Matrix Analysis and Inference »
Reza Zadeh · Gunnar Carlsson · Michael Mahoney · Manfred K. Warmuth · Wouter M Koolen · Nati Srebro · Satyen Kale · Malik Magdon-Ismail · Ashish Goel · Matei A Zaharia · David Woodruff · Ioannis Koutis · Benjamin Recht -
2013 Poster: Sketching Structured Matrices for Faster Nonlinear Regression »
Haim Avron · Vikas Sindhwani · David Woodruff