Sketching and stochastic gradient methods are arguably the most common techniques to derive efficient large-scale learning algorithms. In this paper, we investigate their application in the context of nonparametric statistical learning. More precisely, we study the estimator defined by stochastic gradient with mini-batches and random features. The latter can be seen as a form of nonlinear sketching and can be used to define approximate kernel methods. The considered estimator is not explicitly penalized or constrained, and regularization is implicit. Indeed, our study highlights how different parameters, such as the number of features, the number of iterations, the step-size, and the mini-batch size, control the learning properties of the solutions. We do this by deriving optimal finite sample bounds under standard assumptions. The obtained results are corroborated and illustrated by numerical experiments.
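As a rough illustration of the kind of estimator the abstract describes, the following is a minimal sketch of mini-batch stochastic gradient on the squared loss over random features, with no explicit penalty. It is not the authors' code: the choice of random Fourier features for a Gaussian kernel, sampling each mini-batch uniformly with replacement, and all names (`random_fourier_features`, `sgd_random_features`, `M`, `sigma`, `step`, `batch`, `T`) are assumptions made for this example.

```python
import numpy as np

def random_fourier_features(X, W, b):
    """Map inputs to M random cosine features (assumed: Gaussian-kernel RFF)."""
    M = W.shape[1]
    return np.sqrt(2.0 / M) * np.cos(X @ W + b)

def sgd_random_features(X, y, M=200, sigma=1.0, step=0.5, batch=10, T=5000, seed=0):
    """Mini-batch SGD on the squared loss in random-feature space.

    No penalty term is used: the number of features M, the number of
    iterations T, the step-size and the mini-batch size are the only
    quantities that control the fit.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Random frequencies and phases defining the feature map (hypothetical choice).
    W = rng.normal(scale=1.0 / sigma, size=(d, M))
    b = rng.uniform(0.0, 2.0 * np.pi, size=M)
    w = np.zeros(M)
    for _ in range(T):
        # Assumption: each mini-batch is drawn uniformly with replacement.
        idx = rng.integers(0, n, size=batch)
        Z = random_fourier_features(X[idx], W, b)
        # Gradient of the averaged squared loss over the mini-batch.
        grad = Z.T @ (Z @ w - y[idx]) / batch
        w -= step * grad
    # Return a predictor that reuses the same random feature map.
    return lambda X_new: random_fourier_features(X_new, W, b) @ w
```

Calling `predict = sgd_random_features(X, y)` on an `(n, d)` input array and a length-`n` target vector returns a function that can be applied to new inputs. Varying `M`, `T`, `step`, and `batch` in this sketch is one way to observe the implicit regularization the abstract refers to.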
Author Information
Luigi Carratino (University of Genoa)
Alessandro Rudi (INRIA, Ecole Normale Superieure)
Lorenzo Rosasco (University of Genova - MIT - IIT)
Related Events (a corresponding poster, oral, or spotlight)
-
2018 Poster: Learning with SGD and Random Features »
Thu. Dec 6th 03:45 -- 05:45 PM Room 517 AB #127
More from the Same Authors
-
2021 Spotlight: Mixability made efficient: Fast online multiclass logistic regression »
Rémi Jézéquel · Pierre Gaillard · Alessandro Rudi -
2021 Spotlight: Beyond Tikhonov: faster learning with self-concordant losses, via iterative regularization »
Gaspard Beugnot · Julien Mairal · Alessandro Rudi -
2022 : Scalable Causal Discovery with Score Matching »
Francesco Montagna · Nicoletta Noceti · Lorenzo Rosasco · Kun Zhang · Francesco Locatello -
2022 Poster: Learning Dynamical Systems via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces »
Vladimir Kostic · Pietro Novelli · Andreas Maurer · Carlo Ciliberto · Lorenzo Rosasco · Massimiliano Pontil -
2022 Poster: Active Labeling: Streaming Stochastic Gradients »
Vivien Cabannes · Francis Bach · Vianney Perchet · Alessandro Rudi -
2021 Poster: ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions »
Luigi Carratino · Stefano Vigogna · Daniele Calandriello · Lorenzo Rosasco -
2021 Poster: Mixability made efficient: Fast online multiclass logistic regression »
Rémi Jézéquel · Pierre Gaillard · Alessandro Rudi -
2021 Poster: Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning »
Vivien Cabannes · Loucas Pillaud-Vivien · Francis Bach · Alessandro Rudi -
2021 Poster: PSD Representations for Effective Probability Models »
Alessandro Rudi · Carlo Ciliberto -
2021 Poster: Beyond Tikhonov: faster learning with self-concordant losses, via iterative regularization »
Gaspard Beugnot · Julien Mairal · Alessandro Rudi -
2020 Poster: Non-parametric Models for Non-negative Functions »
Ulysse Marteau-Ferey · Francis Bach · Alessandro Rudi -
2020 Spotlight: Non-parametric Models for Non-negative Functions »
Ulysse Marteau-Ferey · Francis Bach · Alessandro Rudi -
2020 Poster: Kernel Methods Through the Roof: Handling Billions of Points Efficiently »
Giacomo Meanti · Luigi Carratino · Lorenzo Rosasco · Alessandro Rudi -
2020 Oral: Kernel Methods Through the Roof: Handling Billions of Points Efficiently »
Giacomo Meanti · Luigi Carratino · Lorenzo Rosasco · Alessandro Rudi -
2019 Poster: Implicit Regularization of Accelerated Methods in Hilbert Spaces »
Nicolò Pagliana · Lorenzo Rosasco -
2019 Poster: Beating SGD Saturation with Tail-Averaging and Minibatching »
Nicole Muecke · Gergely Neu · Lorenzo Rosasco -
2019 Poster: Massively scalable Sinkhorn distances via the Nyström method »
Jason Altschuler · Francis Bach · Alessandro Rudi · Jonathan Niles-Weed -
2019 Poster: Localized Structured Prediction »
Carlo Ciliberto · Francis Bach · Alessandro Rudi -
2019 Poster: Globally Convergent Newton Methods for Ill-conditioned Generalized Self-concordant Losses »
Ulysse Marteau-Ferey · Francis Bach · Alessandro Rudi -
2019 Poster: Efficient online learning with kernels for adversarial large scale problems »
Rémi Jézéquel · Pierre Gaillard · Alessandro Rudi -
2018 Poster: On Fast Leverage Score Sampling and Optimal Learning »
Alessandro Rudi · Daniele Calandriello · Luigi Carratino · Lorenzo Rosasco -
2018 Poster: Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes »
Loucas Pillaud-Vivien · Alessandro Rudi · Francis Bach -
2018 Poster: Statistical and Computational Trade-Offs in Kernel K-Means »
Daniele Calandriello · Lorenzo Rosasco -
2018 Spotlight: Statistical and Computational Trade-Offs in Kernel K-Means »
Daniele Calandriello · Lorenzo Rosasco -
2018 Poster: Dirichlet-based Gaussian Processes for Large-scale Calibrated Classification »
Dimitrios Milios · Raffaello Camoriano · Pietro Michiardi · Lorenzo Rosasco · Maurizio Filippone -
2018 Poster: Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance »
Giulia Luise · Alessandro Rudi · Massimiliano Pontil · Carlo Ciliberto -
2018 Poster: Manifold Structured Prediction »
Alessandro Rudi · Carlo Ciliberto · Gian Maria Marconi · Lorenzo Rosasco -
2017 Poster: Generalization Properties of Learning with Random Features »
Alessandro Rudi · Lorenzo Rosasco -
2017 Oral: Generalization Properties of Learning with Random Features »
Alessandro Rudi · Lorenzo Rosasco -
2017 Poster: Consistent Multitask Learning with Nonlinear Output Relations »
Carlo Ciliberto · Alessandro Rudi · Lorenzo Rosasco · Massimiliano Pontil -
2017 Poster: FALKON: An Optimal Large Scale Kernel Method »
Alessandro Rudi · Luigi Carratino · Lorenzo Rosasco -
2016 Poster: A Consistent Regularization Approach for Structured Prediction »
Carlo Ciliberto · Lorenzo Rosasco · Alessandro Rudi -
2016 Poster: Optimal Learning for Multi-pass Stochastic Gradient Methods »
Junhong Lin · Lorenzo Rosasco -
2015 Poster: Learning with Incremental Iterative Regularization »
Lorenzo Rosasco · Silvia Villa -
2015 Poster: Less is More: Nyström Computational Regularization »
Alessandro Rudi · Raffaello Camoriano · Lorenzo Rosasco -
2015 Oral: Less is More: Nyström Computational Regularization »
Alessandro Rudi · Raffaello Camoriano · Lorenzo Rosasco -
2013 Workshop: Modern Nonparametric Methods in Machine Learning »
Arthur Gretton · Mladen Kolar · Samory Kpotufe · John Lafferty · Han Liu · Bernhard Schölkopf · Alexander Smola · Rob Nowak · Mikhail Belkin · Lorenzo Rosasco · Peter Bickel · Yue Zhao -
2013 Poster: On the Sample Complexity of Subspace Learning »
Alessandro Rudi · Guillermo D Canas · Lorenzo Rosasco -
2012 Poster: Learning Manifolds with K-Means and K-Flats »
Guillermo D Canas · Tomaso Poggio · Lorenzo Rosasco -
2012 Poster: Multiclass Learning with Simplex Coding »
Youssef Mroueh · Tomaso Poggio · Lorenzo Rosasco · Jean-Jacques Slotine -
2012 Poster: Learning Probability Measures with respect to Optimal Transport Metrics »
Guillermo D Canas · Lorenzo Rosasco -
2010 Poster: A Primal-Dual Algorithm for Group Sparse Regularization with Overlapping Groups »
Sofia Mosci · Silvia Villa · Alessandro Verri · Lorenzo Rosasco -
2010 Poster: Spectral Regularization for Support Estimation »
Ernesto De Vito · Lorenzo Rosasco · Alessandro Toigo -
2009 Workshop: Kernels for Multiple Outputs and Multi-task Learning: Frequentist and Bayesian Points of View »
Mauricio A Alvarez · Lorenzo Rosasco · Neil D Lawrence -
2009 Poster: On Invariance in Hierarchical Models »
Jake Bouvrie · Lorenzo Rosasco · Tomaso Poggio