Oral
Fast global convergence rates of gradient methods for high-dimensional statistical recovery
Alekh Agarwal · Sahand N Negahban · Martin J Wainwright
Many statistical $M$-estimators are based on convex optimization
problems formed by the weighted sum of a loss function with a
norm-based regularizer. We analyze the convergence rates of
first-order gradient methods for solving such problems within a
high-dimensional framework that allows the data dimension $d$ to grow
with (and possibly exceed) the sample size $n$. This high-dimensional
structure precludes the usual global assumptions---namely, strong
convexity and smoothness conditions---that underlie classical
optimization analysis. We define appropriately restricted versions of
these conditions, and show that they are satisfied with high
probability for various statistical models. Under these conditions,
our theory guarantees that Nesterov's first-order
method~\cite{Nesterov07} has a globally geometric rate of convergence
up to the statistical precision of the model, meaning the typical
Euclidean distance between the true unknown parameter $\theta^*$ and
the optimal solution $\widehat{\theta}$. This globally linear rate is
substantially faster than previous analyses of global convergence for
specific methods that yielded only sublinear rates. Our analysis
applies to a wide range of $M$-estimators and statistical models,
including sparse linear regression using Lasso ($\ell_1$-regularized
regression), group Lasso, block sparsity, and low-rank matrix recovery
using nuclear norm regularization. Overall, this result reveals an
interesting connection between statistical precision and computational
efficiency in high-dimensional estimation.
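As a rough illustration of the kind of first-order method discussed above, the following is a minimal sketch of plain proximal gradient descent (soft-thresholding iterations) applied to the Lasso objective $\frac{1}{2n}\|y - X\theta\|_2^2 + \lambda\|\theta\|_1$ with $d > n$. It is not the authors' implementation and does not reproduce their analysis: the helper names (`soft_threshold`, `lasso_objective`, `proximal_gradient_lasso`, `make_sparse_problem`), the synthetic data, the value of $\lambda$, and the conservative step size based on a trace bound are all assumptions made for this example. It uses only the Python standard library so that it stays dependency-free.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lasso_objective(X, y, theta, lam):
    """Compute (1 / 2n) * ||y - X theta||_2^2 + lam * ||theta||_1."""
    n = len(y)
    residuals = [sum(x * t for x, t in zip(row, theta)) - y_i
                 for row, y_i in zip(X, y)]
    return (sum(r * r for r in residuals) / (2 * n)
            + lam * sum(abs(t) for t in theta))


def proximal_gradient_lasso(X, y, lam, num_iters=300):
    """Run proximal gradient steps from theta = 0; return iterate and objective history."""
    n, d = len(X), len(X[0])
    # Assumption for this sketch: trace(X^T X) / n upper-bounds the largest
    # eigenvalue of X^T X / n, so 1 / lipschitz is a safe (conservative) step.
    lipschitz = sum(x * x for row in X for x in row) / n
    step = 1.0 / lipschitz
    theta = [0.0] * d
    history = [lasso_objective(X, y, theta, lam)]
    for _ in range(num_iters):
        residuals = [sum(x * t for x, t in zip(row, theta)) - y_i
                     for row, y_i in zip(X, y)]
        # Gradient of the smooth least-squares part: X^T (X theta - y) / n.
        grad = [sum(X[i][j] * residuals[i] for i in range(n)) / n
                for j in range(d)]
        # Gradient step on the loss, then the prox of the l1 regularizer.
        theta = [soft_threshold(t - step * g, step * lam)
                 for t, g in zip(theta, grad)]
        history.append(lasso_objective(X, y, theta, lam))
    return theta, history


def make_sparse_problem(n=20, d=40, sparsity=3, noise=0.1, seed=0):
    """Hypothetical synthetic data: Gaussian design, sparse theta_star, d > n."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    theta_star = [1.0 if j < sparsity else 0.0 for j in range(d)]
    y = [sum(x * t for x, t in zip(row, theta_star)) + noise * rng.gauss(0.0, 1.0)
         for row in X]
    return X, y, theta_star


if __name__ == "__main__":
    X, y, theta_star = make_sparse_problem()
    theta_hat, history = proximal_gradient_lasso(X, y, lam=0.1)
    error = sum((a - b) ** 2 for a, b in zip(theta_hat, theta_star)) ** 0.5
    print("objective: first", history[0], "last", history[-1])
    print("Euclidean distance to theta_star:", error)
```

With a step size of at most the inverse Lipschitz constant, the objective values in `history` do not increase from one iteration to the next. Tracking the Euclidean distance between the iterates and `theta_star` is one way to observe the behaviour described in the abstract, where optimization error only needs to be driven down to the level of the statistical precision.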
Author Information
Alekh Agarwal (Google Research)
Sahand N Negahban (University of California, Berkeley)
Martin J Wainwright (UC Berkeley)
Related Events (a corresponding poster, oral, or spotlight)
-
2010 Poster: Fast global convergence rates of gradient methods for high-dimensional statistical recovery »
Tue. Dec 7th 08:00 -- 08:00 AM Room
More from the Same Authors
-
2022 : Provable Benefits of Representational Transfer in Reinforcement Learning »
Alekh Agarwal · Yuda Song · Kaiwen Wang · Mengdi Wang · Wen Sun · Xuezhou Zhang -
2023 Poster: Ordering-based Conditions for Global Convergence of Policy Gradient Methods »
Jincheng Mei · Bo Dai · Alekh Agarwal · Mohammad Ghavamzadeh · Csaba Szepesvari · Dale Schuurmans -
2023 Oral: Ordering-based Conditions for Global Convergence of Policy Gradient Methods »
Jincheng Mei · Bo Dai · Alekh Agarwal · Mohammad Ghavamzadeh · Csaba Szepesvari · Dale Schuurmans -
2022 Poster: On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL »
Jinglin Chen · Aditya Modi · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal -
2022 Poster: Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity »
Alekh Agarwal · Tong Zhang -
2021 Poster: Bellman-consistent Pessimism for Offline Reinforcement Learning »
Tengyang Xie · Ching-An Cheng · Nan Jiang · Paul Mineiro · Alekh Agarwal -
2021 Poster: Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning »
Andrea Zanette · Martin J Wainwright · Emma Brunskill -
2021 Oral: Bellman-consistent Pessimism for Offline Reinforcement Learning »
Tengyang Xie · Ching-An Cheng · Nan Jiang · Paul Mineiro · Alekh Agarwal -
2020 Poster: Policy Improvement via Imitation of Multiple Oracles »
Ching-An Cheng · Andrey Kolobov · Alekh Agarwal -
2020 Spotlight: Policy Improvement via Imitation of Multiple Oracles »
Ching-An Cheng · Andrey Kolobov · Alekh Agarwal -
2020 Poster: FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs »
Alekh Agarwal · Sham Kakade · Akshay Krishnamurthy · Wen Sun -
2020 Poster: PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning »
Alekh Agarwal · Mikael Henaff · Sham Kakade · Wen Sun -
2020 Oral: FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs »
Alekh Agarwal · Sham Kakade · Akshay Krishnamurthy · Wen Sun -
2020 Poster: Safe Reinforcement Learning via Curriculum Induction »
Matteo Turchetta · Andrey Kolobov · Shital Shah · Andreas Krause · Alekh Agarwal -
2020 Poster: Provably Good Batch Reinforcement Learning Without Great Exploration »
Yao Liu · Adith Swaminathan · Alekh Agarwal · Emma Brunskill -
2020 Spotlight: Safe Reinforcement Learning via Curriculum Induction »
Matteo Turchetta · Andrey Kolobov · Shital Shah · Andreas Krause · Alekh Agarwal -
2019 Poster: Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting »
Aditya Grover · Jiaming Song · Ashish Kapoor · Kenneth Tran · Alekh Agarwal · Eric Horvitz · Stefano Ermon -
2018 Poster: On Oracle-Efficient PAC RL with Rich Observations »
Christoph Dann · Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire -
2018 Spotlight: On Oracle-Efficient PAC RL with Rich Observations »
Christoph Dann · Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire -
2017 Workshop: OPT 2017: Optimization for Machine Learning »
Suvrit Sra · Sashank J. Reddi · Alekh Agarwal · Benjamin Recht -
2017 Poster: Off-policy evaluation for slate recommendation »
Adith Swaminathan · Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik · John Langford · Damien Jose · Imed Zitouni -
2017 Oral: Off-policy evaluation for slate recommendation »
Adith Swaminathan · Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik · John Langford · Damien Jose · Imed Zitouni -
2017 Poster: Kernel Feature Selection via Conditional Covariance Minimization »
Jianbo Chen · Mitchell Stern · Martin J Wainwright · Michael Jordan -
2016 Demonstration: Project Malmo - Minecraft for AI Research »
Katja Hofmann · Matthew A Johnson · Fernando Diaz · Alekh Agarwal · Tim Hutton · David Bignell · Evelyne Viegas -
2016 Poster: Efficient Second Order Online Learning by Sketching »
Haipeng Luo · Alekh Agarwal · Nicolò Cesa-Bianchi · John Langford -
2016 Poster: Contextual semibandits via supervised learning oracles »
Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik -
2016 Poster: Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences »
Chi Jin · Yuchen Zhang · Sivaraman Balakrishnan · Martin J Wainwright · Michael Jordan -
2016 Poster: PAC Reinforcement Learning with Rich Observations »
Akshay Krishnamurthy · Alekh Agarwal · John Langford -
2015 Workshop: Optimization for Machine Learning (OPT2015) »
Suvrit Sra · Alekh Agarwal · Leon Bottou · Sashank J. Reddi -
2015 Poster: Efficient and Parsimonious Agnostic Active Learning »
Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire -
2015 Spotlight: Efficient and Parsimonious Agnostic Active Learning »
Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire -
2015 Poster: Fast Convergence of Regularized Learning in Games »
Vasilis Syrgkanis · Alekh Agarwal · Haipeng Luo · Robert Schapire -
2015 Oral: Fast Convergence of Regularized Learning in Games »
Vasilis Syrgkanis · Alekh Agarwal · Haipeng Luo · Robert Schapire -
2014 Workshop: OPT2014: Optimization for Machine Learning »
Zaid Harchaoui · Suvrit Sra · Alekh Agarwal · Martin Jaggi · Miro Dudik · Aaditya Ramdas · Jean Lasserre · Yoshua Bengio · Amir Beck -
2014 Poster: Scalable Non-linear Learning with Adaptive Polynomial Expansions »
Alekh Agarwal · Alina Beygelzimer · Daniel Hsu · John Langford · Matus J Telgarsky -
2013 Workshop: Learning Faster From Easy Data »
Peter Grünwald · Wouter M Koolen · Sasha Rakhlin · Nati Srebro · Alekh Agarwal · Karthik Sridharan · Tim van Erven · Sebastien Bubeck -
2013 Workshop: OPT2013: Optimization for Machine Learning »
Suvrit Sra · Alekh Agarwal -
2012 Workshop: Optimization for Machine Learning »
Suvrit Sra · Alekh Agarwal -
2012 Poster: Iterative ranking from pair-wise comparisons »
Sahand N Negahban · Sewoong Oh · Devavrat Shah -
2012 Poster: Privacy Aware Learning »
John Duchi · Michael Jordan · Martin J Wainwright -
2012 Poster: Communication-Efficient Algorithms for Statistical Optimization »
Yuchen Zhang · John Duchi · Martin J Wainwright -
2012 Poster: No voodoo here! Learning discrete graphical models via inverse covariance estimation »
Po-Ling Loh · Martin J Wainwright -
2012 Oral: No voodoo here! Learning discrete graphical models via inverse covariance estimation »
Po-Ling Loh · Martin J Wainwright -
2012 Spotlight: Iterative ranking from pair-wise comparisons »
Sahand N Negahban · Sewoong Oh · Devavrat Shah -
2012 Oral: Privacy Aware Learning »
John Duchi · Michael Jordan · Martin J Wainwright -
2012 Poster: Stochastic optimization and sparse statistical recovery: Optimal algorithms for high dimensions »
Alekh Agarwal · Sahand N Negahban · Martin J Wainwright -
2012 Poster: Finite Sample Convergence Rates of Zero-Order Stochastic Optimization Methods »
John Duchi · Michael Jordan · Martin J Wainwright · Andre Wibisono -
2011 Workshop: Computational Trade-offs in Statistical Learning »
Alekh Agarwal · Sasha Rakhlin -
2011 Poster: Distributed Delayed Stochastic Optimization »
Alekh Agarwal · John Duchi -
2011 Poster: Stochastic convex optimization with bandit feedback »
Alekh Agarwal · Dean P Foster · Daniel Hsu · Sham M Kakade · Sasha Rakhlin -
2011 Poster: A More Powerful Two-Sample Test in High Dimensions using Random Projection »
Miles Lopes · Laurent Jacob · Martin J Wainwright -
2011 Poster: High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity »
Po-Ling Loh · Martin J Wainwright -
2011 Oral: High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity »
Po-Ling Loh · Martin J Wainwright -
2010 Workshop: Learning on Cores, Clusters, and Clouds »
Alekh Agarwal · Lawrence Cayton · Ofer Dekel · John Duchi · John Langford -
2010 Spotlight: Distributed Dual Averaging In Networks »
John Duchi · Alekh Agarwal · Martin J Wainwright -
2010 Poster: Distributed Dual Averaging In Networks »
John Duchi · Alekh Agarwal · Martin J Wainwright -
2009 Poster: Information-theoretic lower bounds on the oracle complexity of convex optimization »
Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright -
2009 Poster: Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness »
Garvesh Raskutti · Martin J Wainwright · Bin Yu -
2009 Spotlight: Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness »
Garvesh Raskutti · Martin J Wainwright · Bin Yu -
2009 Spotlight: Information-theoretic lower bounds on the oracle complexity of convex optimization »
Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright -
2009 Poster: A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers »
Sahand N Negahban · Pradeep Ravikumar · Martin J Wainwright · Bin Yu -
2009 Oral: A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers »
Sahand N Negahban · Pradeep Ravikumar · Martin J Wainwright · Bin Yu -
2008 Poster: High-dimensional union support recovery in multivariate regression »
Guillaume R Obozinski · Martin J Wainwright · Michael Jordan -
2008 Poster: Phase transitions for high-dimensional joint support recovery »
Sahand N Negahban · Martin J Wainwright -
2008 Spotlight: High-dimensional union support recovery in multivariate regression »
Guillaume R Obozinski · Martin J Wainwright · Michael Jordan -
2008 Spotlight: Phase transitions for high-dimensional joint support recovery »
Sahand N Negahban · Martin J Wainwright -
2008 Poster: Model Selection in Gaussian Graphical Models: High-Dimensional Consistency of $\ell_1$-regularized MLE »
Pradeep Ravikumar · Garvesh Raskutti · Martin J Wainwright · Bin Yu -
2007 Poster: An Analysis of Inference with the Universum »
Fabian H Sinz · Olivier Chapelle · Alekh Agarwal · Bernhard Schölkopf -
2007 Spotlight: An Analysis of Inference with the Universum »
Fabian H Sinz · Olivier Chapelle · Alekh Agarwal · Bernhard Schölkopf -
2007 Spotlight: Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization »
XuanLong Nguyen · Martin J Wainwright · Michael Jordan -
2007 Poster: Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization »
XuanLong Nguyen · Martin J Wainwright · Michael Jordan -
2007 Poster: Loop Series and Bethe Variational Bounds in Attractive Graphical Models »
Erik Sudderth · Martin J Wainwright · Alan S Willsky -
2006 Poster: Inferring Graphical Model Structure using $\ell_1$-Regularized Pseudo-Likelihood »
Martin J Wainwright · Pradeep Ravikumar · John Lafferty -
2006 Spotlight: Inferring Graphical Model Structure using $\ell_1$-Regularized Pseudo-Likelihood »
Martin J Wainwright · Pradeep Ravikumar · John Lafferty