Timezone: »

Oral
Fast global convergence rates of gradient methods for high-dimensional statistical recovery
Alekh Agarwal · Sahand N Negahban · Martin J Wainwright

Tue Dec 07 02:50 PM -- 03:10 PM (PST) @ Regency Ballroom
Many statistical $M$-estimators are based on convex optimization
problems formed by the weighted sum of a loss function with a
norm-based regularizer. We analyze the convergence rates of
first-order gradient methods for solving such problems within a
high-dimensional framework that allows the data dimension $d$ to grow
with (and possibly exceed) the sample size $n$. This high-dimensional
structure precludes the usual global assumptions---namely, strong
convexity and smoothness conditions---that underlie classical
optimization analysis. We define appropriately restricted versions of
these conditions, and show that they are satisfied with high
probability for various statistical models. Under these conditions,
our theory guarantees that Nesterov's first-order
method~\cite{Nesterov07} has a globally geometric rate of convergence
up to the statistical precision of the model, meaning the typical
Euclidean distance between the true unknown parameter $\theta^*$ and
the optimal solution $\widehat{\theta}$. This globally linear rate is
substantially faster than previous analyses of global convergence for
specific methods that yielded only sublinear rates. Our analysis
applies to a wide range of $M$-estimators and statistical models,
including sparse linear regression using Lasso ($\ell_1$-regularized
regression), group Lasso, block sparsity, and low-rank matrix recovery
using nuclear norm regularization. Overall, this result reveals an
interesting connection between statistical precision and computational
efficiency in high-dimensional estimation.