Keywords: [ Optimization ]

Abstract:
We study the Riemannian gradient method for PCA on which a crucial fact is that despite the simplicity of the considered setting, i.e., deterministic version of Krasulina's method, the convergence rate has not been well-understood yet. In this work, we provide a general tight analysis for the gap-dependent rate at $O(\frac{1}{\Delta}\log\frac{1}{\epsilon})$ that holds for any real symmetric matrix. More importantly, when the gap $\Delta$ is significantly smaller than the target accuracy $\epsilon$ on the objective sub-optimality of the final solution, the rate of this type is actually not tight any more, which calls for a worst-case rate. We further give the first worst-case analysis that achieves a rate of convergence at $O(\frac{1}{\epsilon}\log\frac{1}{\epsilon})$. The two analyses naturally roll out a comprehensively tight convergence rate at $O(\frac{1}{\max\{\Delta,\epsilon\}}\hskip-.3em\log\frac{1}{\epsilon})$. Particularly, our gap-dependent analysis suggests a new promising learning rate for stochastic variance reduced PCA algorithms. Experiments are conducted to confirm our findings as well.

Chat is not available.