
Workshop: OPT 2023: Optimization for Machine Learning

Variance Reduced Model Based Methods: New rates and adaptive step sizes

Robert Gower · Frederik Kunstner · Mark Schmidt

Abstract: Variance-reduced gradient methods were introduced to control the variance of SGD (Stochastic Gradient Descent). Model-based methods can exploit a known lower bound on the loss; for instance, most loss functions are non-negative. We show how these two classes of methods can be seamlessly combined. As an example, we present a Model-based Stochastic Average Gradient method (MSAG), which results from using a truncated model together with the SAG method. At each iteration, MSAG computes an adaptive learning rate based on a given known lower bound. When given access to the optimal objective as the lower bound, MSAG has several favorable convergence properties, including monotonic iterates and convergence in the non-smooth, smooth, and strongly convex settings. This shows that we can essentially trade knowing the smoothness constant $L_{\max}$ for knowing the optimal objective and still achieve the favorable convergence of variance-reduced gradient methods.
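To illustrate how a known lower bound can yield an adaptive learning rate, the following is a minimal sketch of a generic truncated-model step applied to plain SGD. It is not the MSAG method from the abstract: it keeps no SAG gradient table, and the update rule, the function names `truncated_model_step` and `run_demo`, the cap `gamma_max`, and the synthetic least-squares problem are all assumptions made for illustration. The step minimizes the truncated linear model max(loss + <grad, y - x>, lower_bound) plus a proximal term, which gives a step size of min(gamma_max, (loss - lower_bound) / ||grad||^2).

```python
import numpy as np


def truncated_model_step(x, loss, grad, lower_bound=0.0, gamma_max=np.inf):
    """Take one step on the truncated model max(loss + <grad, y - x>, lower_bound).

    Returns the new point and the adaptive step size that was used.
    """
    grad_sq = float(grad @ grad)
    if grad_sq == 0.0:
        # Zero gradient: the model is flat, so the point does not move.
        return x, 0.0
    # Adaptive step: gap to the lower bound divided by the squared gradient norm,
    # capped at gamma_max (hypothetical safeguard parameter).
    gamma = min(gamma_max, max(loss - lower_bound, 0.0) / grad_sq)
    return x - gamma * grad, gamma


def run_demo(n=50, d=5, steps=500, seed=0):
    """Run the step on a consistent least-squares problem (illustrative data only)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, d))
    x_star = rng.standard_normal(d)
    b = A @ x_star  # consistent system: each f_i attains the lower bound 0 at x_star

    x = np.zeros(d)
    dists = [np.linalg.norm(x - x_star)]
    for _ in range(steps):
        i = rng.integers(n)
        residual = A[i] @ x - b[i]
        loss = 0.5 * residual**2   # f_i(x) = 0.5 * (a_i^T x - b_i)^2 >= 0
        grad = residual * A[i]     # gradient of f_i at x
        x, _ = truncated_model_step(x, loss, grad, lower_bound=0.0)
        dists.append(np.linalg.norm(x - x_star))
    return np.array(dists)


if __name__ == "__main__":
    dists = run_demo()
    print("initial distance to solution:", dists[0])
    print("final distance to solution:  ", dists[-1])
```

In this toy setting every sampled loss is convex and attains the lower bound at the same point, so the distance to the solution never increases and no smoothness constant is needed to pick the step size. A variance-reduced variant in the spirit of the abstract would replace the sampled gradient with a SAG-style estimate; that combination is not reproduced here.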
