Poster in Workshop: Optimization for ML Workshop
High Dimensional First Order Mini-Batch Algorithms on Quadratic Problems
Andrew Cheng · Kiwon Lee · Courtney Paquette
Abstract:
We analyze the dynamics of general mini-batch first-order algorithms on the ℓ2-regularized least squares problem when the number of samples and dimensions are large. This includes stochastic gradient descent (SGD), stochastic Nesterov (convex/strongly convex), and stochastic momentum. In this setting, we show that the dynamics of these algorithms concentrate to a deterministic discrete Volterra equation Ψ in the high-dimensional limit. In turn, we show that Ψ captures the behaviour of general mini-batch first-order algorithms under any quadratic statistic R: ℝ^d → ℝ, including but not limited to: the training loss, the excess risk for empirical risk minimization (in-distribution), and the generalization error.
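As a rough illustration of the concentration phenomenon the abstract describes (not the paper's Volterra analysis itself, which is not implemented here), the following minimal numpy sketch runs mini-batch SGD on a synthetic ℓ2-regularized least squares problem at increasing dimension d with the aspect ratio n/d held fixed. The run-to-run spread of the training loss shrinks as d grows, consistent with concentration to a deterministic limit. All problem parameters (Gaussian data, step size, batch fraction, noise level) are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def ridge_loss(A, b, x, delta):
    """f(x) = (1/2n) ||Ax - b||^2 + (delta/2) ||x||^2"""
    n = A.shape[0]
    return 0.5 * np.linalg.norm(A @ x - b) ** 2 / n + 0.5 * delta * np.dot(x, x)

def minibatch_sgd(A, b, delta, batch_size, lr, steps, rng):
    """Run mini-batch SGD from x = 0 and record the training loss at each step."""
    n, d = A.shape
    x = np.zeros(d)
    losses = [ridge_loss(A, b, x, delta)]
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        grad = A[idx].T @ (A[idx] @ x - b[idx]) / batch_size + delta * x
        x -= lr * grad
        losses.append(ridge_loss(A, b, x, delta))
    return np.array(losses)

# Concentration check: as d grows (with n/d and the batch fraction fixed),
# loss trajectories from independent runs collapse onto a single curve,
# so their standard deviation across runs shrinks.
rng = np.random.default_rng(0)
for d in (100, 400, 1600):
    n = 2 * d
    runs = []
    for _ in range(5):
        A = rng.standard_normal((n, d))            # iid Gaussian design
        x_star = rng.standard_normal(d) / np.sqrt(d)  # unit-norm signal
        b = A @ x_star + 0.1 * rng.standard_normal(n)
        runs.append(minibatch_sgd(A, b, delta=0.1,
                                  batch_size=n // 10, lr=0.3,
                                  steps=200, rng=rng))
    spread = np.stack(runs).std(axis=0).max()
    print(f"d={d:5d}: max std of training loss across runs = {spread:.4f}")
```

In this sketch the training loss is one example of a quadratic statistic R; under the paper's result, its limiting trajectory would be described by the deterministic Volterra equation Ψ rather than estimated empirically as above.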