NeurIPS Accelerated PDEs for Construction and Theoretical Analysis of an SGD Extension

Poster in Workshop: The Symbiosis of Deep Learning and Differential Equations

Accelerated PDEs for Construction and Theoretical Analysis of an SGD Extension

Yuxin Sun · Dong Lao · Ganesh Sundaramoorthi · Anthony Yezzi


Abstract:

We introduce PDE Acceleration, a recently developed variational approach to accelerated optimization with partial differential equations (PDEs), in the context of optimizing deep networks. We derive the PDE evolution equations for optimizing general loss functions using this variational approach. We propose discretizations of these PDEs based on numerical PDE discretization schemes, and establish a mapping between these discretizations and stochastic gradient descent (SGD). We show that our framework can give rise to new PDEs that map to new optimization algorithms, so that theoretical insights from the PDE domain can be used to analyze optimization algorithms. As an example, we introduce a new PDE with diffusion that arises naturally from the viscosity solution and translates to a novel extension of SGD. We analyze its stability and convergence analytically using von Neumann analysis. We apply the proposed extension to the optimization of convolutional neural networks (CNNs). We empirically validate the theory and evaluate the new extension on image classification, showing empirical improvement over SGD.
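
The abstract does not state the PDEs themselves, so the following is only a minimal sketch of the kind of mapping it describes, using a simpler and well-known analogue rather than the authors' equations. It assumes a damped second-order ODE, x'' + a x' = -∇f(x), discretized with a central difference for x'' and a backward difference for x'. Under that assumption the update coincides with heavy-ball momentum with step size dt² and momentum 1 - a·dt. The quadratic loss, the function names, and the values of `dt` and `a` are all hypothetical choices for illustration; no stochastic gradients or diffusion term are included.

```python
import numpy as np

# Hypothetical quadratic loss f(x) = 0.5 * x^T A x, chosen only for illustration.
A = np.diag([1.0, 10.0])

def grad(x):
    """Gradient of the illustrative quadratic loss."""
    return A @ x

def damped_ode_steps(x0, dt, a, n):
    """Finite-difference scheme for x'' + a x' = -grad f(x).

    Central difference for x'', backward difference for x':
    (x_next - 2x + x_prev)/dt^2 + a (x - x_prev)/dt = -grad f(x)
    """
    x_prev = x0.copy()  # x_{-1} = x_0, i.e. zero initial velocity
    x = x0.copy()
    for _ in range(n):
        x_next = 2 * x - x_prev - a * dt * (x - x_prev) - dt**2 * grad(x)
        x_prev, x = x, x_next
    return x

def heavy_ball_steps(x0, lr, beta, n):
    """Gradient descent with heavy-ball momentum (full gradients, no noise)."""
    x = x0.copy()
    v = np.zeros_like(x0)
    for _ in range(n):
        v = beta * v - lr * grad(x)
        x = x + v
    return x

dt, a = 0.1, 1.0            # assumed time step and damping coefficient
x0 = np.array([1.0, 1.0])

x_ode = damped_ode_steps(x0, dt, a, n=50)
x_hb = heavy_ball_steps(x0, lr=dt**2, beta=1 - a * dt, n=50)
same_iterates = np.allclose(x_ode, x_hb)
```

Here `same_iterates` compares the two iterates after 50 steps. The correspondence lr = dt² and beta = 1 - a·dt follows directly from rearranging the finite-difference scheme, which is the sense in which a discretization choice can be read as an optimizer.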

