Timezone: »

Second-Order Neural ODE Optimizer
Guan-Horng Liu · Tianrong Chen · Evangelos Theodorou

Wed Dec 08 04:30 PM -- 06:00 PM (PST) @

We propose a novel second-order optimization framework for training the emerging deep continuous-time models, specifically the Neural Ordinary Differential Equations (Neural ODEs). Since their training already involves expensive gradient computation by solving a backward ODE, deriving efficient second-order methods becomes highly nontrivial. Nevertheless, inspired by the recent Optimal Control (OC) interpretation of training deep networks, we show that a specific continuous-time OC methodology, called Differential Programming, can be adopted to derive backward ODEs for higher-order derivatives at the same O(1) memory cost. We further explore a low-rank representation of the second-order derivatives and show that it leads to efficient preconditioned updates with the aid of Kronecker-based factorization. The resulting method – named SNOpt – converges much faster than first-order baselines in wall-clock time, and the improvement remains consistent across various applications, e.g. image classification, generative flow, and time-series prediction. Our framework also enables direct architecture optimization, such as the integration time of Neural ODEs, with second-order feedback policies, strengthening the OC perspective as a principled tool of analyzing optimization in deep learning. Our code is available at https://github.com/ghliu/snopt.

Author Information

Guan-Horng Liu (Georgia Institute of Technology)
Tianrong Chen (Georgia Institute of Technology)
Evangelos Theodorou (Georgia Institute of Technology)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors