Poster
Improving Neural Ordinary Differential Equations with Nesterov's Accelerated Gradient Method
Ho Huu Nghia Nguyen · Tan Nguyen · Huyen Vo · Stanley Osher · Thieu Vo
We propose Nesterov neural ordinary differential equations (NesterovNODEs), whose layers solve the second-order ordinary differential equation (ODE) limit of Nesterov's accelerated gradient (NAG) method, and a generalization called GNesterovNODEs. Taking advantage of the $\mathcal{O}(1/k^{2})$ convergence rate of the NAG scheme, GNesterovNODEs speed up training and inference by reducing the number of function evaluations (NFEs) needed to solve the ODEs. We also prove that the adjoint state of a GNesterovNODE itself satisfies a GNesterovNODE, thus accelerating both the forward and backward ODE solvers and allowing the model to be scaled up to large tasks. We empirically corroborate the advantage of GNesterovNODEs on a wide range of practical applications, including point cloud separation, image classification, and sequence modeling. Compared to NODEs, GNesterovNODEs require significantly fewer NFEs while achieving better accuracy across our experiments.
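To make the idea concrete, here is a minimal sketch (not the paper's implementation) of the forward pass of such a layer: the second-order NAG-limit ODE $x'' + (3/t)\,x' = f(x, t)$ is rewritten as a first-order system in $(x, v)$ and integrated with a fixed-step RK4 solver. The damping coefficient $3/t$ is the one arising in the continuous-time limit of NAG; the linear map `W` stands in for a trained network, and all names here are illustrative.

```python
import numpy as np

def nesterov_ode_rhs(t, state, f, dim):
    """First-order system equivalent to x'' + (3/t) x' = f(x, t):
    x' = v,  v' = f(x, t) - (3/t) v."""
    x, v = state[:dim], state[dim:]
    return np.concatenate([v, f(x, t) - (3.0 / t) * v])

def rk4_solve(rhs, state0, t0, t1, steps, f, dim):
    """Classic fixed-step RK4 integrator (stand-in for an adaptive ODE solver)."""
    h = (t1 - t0) / steps
    t, y = t0, state0.copy()
    for _ in range(steps):
        k1 = rhs(t, y, f, dim)
        k2 = rhs(t + h / 2, y + h / 2 * k1, f, dim)
        k3 = rhs(t + h / 2, y + h / 2 * k2, f, dim)
        k4 = rhs(t + h, y + h * k3, f, dim)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

# Toy "layer": f is a fixed linear map instead of a learned network.
dim = 2
W = np.array([[-1.0, 0.5], [-0.5, -1.0]])
f = lambda x, t: W @ x

x0 = np.array([1.0, 0.0])      # input features
v0 = np.zeros(dim)             # initial velocity
state0 = np.concatenate([x0, v0])

# Start at t0 > 0 to avoid the 3/t singularity at t = 0.
out = rk4_solve(nesterov_ode_rhs, state0, t0=1.0, t1=2.0, steps=100, f=f, dim=dim)
x_out = out[:dim]              # the layer's output features
```

In a full model, `f` would be a neural network whose parameters are trained by backpropagating through (or via the adjoint of) the solver; the NFE count reported in the paper is the number of `rhs` evaluations an adaptive solver performs.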
Author Information
Ho Huu Nghia Nguyen (FPT Software Company Limited, FPT Cau Giay Building, Duy Tan Street, Cau Giay District, Ha Noi City)
I recently finished my undergraduate degree in Computer Science at Ho Chi Minh City University of Science, one of the top two universities for Computer Science in Vietnam. I am currently an AI Research Resident in the FPT Software AI Residency Program, hosted by the FPT Software AI Center. My interests are in Neural ODEs, equivariant models, and generalization. I am looking for a PhD position.
Tan Nguyen (University of California, Los Angeles)
Huyen Vo (Hanoi University of Science and Technology)
Stanley Osher (UCLA)
Thieu Vo (Johannes Kepler University Linz)
More from the Same Authors
- 2022 Poster: FourierFormer: Transformer Meets Generalized Fourier Integral Theorem »
  Tan Nguyen · Minh Pham · Tam Nguyen · Khai Nguyen · Stanley Osher · Nhat Ho
- 2022 Poster: Improving Transformer with an Admixture of Attention Heads »
  Tan Nguyen · Tam Nguyen · Hai Do · Khai Nguyen · Vishwanath Saragadam · Minh Pham · Khuong Duy Nguyen · Nhat Ho · Stanley Osher
- 2021 Talk: Stan Osher Talk »
  Stanley Osher
- 2021 Poster: FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention »
  Tan Nguyen · Vai Suliafu · Stanley Osher · Long Chen · Bao Wang
- 2021 Poster: Heavy Ball Neural Ordinary Differential Equations »
  Hedi Xia · Vai Suliafu · Hangjie Ji · Tan Nguyen · Andrea Bertozzi · Stanley Osher · Bao Wang
- 2020 Poster: MomentumRNN: Integrating Momentum into Recurrent Neural Networks »
  Tan Nguyen · Richard Baraniuk · Andrea Bertozzi · Stanley Osher · Bao Wang
- 2019 Poster: ResNets Ensemble via the Feynman-Kac Formalism to Improve Natural and Robust Accuracies »
  Bao Wang · Zuoqiang Shi · Stanley Osher
- 2018 Poster: Deep Neural Nets with Interpolating Function as Output Activation »
  Bao Wang · Xiyang Luo · Zhen Li · Wei Zhu · Zuoqiang Shi · Stanley Osher