Poster
Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks
Grant Rotskoff · Eric Vanden-Eijnden
The performance of neural networks on high-dimensional data
distributions suggests that it may be possible to parameterize a
representation of a given high-dimensional function with
controllably small errors, potentially outperforming standard
interpolation methods. We demonstrate, both theoretically and
numerically, that this is indeed the case. We map the parameters of
a neural network to a system of particles relaxing with an
interaction potential determined by the loss function. We show that
in the limit in which the number of parameters $n$ is large, the
landscape of the mean-squared error becomes convex and the
representation error in the function scales as $O(n^{-1})$.
In this limit, we prove a dynamical variant of the universal
approximation theorem showing that the optimal
representation can be attained by stochastic gradient
descent, the algorithm ubiquitously used for parameter optimization
in machine learning. In the asymptotic regime, we study the
fluctuations around the optimal representation and show that they
arise at a scale $O(n^{-1})$. These fluctuations in the landscape
identify the natural scale for the noise in stochastic gradient
descent. Our results apply to both single- and multi-layer neural
networks, as well as standard kernel methods such as radial basis
functions.
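
The headline $O(n^{-1})$ scaling is straightforward to probe numerically. The sketch below (Python/NumPy; this is not the authors' code, and the target function, widths, step counts, and learning rate are illustrative assumptions) trains a mean-field-scaled shallow ReLU network $f_n(x) = \frac{1}{n}\sum_{i=1}^{n} c_i\,\mathrm{relu}(a_i \cdot x + b_i)$ with SGD at several widths $n$. The $1/n$ prefactor is what makes the parameters behave as weakly interacting particles; if the predicted error scaling holds, $n \times \mathrm{MSE}$ should stay roughly flat across widths.

import numpy as np

rng = np.random.default_rng(0)
d = 10  # input dimension

def target(x):
    # Illustrative smooth target; stands in for the function being represented.
    return np.sin(x @ np.ones(d) / np.sqrt(d))

def train_mse(n, steps=10000, lr=0.2, batch=64):
    """Train f_n(x) = (1/n) * sum_i c_i * relu(a_i . x + b_i) by SGD
    and return the test mean-squared error."""
    a = rng.normal(size=(n, d)) / np.sqrt(d)
    b = rng.normal(size=n)
    c = rng.normal(size=n)
    for _ in range(steps):
        x = rng.normal(size=(batch, d))
        pre = x @ a.T + b                  # (batch, n) pre-activations
        act = np.maximum(pre, 0.0)         # ReLU features
        err = act @ c / n - target(x)      # residual of the mean-field output
        # Gradients of (1/2) * mean squared error over the mini-batch.
        grad_c = act.T @ err / (batch * n)
        gpre = (err[:, None] * (pre > 0.0)) * (c / n)[None, :]
        grad_a = gpre.T @ x / batch
        grad_b = gpre.mean(axis=0)
        # Learning rate scaled by n so each "particle" (c_i, a_i, b_i)
        # moves at an O(1) rate as n grows (mean-field time scaling).
        c -= lr * n * grad_c
        a -= lr * n * grad_a
        b -= lr * n * grad_b
    xt = rng.normal(size=(5000, d))        # fresh test points
    f = np.maximum(xt @ a.T + b, 0.0) @ c / n
    return np.mean((f - target(xt)) ** 2)

for n in (50, 100, 200, 400, 800):
    mse = train_mse(n)
    # Under O(n^{-1}) error scaling, n * mse stays roughly constant.
    print(f"n={n:4d}  mse={mse:.3e}  n*mse={n * mse:.3f}")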
Author Information
Grant Rotskoff (New York University)
Eric Vanden-Eijnden (New York University)
More from the Same Authors
- 2022 Poster: Learning Optimal Flows for Non-Equilibrium Importance Sampling
  Yu Cao · Eric Vanden-Eijnden
- 2023 Poster: Efficient Training of Energy-Based Models Using Jarzynski Equality
  Davide Carbone · Mengjian Hua · Simon Coste · Eric Vanden-Eijnden
- 2022 Poster: Learning sparse features can lead to overfitting in neural networks
  Leonardo Petrini · Francesco Cagnetta · Eric Vanden-Eijnden · Matthieu Wyart
- 2020 Poster: A mean-field analysis of two-player zero-sum games
  Carles Domingo-Enrich · Samy Jelassi · Arthur Mensch · Grant Rotskoff · Joan Bruna
- 2020 Poster: Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions
  Stefano Sarao Mannelli · Eric Vanden-Eijnden · Lenka Zdeborová
- 2020 Poster: A Dynamical Central Limit Theorem for Shallow Neural Networks
  Zhengdao Chen · Grant Rotskoff · Joan Bruna · Eric Vanden-Eijnden