Data with low-dimensional nonlinear structure are ubiquitous in engineering and scientific problems. We study a model problem with such structure: a binary classification task that uses a deep fully-connected neural network to classify data drawn from two disjoint smooth curves on the unit sphere. Aside from mild regularity conditions, we place no restrictions on the configuration of the curves. We prove that when (i) the network depth is large relative to certain geometric properties that set the difficulty of the problem and (ii) the network width and the number of samples are polynomial in the depth, randomly-initialized gradient descent quickly learns to correctly classify all points on the two curves with high probability. To our knowledge, this is the first generalization guarantee for deep networks with nonlinear data that depends only on intrinsic data properties. Our analysis proceeds by a reduction to dynamics in the neural tangent kernel (NTK) regime, where the network depth plays the role of a fitting resource in solving the classification problem. In particular, via fine-grained control of the decay properties of the NTK, we demonstrate that when the network is sufficiently deep, the NTK can be locally approximated by a translationally invariant operator on the manifolds and stably inverted over smooth functions, which guarantees convergence and generalization.
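The model problem in the abstract is concrete enough to simulate. The sketch below is a minimal, hypothetical PyTorch illustration of the setup: two disjoint circles on the unit sphere S^2 standing in for the two smooth curves, a deep fully-connected ReLU network, and full-batch gradient descent from random initialization. The specific curves, depth, width, learning rate, and squared loss are illustrative assumptions, not the exact parameterization or initialization scheme analyzed in the paper.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# --- Illustrative data: two disjoint smooth curves on the unit sphere S^2 ---
# Circles at latitudes z = +0.5 and z = -0.5 are a hypothetical choice; the
# paper allows essentially arbitrary disjoint smooth curves on the sphere.
def sample_circle(n, z):
    t = 2 * math.pi * torch.rand(n)
    r = math.sqrt(1 - z ** 2)  # radius so that every point has unit norm
    return torch.stack([r * torch.cos(t), r * torch.sin(t), torch.full((n,), z)], dim=1)

n_per_class = 256
x = torch.cat([sample_circle(n_per_class, 0.5), sample_circle(n_per_class, -0.5)])
y = torch.cat([torch.ones(n_per_class), -torch.ones(n_per_class)])  # labels +1 / -1

# --- Deep, wide fully-connected ReLU network ---
# Illustrative sizes; the theory couples width and sample size to the depth.
depth, width = 16, 512
layers = [nn.Linear(3, width), nn.ReLU()]
for _ in range(depth - 1):
    layers += [nn.Linear(width, width), nn.ReLU()]
layers += [nn.Linear(width, 1)]
net = nn.Sequential(*layers)

# --- Full-batch gradient descent from random initialization ---
opt = torch.optim.SGD(net.parameters(), lr=0.05)
for step in range(500):
    opt.zero_grad()
    loss = F.mse_loss(net(x).squeeze(1), y)  # squared loss as a stand-in
    loss.backward()
    opt.step()

# Classify by the sign of the network output.
with torch.no_grad():
    acc = (torch.sign(net(x).squeeze(1)) == y).float().mean()
print(f"train accuracy: {acc:.3f}")
```

In this sketch, correct classification of the sampled points corresponds to the sign of the network output agreeing with the label everywhere on both curves; the paper's guarantee is the much stronger statement that this holds for all points on the curves, not just the training samples.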
Author Information
Tingran Wang (Columbia University)
Sam Buchanan (Columbia University)
Dar Gilboa (Columbia University)
John Wright (Columbia University)
More from the Same Authors
- 2023 Poster: On quantum backpropagation, information reuse, and cheating measurement collapse
  Amira Abbas · Robbie King · Hsin-Yuan Huang · William Huggins · Ramis Movassagh · Dar Gilboa · Jarrod McClean
- 2021 Poster: Estimating the Unique Information of Continuous Variables
  Ari Pakman · Amin Nejatbakhsh · Dar Gilboa · Abdullah Makkeh · Luca Mazzucato · Michael Wibral · Elad Schneidman
- 2021 Poster: Square Root Principal Component Pursuit: Tuning-Free Noisy Robust Matrix Recovery
  Junhui Zhang · Jingkai Yan · John Wright
- 2020: Deep Networks and the Multiple Manifold Problem
  Samuel Buchanan · Dar Gilboa · John Wright
- 2019 Poster: A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off
  Yaniv Blumenfeld · Dar Gilboa · Daniel Soudry
- 2014 Poster: Finding a sparse vector in a subspace: Linear sparsity using alternating directions
  Qing Qu · Ju Sun · John Wright
- 2012 Workshop: Analysis Operator Learning vs. Dictionary Learning: Fraternal Twins in Sparse Modeling
  Martin Kleinsteuber · Francis Bach · Remi Gribonval · John Wright · Simon Hawe
- 2012 Poster: Learning with Partially Absorbing Random Walks
  Xiao-Ming Wu · Zhenguo Li · Shih-Fu Chang · John Wright · Anthony Man-Cho So