We prove dimension-free representation results for neural networks with D ReLU layers under square loss for a class of functions G_D defined in the paper. These results capture the precise benefits of depth in the following sense:
1. The rates for representing the class of functions G_D via D ReLU layers are sharp up to constants, as shown by matching lower bounds.
2. G_D is a proper subset of G_{D+1}, and as D grows the class of functions G_D grows to contain less smooth functions.
3. If D^{\prime} < D, then the approximation rate achieved by depth D^{\prime} networks is strictly worse than that achieved by depth D networks for the class G_D.
This constitutes a fine-grained characterization of the representation power of feedforward networks of arbitrary depth D and number of neurons N, in contrast to existing representation results, which either require D to grow quickly with N or assume that the function being represented is highly smooth. In the latter case, similar rates can be obtained with a single nonlinear layer. Our results confirm the prevailing hypothesis that deeper networks are better at representing less smooth functions; indeed, the main technical novelty is to fully exploit the fact that deep networks can produce highly oscillatory functions with few activation functions.
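The remark about highly oscillatory functions can be illustrated with a standard sawtooth example from the depth-separation literature. The sketch below is not the construction used in the paper; it only assumes a "tent" map built from two ReLUs and composes it `depth` times, so that 2 · depth activations yield a function with 2^depth monotone pieces on [0, 1]. The names `tent`, `deep_tent`, and `monotone_pieces` are hypothetical helpers introduced for this illustration.

```python
def relu(x):
    """Rectified linear unit on a scalar."""
    return max(0.0, x)


def tent(x):
    """Tent map on [0, 1] written with two ReLUs.

    Equals 2x on [0, 0.5] and 2 - 2x on [0.5, 1].
    """
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)


def deep_tent(x, depth):
    """Compose the tent map `depth` times (one two-ReLU layer per step)."""
    for _ in range(depth):
        x = tent(x)
    return x


def monotone_pieces(depth, grid_bits=12):
    """Count monotone pieces of deep_tent on [0, 1] using a dyadic grid.

    Assumes depth <= grid_bits so every breakpoint lies on the grid;
    dyadic grid points keep the floating-point arithmetic exact.
    """
    n = 2 ** grid_bits
    ys = [deep_tent(k / n, depth) for k in range(n + 1)]
    signs = [1 if b > a else -1 for a, b in zip(ys, ys[1:])]
    return 1 + sum(s != t for s, t in zip(signs, signs[1:]))


if __name__ == "__main__":
    for depth in (1, 2, 3, 5, 8):
        print(depth, 2 * depth, monotone_pieces(depth))
```

Each printed row gives the depth, the number of ReLUs used, and the number of monotone pieces. The piece count doubles with every added layer while the ReLU count grows only linearly, which is the sense in which depth produces oscillation cheaply in this toy setting.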
Author Information
Guy Bresler (MIT)
Dheeraj Nagaraj (Massachusetts Institute of Technology)
More from the Same Authors
- 2021 Spotlight: Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems
  Suhas Kowshik · Dheeraj Nagaraj · Prateek Jain · Praneeth Netrapalli
- 2021 Poster: Streaming Linear System Identification with Reverse Experience Replay
  Suhas Kowshik · Dheeraj Nagaraj · Prateek Jain · Praneeth Netrapalli
- 2021 Poster: The staircase property: How hierarchical structure can guide deep learning
  Emmanuel Abbe · Enric Boix-Adsera · Matthew S Brennan · Guy Bresler · Dheeraj Nagaraj
- 2021 Poster: Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems
  Suhas Kowshik · Dheeraj Nagaraj · Prateek Jain · Praneeth Netrapalli
- 2020 Poster: Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms
  Dheeraj Nagaraj · Xian Wu · Guy Bresler · Prateek Jain · Praneeth Netrapalli
- 2020 Spotlight: Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms
  Dheeraj Nagaraj · Xian Wu · Guy Bresler · Prateek Jain · Praneeth Netrapalli
- 2020 Poster: Learning Restricted Boltzmann Machines with Sparse Latent Variables
  Guy Bresler · Rares-Darius Buhai
- 2019 Poster: Sample Efficient Active Learning of Causal Trees
  Kristjan Greenewald · Dmitriy Katz · Karthikeyan Shanmugam · Sara Magliacane · Murat Kocaoglu · Enric Boix-Adsera · Guy Bresler
- 2018 Poster: Sparse PCA from Sparse Linear Regression
  Guy Bresler · Sung Min Park · Madalina Persu
- 2017: Community Detection and Invariance to Distribution
  Guy Bresler