Timezone: »

On the universality of deep learning
Emmanuel Abbe · Colin Sandon

Wed Dec 09 09:00 AM -- 11:00 AM (PST) @ Poster Session 3 #1112

This paper shows that deep learning, i.e., neural networks trained by SGD, can learn in polytime any function class that can be learned in polytime by some algorithm, including parities. This universal result is further shown to be robust, i.e., it holds under possibly poly-noise on the gradients, which gives a separation between deep learning and statistical query algorithms, as the latter are not comparably universal due to cases like parities. This also shows that SGD-based deep learning does not suffer from the limitations of the perceptron discussed by Minsky-Papert '69. The paper further complement this result with a lower-bound on the generalization error of descent algorithms, which implies in particular that the robust universality breaks down if the gradients are averaged over large enough batches of samples as in full-GD, rather than fewer samples as in SGD.

Author Information

Emmanuel Abbe (Princeton University)
Colin Sandon (MIT)

More from the Same Authors