Timezone: »
Spotlight
Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals
Surbhi Goel · Sushrut Karmalkar · Adam Klivans
We consider the problem of computing the best-fitting ReLU with
respect to square-loss on a training set when the examples have been
drawn according to a spherical Gaussian distribution (the labels can
be arbitrary). Let $\opt < 1$ be the population loss of the
best-fitting ReLU. We prove:
\begin{itemize}
\item Finding a ReLU with square-loss $\opt + \epsilon$ is as
hard as the problem of learning sparse parities with noise, widely thought
to be computationally intractable. This is the first hardness
result for learning a ReLU with respect to Gaussian marginals, and
our results imply --{\em unconditionally}-- that gradient descent cannot
converge to the global minimum in polynomial time.
\item There exists an efficient approximation algorithm for finding the
best-fitting ReLU that achieves error $O(\opt^{2/3})$. The
algorithm uses a novel reduction to noisy halfspace learning with
respect to $0/1$ loss.
\end{itemize}
Prior work due to Soltanolkotabi \cite{soltanolkotabi2017learning} showed that gradient descent {\em can} find the best-fitting ReLU with respect to Gaussian marginals, if the training set is {\em exactly} labeled by a ReLU.
Author Information
Surbhi Goel (The University of Texas at Austin)
Sushrut Karmalkar (The University of Texas at Austin)
Adam Klivans (UT Austin)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Poster: Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals »
Thu Dec 12th 01:00 -- 03:00 AM Room East Exhibition Hall B + C
More from the Same Authors
-
2020 Poster: From Boltzmann Machines to Neural Networks and Back Again »
Surbhi Goel · Adam Klivans · Frederic Koehler -
2020 Poster: Statistical-Query Lower Bounds via Functional Gradients »
Surbhi Goel · Aravind Gollakota · Adam Klivans -
2019 Poster: Outlier-Robust High-Dimensional Sparse Estimation via Iterative Filtering »
Ilias Diakonikolas · Daniel Kane · Sushrut Karmalkar · Eric Price · Alistair Stewart -
2019 Poster: List-decodable Linear Regression »
Sushrut Karmalkar · Adam Klivans · Pravesh Kothari -
2019 Spotlight: List-decodable Linear Regression »
Sushrut Karmalkar · Adam Klivans · Pravesh Kothari -
2017 Poster: Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks »
Surbhi Goel · Adam Klivans -
2014 Poster: Sparse Polynomial Learning and Graph Sketching »
Murat Kocaoglu · Karthikeyan Shanmugam · Alexandros Dimakis · Adam Klivans -
2014 Oral: Sparse Polynomial Learning and Graph Sketching »
Murat Kocaoglu · Karthikeyan Shanmugam · Alexandros Dimakis · Adam Klivans