Poster
Network size and size of the weights in memorization with two-layers neural networks
Sebastien Bubeck · Ronen Eldan · Yin Tat Lee · Dan Mikulincer
In 1988, Eric B. Baum showed that two-layer neural networks with a threshold activation function can perfectly memorize the binary labels of $n$ points in general position in $\mathbb{R}^d$ using only $\lceil n/d \rceil$ neurons. We observe that ReLU networks can fit arbitrary real labels using four times as many neurons. Moreover, for approximate memorization up to error $\epsilon$, the neural tangent kernel can also memorize with only $O\left(\frac{n}{d} \cdot \log(1/\epsilon) \right)$ neurons (assuming that the data is also well dispersed). We show, however, that these constructions give rise to networks in which the \emph{magnitude} of the neurons' weights is far from optimal. In contrast, we propose a new training procedure for ReLU networks, based on \emph{complex} (as opposed to \emph{real}) recombination of the neurons, for which we show approximate memorization with both $O\left(\frac{n}{d} \cdot \frac{\log(1/\epsilon)}{\epsilon}\right)$ neurons and nearly optimal size of the weights.
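To get a feel for the neuron counts quoted in the abstract, the following minimal Python sketch evaluates them for concrete values of $n$, $d$ and $\epsilon$. It is only an illustration: the function names are hypothetical, and the constant `c` hidden in the $O(\cdot)$ bounds, as well as the use of the natural logarithm, are assumptions that the abstract does not specify.

```python
import math


def baum_threshold_neurons(n: int, d: int) -> int:
    """Neurons in Baum's threshold construction: ceil(n / d)."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-n // d)


def relu_exact_neurons(n: int, d: int) -> int:
    """ReLU count for arbitrary real labels: four times Baum's count."""
    return 4 * baum_threshold_neurons(n, d)


def ntk_approx_neurons(n: int, d: int, eps: float, c: float = 1.0) -> float:
    """Order of (n/d) * log(1/eps); c is an ASSUMED constant (unknown)."""
    return c * (n / d) * math.log(1.0 / eps)


def complex_recombination_neurons(
    n: int, d: int, eps: float, c: float = 1.0
) -> float:
    """Order of (n/d) * log(1/eps) / eps; c is an ASSUMED constant."""
    return c * (n / d) * math.log(1.0 / eps) / eps


if __name__ == "__main__":
    n, d, eps = 1000, 10, 0.1
    print("Baum (threshold):", baum_threshold_neurons(n, d))
    print("ReLU (exact fit):", relu_exact_neurons(n, d))
    print("NTK (approx., c=1):", ntk_approx_neurons(n, d, eps))
    print("Complex recombination (approx., c=1):",
          complex_recombination_neurons(n, d, eps))
```

With `c = 1`, the last bound exceeds the neural tangent kernel bound by a factor of $1/\epsilon$; according to the abstract, that is the price paid in neurons for obtaining nearly optimal weight magnitudes.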
Author Information
Sebastien Bubeck (Microsoft Research)
Ronen Eldan (Weizmann)
Yin Tat Lee (UW)
Dan Mikulincer (Weizmann Institute)
More from the Same Authors
- 2021 Spotlight: Numerical Composition of Differential Privacy
  Sivakanth Gopi · Yin Tat Lee · Lukas Wutschitz
- 2021 Spotlight: A single gradient step finds adversarial examples on random two-layers neural networks
  Sebastien Bubeck · Yeshwanth Cherapanamjeri · Gauthier Gidel · Remi Tachet des Combes
- 2021 Spotlight: Private Non-smooth ERM and SCO in Subquadratic Steps
  Janardhan Kulkarni · Yin Tat Lee · Daogao Liu
- 2023 Poster: Learning threshold neurons via edge of stability
  Kwangjun Ahn · Sebastien Bubeck · Sinho Chewi · Yin Tat Lee · Felipe Suarez · Yi Zhang
- 2022 Spotlight: Lightning Talks 5B-2
  Conglong Li · Mohammad Azizmalayeri · Mojan Javaheripi · Pratik Vaishnavi · Jon Hasselgren · Hao Lu · Kevin Eykholt · Arshia Soltani Moakhar · Wenze Liu · Gustavo de Rosa · Nikolai Hofmann · Minjia Zhang · Zixuan Ye · Jacob Munkberg · Amir Rahmati · Arman Zarei · Subhabrata Mukherjee · Yuxiong He · Shital Shah · Reihaneh Zohrabi · Hongtao Fu · Tomasz Religa · Yuliang Liu · Mohammad Manzuri · Mohammad Hossein Rohban · Zhiguo Cao · Caio Cesar Teodoro Mendes · Sebastien Bubeck · Farinaz Koushanfar · Debadeepta Dey
- 2022 Spotlight: LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models
  Mojan Javaheripi · Gustavo de Rosa · Subhabrata Mukherjee · Shital Shah · Tomasz Religa · Caio Cesar Teodoro Mendes · Sebastien Bubeck · Farinaz Koushanfar · Debadeepta Dey
- 2022 Poster: Size and depth of monotone neural networks: interpolation and approximation
  Dan Mikulincer · Daniel Reichman
- 2022 Poster: Archimedes Meets Privacy: On Privately Estimating Quantiles in High Dimensions Under Minimal Assumptions
  Omri Ben-Eliezer · Dan Mikulincer · Ilias Zadik
- 2022 Poster: A gradient sampling method with complexity guarantees for Lipschitz functions in high and low dimensions
  Damek Davis · Dmitriy Drusvyatskiy · Yin Tat Lee · Swati Padmanabhan · Guanghao Ye
- 2022 Poster: Decomposable Non-Smooth Convex Optimization with Nearly-Linear Gradient Oracle Complexity
  Sally Dong · Haotian Jiang · Yin Tat Lee · Swati Padmanabhan · Guanghao Ye
- 2022 Poster: LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models
  Mojan Javaheripi · Gustavo de Rosa · Subhabrata Mukherjee · Shital Shah · Tomasz Religa · Caio Cesar Teodoro Mendes · Sebastien Bubeck · Farinaz Koushanfar · Debadeepta Dey
- 2021 Poster: Private Non-smooth ERM and SCO in Subquadratic Steps
  Janardhan Kulkarni · Yin Tat Lee · Daogao Liu
- 2021 Poster: Lower Bounds on Metropolized Sampling Methods for Well-Conditioned Distributions
  Yin Tat Lee · Ruoqi Shen · Kevin Tian
- 2021 Poster: Fast and Memory Efficient Differentially Private-SGD via JL Projections
  Zhiqi Bu · Sivakanth Gopi · Janardhan Kulkarni · Yin Tat Lee · Judy Hanwen Shen · Uthaipon Tantipongpipat
- 2021 Poster: Numerical Composition of Differential Privacy
  Sivakanth Gopi · Yin Tat Lee · Lukas Wutschitz
- 2021 Poster: Adversarial Examples in Multi-Layer Random ReLU Networks
  Peter Bartlett · Sebastien Bubeck · Yeshwanth Cherapanamjeri
- 2021 Poster: A single gradient step finds adversarial examples on random two-layers neural networks
  Sebastien Bubeck · Yeshwanth Cherapanamjeri · Gauthier Gidel · Remi Tachet des Combes
- 2021 Poster: A Universal Law of Robustness via Isoperimetry
  Sebastien Bubeck · Mark Sellke
- 2021 Oral: Lower Bounds on Metropolized Sampling Methods for Well-Conditioned Distributions
  Yin Tat Lee · Ruoqi Shen · Kevin Tian
- 2021 Oral: A Universal Law of Robustness via Isoperimetry
  Sebastien Bubeck · Mark Sellke
- 2020 Poster: Acceleration with a Ball Optimization Oracle
  Yair Carmon · Arun Jambulapati · Qijia Jiang · Yujia Jin · Yin Tat Lee · Aaron Sidford · Kevin Tian
- 2020 Oral: Acceleration with a Ball Optimization Oracle
  Yair Carmon · Arun Jambulapati · Qijia Jiang · Yujia Jin · Yin Tat Lee · Aaron Sidford · Kevin Tian
- 2019 Poster: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
  Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang
- 2019 Spotlight: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
  Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang
- 2019 Poster: The Randomized Midpoint Method for Log-Concave Sampling
  Ruoqi Shen · Yin Tat Lee
- 2019 Poster: Complexity of Highly Parallel Non-Smooth Convex Optimization
  Sebastien Bubeck · Qijia Jiang · Yin-Tat Lee · Yuanzhi Li · Aaron Sidford
- 2019 Spotlight: The Randomized Midpoint Method for Log-Concave Sampling
  Ruoqi Shen · Yin Tat Lee
- 2019 Spotlight: Complexity of Highly Parallel Non-Smooth Convex Optimization
  Sebastien Bubeck · Qijia Jiang · Yin-Tat Lee · Yuanzhi Li · Aaron Sidford
- 2018 Poster: Optimal Algorithms for Non-Smooth Distributed Optimization in Networks
  Kevin Scaman · Francis Bach · Sebastien Bubeck · Laurent Massoulié · Yin Tat Lee
- 2018 Oral: Optimal Algorithms for Non-Smooth Distributed Optimization in Networks
  Kevin Scaman · Francis Bach · Sebastien Bubeck · Laurent Massoulié · Yin Tat Lee
- 2018 Poster: Is Q-Learning Provably Efficient?
  Chi Jin · Zeyuan Allen-Zhu · Sebastien Bubeck · Michael Jordan
- 2015 Poster: Finite-Time Analysis of Projected Langevin Monte Carlo
  Sebastien Bubeck · Ronen Eldan · Joseph Lehec
- 2015 Poster: Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff
  Ofer Dekel · Ronen Eldan · Tomer Koren
- 2015 Spotlight: Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff
  Ofer Dekel · Ronen Eldan · Tomer Koren