NeurIPS Poster Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit

Poster

Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit

Jason Lee · Kazusato Oko · Taiji Suzuki · Denny Wu

West Ballroom A-D #7201

[ Abstract ]

[ Paper] [ OpenReview]

Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract: We study the problem of gradient descent learning of a single-index target function

f_{*} (x) = σ_{*} (⟨ x, θ ⟩)

$f_*(\boldsymbol{x}) = \textstyle\sigma_*\left(\langle\boldsymbol{x},\boldsymbol{\theta}\rangle\right)$ under isotropic Gaussian data in

R^{d}

$\mathbb{R}^d$ , where the unknown link function

σ_{*} : R \to R

$\sigma_*:\mathbb{R}\to\mathbb{R}$ has information exponent

p

$p$ (defined as the lowest degree in the Hermite expansion). Prior works showed that gradient-based training of neural networks can learn this target with

n ≳ d^{Θ (p)}

$n\gtrsim d^{\Theta(p)}$ samples, and such complexity is predicted to be necessary by the correlational statistical query lower bound. Surprisingly, we prove that a two-layer neural network optimized by an SGD-based algorithm (on the squared loss) learns

f_{*}

$f_*$ with a complexity that is not governed by the information exponent. Specifically, for arbitrary polynomial single-index models, we establish a sample and runtime complexity of

n ≃ T = Θ (d \cdot p o l y l o g d)

$n \simeq T = \Theta(d\cdot\mathrm{polylog} d)$ , where

Θ (\cdot)

$\Theta(\cdot)$ hides a constant only depending on the degree of

σ_{*}

$\sigma_*$ ; this dimension dependence matches the information theoretic limit up to polylogarithmic factors. More generally, we show that

n ≳ d^{(p_{*} - 1) \lor 1}

$n\gtrsim d^{(p_*-1)\vee 1}$ samples are sufficient to achieve low generalization error, where

p_{*} \leq p

$p_* \le p$ is the \textit{generative exponent} of the link function. Core to our analysis is the reuse of minibatch in the gradient computation, which gives rise to higher-order information beyond correlational queries.

Chat is not available.