Efficient Algorithms for Learning Depth-2 Neural Networks with General ReLU Activations
Pranjal Awasthi · Alex Tang · Aravindan Vijayaraghavan

Thu Dec 09 04:30 PM -- 06:00 PM (PST)
We present polynomial-time and sample-efficient algorithms for learning an unknown depth-2 feedforward neural network with general ReLU activations, under mild non-degeneracy assumptions. In particular, we consider learning an unknown network of the form $f(x) = {a}^{\mathsf{T}}\sigma({W}^\mathsf{T}x+b)$, where $x$ is drawn from the Gaussian distribution, and $\sigma(t) = \max(t,0)$ is the ReLU activation. Prior works on learning networks with ReLU activations assume that the bias ($b$) is zero. To handle the bias terms, our proposed algorithm robustly decomposes multiple higher-order tensors arising from the Hermite expansion of the function $f(x)$. Using these ideas, we also establish identifiability of the network parameters under very mild assumptions.
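
To make the setup concrete, the following minimal sketch evaluates a network of the form $f(x) = a^{\mathsf{T}}\sigma(W^{\mathsf{T}}x+b)$ on standard Gaussian inputs and estimates its degree-1 Hermite coefficient $\mathbb{E}[f(x)\,x]$ by Monte Carlo. It is an illustration only, not the authors' algorithm: the parameter values (`W_COLS`, `A`, `B`), the function names, and the choice of an identity-covariance Gaussian are assumptions made for this example. By Stein's lemma, the coefficient equals $\sum_i a_i\,\Phi(b_i/\|w_i\|)\,w_i$, where $\Phi$ is the standard normal CDF, so the bias already enters the lowest-order term.

```python
import math
import random

# Hypothetical example parameters (not from the paper): d = 3 inputs, k = 2 hidden units.
W_COLS = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]]  # columns w_i of W, each of unit norm
A = [1.0, -0.5]                               # output weights a_i
B = [0.3, -0.2]                               # bias terms b_i


def relu(t):
    """ReLU activation sigma(t) = max(t, 0)."""
    return max(t, 0.0)


def network(x, w_cols, a, b):
    """Evaluate f(x) = a^T relu(W^T x + b), with W given as a list of columns."""
    total = 0.0
    for a_i, w, b_i in zip(a, w_cols, b):
        pre_activation = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b_i
        total += a_i * relu(pre_activation)
    return total


def std_normal_cdf(t):
    """Standard normal CDF Phi(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def first_moment_closed_form(w_cols, a, b):
    """E[f(x) x] for x ~ N(0, I): sum_i a_i * Phi(b_i / ||w_i||) * w_i (Stein's lemma)."""
    d = len(w_cols[0])
    out = [0.0] * d
    for a_i, w, b_i in zip(a, w_cols, b):
        norm = math.sqrt(sum(w_j * w_j for w_j in w))
        prob_active = std_normal_cdf(b_i / norm)  # P(w_i^T x + b_i > 0)
        for j in range(d):
            out[j] += a_i * prob_active * w[j]
    return out


def first_moment_monte_carlo(w_cols, a, b, n_samples=100_000, seed=0):
    """Estimate E[f(x) x] by averaging f(x) * x over samples x ~ N(0, I)."""
    rng = random.Random(seed)
    d = len(w_cols[0])
    acc = [0.0] * d
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        fx = network(x, w_cols, a, b)
        for j in range(d):
            acc[j] += fx * x[j]
    return [v / n_samples for v in acc]


if __name__ == "__main__":
    print("closed form:", first_moment_closed_form(W_COLS, A, B))
    print("monte carlo:", first_moment_monte_carlo(W_COLS, A, B))
```

With $b = 0$ every $\Phi$ factor equals $1/2$ and the coefficient reduces to $\tfrac{1}{2}Wa$; with nonzero bias each unit is reweighted by a different, unknown factor, which gives some intuition for why several Hermite orders may need to be combined.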

Author Information

Pranjal Awasthi (Google)
Alex Tang (Northwestern University)
Aravindan Vijayaraghavan (Northwestern University)

More from the Same Authors