Workshop: OPT 2022: Optimization for Machine Learning

Gradient dynamics of single-neuron autoencoders on orthogonal data

Nikhil Ghosh · Spencer Frei · Wooseok Ha · Bin Yu


In this work, we investigate the dynamics of (stochastic) gradient descent when training a single-neuron ReLU autoencoder on orthogonal inputs. We show that for this non-convex problem there exists a manifold of global minima, all with the same maximum Hessian eigenvalue, and that gradient descent reaches a particular global minimum when initialized randomly. Interestingly, which minimum is reached depends heavily on the batch size. For full-batch gradient descent, the directions of the neuron that are initially positively correlated with the data are merely rescaled uniformly; hence, in high dimensions, the learned neuron is a near-uniform mixture of these directions. On the other hand, with batch size one the neuron exactly aligns with a single such direction, showing that a qualitatively different type of "feature selection" occurs when using a small batch size.
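
As an illustration of the full-batch claim, the following is a minimal NumPy sketch, not the authors' code. It rests on several assumptions that the abstract does not state: the autoencoder has tied weights, f(x; w) = w · ReLU(⟨w, x⟩); the loss is the squared reconstruction error; and the orthogonal inputs are the standard basis vectors. The function names, step size, initialization scale, and number of steps are all hypothetical choices made for this example.

```python
import numpy as np


def grad_single(w, x):
    """Gradient w.r.t. w of 0.5 * ||x - w * relu(w @ x)||^2 (assumed tied-weight model)."""
    a = w @ x
    if a <= 0:
        # ReLU is inactive: the output is zero and does not depend on w.
        return np.zeros_like(w)
    r = x - a * w  # reconstruction residual
    return -(a * r + (r @ w) * x)


def train(w0, X, lr, steps, batch_size, rng):
    """Minibatch gradient descent; batch_size >= len(X) gives full-batch GD."""
    w = w0.copy()
    n = X.shape[0]
    for _ in range(steps):
        if batch_size >= n:
            idx = np.arange(n)
        else:
            idx = rng.choice(n, size=batch_size, replace=False)
        g = np.mean([grad_single(w, X[i]) for i in idx], axis=0)
        w -= lr * g
    return w


rng = np.random.default_rng(0)
n = 20
X = np.eye(n)                        # assumed orthonormal inputs: rows are e_1, ..., e_n
w0 = 0.1 * rng.standard_normal(n)    # small random initialization (hypothetical scale)

w_full = train(w0, X, lr=0.5, steps=2000, batch_size=n, rng=rng)

pos = w0 > 0                         # directions initially positively correlated with the data
scale = w_full[pos] / w0[pos]        # per-direction rescaling factor under full-batch GD
print("rescaling factors:", scale)
print("norm of learned neuron:", np.linalg.norm(w_full))
```

In this assumed setup, every coordinate with a positive initial value receives the same multiplicative update at each full-batch step, so the entries of `scale` should coincide, while the remaining coordinates shrink toward zero. Calling `train` with `batch_size=1` gives the stochastic variant for comparison; how quickly any alignment with a single direction shows up in a finite run would depend on the step size and the number of steps.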
