Poster in Workshop: OPT 2023: Optimization for Machine Learning

An alternative approach to train neural networks using monotone variational inequality

Chen Xu · Xiuyuan Cheng · Yao Xie


Abstract

We investigate an alternative approach to neural network training, which is a non-convex optimization problem, through the lens of a convex problem: solving a monotone variational inequality (MVI), inspired by the work of [Juditsky and Nemirovsky, 2019]. MVI solutions can be found by computationally efficient procedures, with performance guarantees in the form of $\ell_2$ and $\ell_{\infty}$ bounds on model recovery and prediction accuracy in the theoretical setting of training a single-layer linear neural network. We extend the use of MVI to multi-layer neural networks by proposing a practical and completely general algorithm called stochastic variational inequality (SVI), and we demonstrate its applicability in training networks with various architectures. On both synthetic and real-data prediction tasks, SVI performs competitively with or better than the widely used stochastic gradient descent (SGD) method across various performance metrics, most notably in improved efficiency during the early stage of training.
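To make the single-layer setting concrete, the following is a minimal sketch, not the authors' implementation. It assumes a model of the form y ≈ sigmoid(<w, x>) and an empirical operator F(w) = mean_i x_i (sigmoid(<w, x_i>) - y_i), in the spirit of the generalized-linear-model setting of [Juditsky and Nemirovsky, 2019]. The sigmoid link, the full-batch update, the step size, and the names `vi_operator` and `fit_vi` are all illustrative assumptions. Unlike the gradient of the squared loss, this operator does not multiply the residual by the derivative of the activation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def vi_operator(w, xs, ys):
    """Empirical operator F(w) = mean_i x_i * (sigmoid(<w, x_i>) - y_i).

    Assumed form for illustration; note there is no sigmoid'(.) factor,
    which is what distinguishes it from the squared-loss gradient.
    """
    d = len(w)
    f = [0.0] * d
    for x, y in zip(xs, ys):
        residual = sigmoid(sum(wj * xj for wj, xj in zip(w, x))) - y
        for j in range(d):
            f[j] += residual * x[j] / len(xs)
    return f


def fit_vi(xs, ys, step=1.0, iters=2000):
    """Iterate w <- w - step * F(w), starting from zero (hypothetical helper)."""
    w = [0.0] * len(xs[0])
    for _ in range(iters):
        f = vi_operator(w, xs, ys)
        w = [wj - step * fj for wj, fj in zip(w, f)]
    return w


# Synthetic, noise-free data generated from a known weight vector.
rng = random.Random(0)
w_true = [1.0, -2.0]
xs = [[rng.gauss(0.0, 1.0) for _ in w_true] for _ in range(200)]
ys = [sigmoid(sum(wj * xj for wj, xj in zip(w_true, x))) for x in xs]

w_hat = fit_vi(xs, ys)
print("recovered weights:", w_hat)
```

Because the sigmoid is increasing, the assumed operator is monotone in w, so the plain fixed-step iteration above is enough for this toy problem. A stochastic variant would evaluate the operator on mini-batches rather than on the full data set.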
