Poster in Workshop: Has it Trained Yet? A Workshop for Algorithmic Efficiency in Practical Neural Network Training

LOFT: Finding Lottery Tickets through Filter-wise Training

Qihan Wang · Chen Dun · Fangshuo Liao · Christopher Jermaine · Anastasios Kyrillidis


Abstract: In this paper, we explore how one can efficiently identify the emergence of ``winning tickets'' using distributed training techniques, and use this observation to design efficient pretraining algorithms. Our focus in this work is on convolutional neural networks (CNNs), which are more complex than simple multi-layer perceptrons, yet simple enough to expose our ideas. To identify good filters within winning tickets, we propose a novel filter distance metric that represents model convergence well, without requiring knowledge of the true winning ticket or full training of the model. Our filter analysis is consistent with recent findings on neural network learning dynamics. Motivated by this analysis, we present the \emph{LOttery ticket through Filter-wise Training} algorithm, dubbed \textsc{LoFT}. \textsc{LoFT} is a model-parallel pretraining algorithm that partitions the convolutional layers of a CNN by filters and trains them independently on different distributed workers, reducing memory and communication costs during pretraining. Experiments show that \textsc{LoFT} $i)$ preserves and finds good lottery tickets, and $ii)$ achieves non-trivial savings in computation and communication while maintaining comparable or even better accuracy than other pretraining methods.
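The core mechanism the abstract describes, partitioning a convolutional layer's filters across workers so that each worker trains only its own slice, can be sketched as follows. This is a minimal illustration in a PyTorch setting, not the authors' implementation; the helper names `partition_conv` and `merge_conv`, the even split along output channels, and the worker count are all illustrative assumptions.

```python
# Sketch: split a Conv2d's output filters into independent sub-layers (one per
# hypothetical worker), then concatenate the slices back into a single layer.
# This reflects only the filter-wise partitioning idea from the abstract, not
# LoFT's training schedule or its filter distance metric.
import torch
import torch.nn as nn

def partition_conv(conv: nn.Conv2d, num_workers: int):
    """Split a conv layer's output filters into `num_workers` sub-layers."""
    w_chunks = torch.chunk(conv.weight.data, num_workers, dim=0)  # out-channel axis
    b_chunks = (torch.chunk(conv.bias.data, num_workers, dim=0)
                if conv.bias is not None else [None] * num_workers)
    sub_layers = []
    for w, b in zip(w_chunks, b_chunks):
        sub = nn.Conv2d(conv.in_channels, w.shape[0], conv.kernel_size,
                        stride=conv.stride, padding=conv.padding,
                        bias=conv.bias is not None)
        sub.weight.data.copy_(w)
        if b is not None:
            sub.bias.data.copy_(b)
        sub_layers.append(sub)  # each slice could be trained on its own worker
    return sub_layers

def merge_conv(sub_layers):
    """Concatenate independently trained filter slices back into one layer."""
    ref = sub_layers[0]
    weight = torch.cat([s.weight.data for s in sub_layers], dim=0)
    has_bias = ref.bias is not None
    merged = nn.Conv2d(ref.in_channels, weight.shape[0], ref.kernel_size,
                       stride=ref.stride, padding=ref.padding, bias=has_bias)
    merged.weight.data.copy_(weight)
    if has_bias:
        merged.bias.data.copy_(torch.cat([s.bias.data for s in sub_layers], dim=0))
    return merged

# Usage: split a 64-filter layer across 4 workers, then recombine; the merged
# layer reproduces the original's output exactly since weights are unchanged.
layer = nn.Conv2d(3, 64, kernel_size=3, padding=1)
slices = partition_conv(layer, num_workers=4)
restored = merge_conv(slices)
x = torch.randn(1, 3, 8, 8)
assert torch.allclose(layer(x), restored(x))
```

Because each sub-layer holds only a fraction of the filters, a worker's memory footprint and the gradients it must communicate shrink proportionally, which is the source of the savings the abstract reports.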
