
LOFT: Finding Lottery Tickets through Filter-wise Training
Qihan Wang · Chen Dun · Fangshuo Liao · Christopher Jermaine · Anastasios Kyrillidis
Event URL: https://openreview.net/forum?id=X1N9YExjEF
In this paper, we explore how to efficiently identify the emergence of "winning tickets" using distributed training techniques, and we use this observation to design efficient pretraining algorithms. Our focus is on convolutional neural networks (CNNs), which are more complex than simple multi-layer perceptrons but simple enough to expose our ideas. To identify good filters within winning tickets, we propose a novel filter distance metric that represents model convergence well, without the need to know the true winning ticket or to fully train the model. Our filter analysis behaves consistently with recent findings on neural network learning dynamics. Motivated by this analysis, we present the LOttery ticket through Filter-wise Training algorithm, dubbed LoFT. LoFT is a model-parallel pretraining algorithm that partitions the convolutional layers of a CNN by filters and trains them independently on different distributed workers, leading to reduced memory and communication costs during pretraining. Experiments show that LoFT (i) preserves and finds good lottery tickets, while (ii) achieving non-trivial savings in computation and communication and maintaining comparable or even better accuracy than other pretraining methods.
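To make the idea of partitioning convolutional layers by filters more concrete, the following is a minimal sketch and not the authors' implementation. It assumes two consecutive convolutional layers, a random disjoint split of the first layer's filters across workers, and a merge step that simply writes each worker's filters back into place. The function names (partition_filters, extract_subnetwork, merge_subnetworks), the nested-list weight layout, and the merge rule are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import random


def partition_filters(num_filters, num_workers, seed=0):
    """Randomly split filter indices into disjoint, near-equal groups (assumed scheme)."""
    indices = list(range(num_filters))
    random.Random(seed).shuffle(indices)
    # Worker w takes every num_workers-th index of the shuffled list.
    return [sorted(indices[w::num_workers]) for w in range(num_workers)]


def extract_subnetwork(layer1, layer2, idx):
    """Keep the filters in idx from layer 1 and the matching input channels of layer 2."""
    sub1 = [layer1[f] for f in idx]
    sub2 = [[out_filter[f] for f in idx] for out_filter in layer2]
    return sub1, sub2


def merge_subnetworks(num_filters, num_out, parts, subnets):
    """Write each worker's (possibly updated) filters back into the full layers."""
    layer1 = [None] * num_filters
    layer2 = [[None] * num_filters for _ in range(num_out)]
    for idx, (sub1, sub2) in zip(parts, subnets):
        for local, f in enumerate(idx):
            layer1[f] = sub1[local]
            for o in range(num_out):
                layer2[o][f] = sub2[o][local]
    return layer1, layer2


# Toy weights: layer 1 has 8 filters (3 input channels, 3x3 kernels, flattened);
# layer 2 has 4 filters, each holding one flattened 3x3 kernel per layer-1 filter.
layer1 = [[float(f)] * 27 for f in range(8)]
layer2 = [[[float(o * 8 + f)] * 9 for f in range(8)] for o in range(4)]

parts = partition_filters(num_filters=8, num_workers=4)
subnets = [extract_subnetwork(layer1, layer2, idx) for idx in parts]
# Each worker would train its own sub-network here; this sketch skips training.
merged1, merged2 = merge_subnetworks(8, 4, parts, subnets)
```

In this sketch each worker holds only its share of the first layer's filters and the corresponding input channels of the second layer, which is where the memory and communication savings described above would come from under these assumptions.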

Author Information

Qihan Wang (Rice University)
Chen Dun (Rice University)
Fangshuo Liao (Rice University)
Christopher Jermaine (Rice University)
Anastasios Kyrillidis (Rice University)
