

Poster in Workshop: Third Workshop on Efficient Natural Language and Speech Processing (ENLSP-III): Towards the Future of Large Language Models and their Emerging Descendants

SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling

Habib Hajimolahoseini · Omar Mohamed Awad · Walid Ahmed · Austin Wen · Saina Asani · Mohammad Hassanpour · Farnoosh Javadi · Mehdi Ahmadi · Foozhan Ataiefard · Kangling Liu · Yang Liu


Abstract:

In this paper, we present SwiftLearn, a data-efficient approach to accelerate training of deep learning models using a subset of data samples selected during the warm-up stages of training. This subset is selected based on an importance criterion measured over the entire dataset during the warm-up stages, with the aim of preserving model performance while using fewer examples for the rest of training. The proposed importance measure can be updated periodically during training, so that any data sample whose importance rises has a chance to return to the training loop. The model architecture is unchanged, but since the number of data samples controls the number of forward and backward passes during training, we can reduce the training time by reducing the number of training samples used in each epoch. Experimental results on a variety of CV and NLP models, during both pre-training and fine-tuning, show that model performance can be preserved while achieving a significant speed-up in training. More specifically, BERT fine-tuning on the GLUE benchmark shows that almost 90% of the data can be dropped, achieving an end-to-end average speedup of 3.36x while keeping the average accuracy drop below 0.92%.
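The abstract does not spell out the importance criterion or the schedule used in SwiftLearn, so the following is only a minimal sketch of the warm-up-then-subset idea it describes. The assumptions here (per-sample loss as a proxy for importance, a 10% kept fraction, and refreshing importance every few epochs) are illustrative choices, not details taken from the paper.

```python
# Sketch of warm-up training on the full dataset, followed by training on an
# "important" subset that is periodically re-scored over the entire dataset.
# Importance = current per-sample loss is an assumption, not the paper's measure.
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader, Subset

torch.manual_seed(0)

# Synthetic stand-in for a real dataset.
X = torch.randn(2048, 32)
y = (X.sum(dim=1) > 0).long()
full_dataset = TensorDataset(X, y)

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss(reduction="none")  # per-sample losses


def score_importance(model, dataset, batch_size=256):
    """Score every sample in the full dataset (here: by its current loss)."""
    model.eval()
    scores = []
    with torch.no_grad():
        for xb, yb in DataLoader(dataset, batch_size=batch_size):
            scores.append(criterion(model(xb), yb))
    return torch.cat(scores)


def train_one_epoch(model, dataset, batch_size=256):
    model.train()
    for xb, yb in DataLoader(dataset, batch_size=batch_size, shuffle=True):
        optimizer.zero_grad()
        loss = criterion(model(xb), yb).mean()
        loss.backward()
        optimizer.step()


warmup_epochs, total_epochs = 2, 10
keep_fraction, refresh_every = 0.1, 3  # assumed values, not from the paper

# Warm-up: train on the full dataset.
for _ in range(warmup_epochs):
    train_one_epoch(model, full_dataset)

# Rest of training: keep only the highest-importance samples, re-scoring the
# whole dataset periodically so samples whose importance rises can re-enter.
subset = None
for epoch in range(warmup_epochs, total_epochs):
    if subset is None or (epoch - warmup_epochs) % refresh_every == 0:
        scores = score_importance(model, full_dataset)
        k = int(keep_fraction * len(full_dataset))
        top_idx = torch.topk(scores, k).indices.tolist()
        subset = Subset(full_dataset, top_idx)
    train_one_epoch(model, subset)
```

Because only about 10% of the samples pass through forward and backward passes after warm-up, each post-warm-up epoch costs roughly a tenth of a full epoch, which is the mechanism behind the reported end-to-end speedup.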
