The ever-increasing computational complexity of deep learning models makes their training and deployment difficult on various cloud and edge platforms. Replacing floating-point arithmetic with low-bit integer arithmetic is a promising approach to reduce the energy consumption, memory footprint, and latency of deep learning models. As such, quantization has attracted the attention of researchers in recent years. However, using integer numbers to form a fully functional integer training pipeline, including the forward pass, back-propagation, and stochastic gradient descent, has not been studied in detail. Our empirical and mathematical results reveal that integer arithmetic is sufficient to train deep learning models. Unlike recent proposals, instead of quantization, we directly switch the number representation of computations. Our novel training method forms a fully integer training pipeline that does not change the trajectory of the loss and accuracy compared to floating-point, nor does it require any special hyper-parameter tuning, distribution adjustment, or gradient clipping. Our experimental results show that the proposed method is effective in a wide variety of tasks such as classification (including vision transformers), object detection, and semantic segmentation.
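To make the idea of an integer-only training loop concrete, the sketch below trains a single linear layer on a toy regression task using fixed-point integers for the forward pass, back-propagation, and the SGD update. This is a minimal illustration of integer-arithmetic training in general, not the method proposed in the paper (which switches the number representation of computations rather than quantizing); the fractional scale `SHIFT`, the helper names, and the toy data are assumptions made only for this example.

```python
# Minimal, illustrative sketch (NOT the authors' implementation): training a single
# linear layer with fixed-point integer arithmetic only. All tensors are int64;
# a value v is represented as round(v * 2**SHIFT).
import numpy as np

SHIFT = 8                      # fractional bits (an assumption for this toy example)
ONE = 1 << SHIFT

def to_fix(x):
    # Convert a float (or float array) to its fixed-point integer representation.
    return np.round(np.asarray(x) * ONE).astype(np.int64)

def fix_mul(a, b):
    # Element-wise fixed-point multiply: integer product, then rescale.
    return (a * b) >> SHIFT

def fix_matmul(a, b):
    # Fixed-point matrix multiply: integer matmul, then rescale.
    return (a @ b) >> SHIFT

rng = np.random.default_rng(0)
# Toy regression data: y = 3 * x, stored in fixed point.
x = to_fix(rng.uniform(-1, 1, size=(64, 1)))
y = fix_mul(x, to_fix(3.0))

w = to_fix([[0.1]])            # weight, stored as an integer
lr = to_fix(0.1)               # learning rate, also fixed point

for step in range(300):
    # Forward pass: integer matrix multiply.
    pred = fix_matmul(x, w)
    err = pred - y                              # integer subtraction
    # Back-propagation: gradient of 0.5 * mean(err^2) w.r.t. w, integers only.
    grad = fix_matmul(x.T, err) // x.shape[0]
    # SGD update, also in integer arithmetic.
    w = w - fix_mul(lr, grad)

print("learned weight ~", (w / ONE).item())     # roughly 3.0
```

Even this naive fixed-point scheme roughly recovers the target weight on a toy problem; the paper's claim is that a suitable integer number representation preserves the floating-point loss and accuracy trajectory on full-scale models without such hand-tuned scaling.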
Author Information
Alireza Ghaffari (McGill University)
Marzieh S. Tahaei (McGill University)
Mohammadreza Tayaranian
Masoud Asgharian (McGill University)
Vahid Partovi Nia (Huawei Noah's Ark Lab)
More from the Same Authors
- 2021 : Compressing Pre-trained Language Models using Progressive Low Rank Decomposition
  Habib Hajimolahoseini · Mehdi Rezaghoizadeh · Vahid Partovi Nia · Marzieh Tahaei · Omar Mohamed Awad · Yang Liu
- 2021 : Kronecker Decomposition for GPT Compression
  Ali Edalati · Marzieh Tahaei · Ahmad Rashid · Vahid Partovi Nia · James J. Clark · Mehdi Rezaghoizadeh
- 2022 : Strategies for Applying Low Rank Decomposition to Transformer-Based Models
  Habib Hajimolahoseini · Walid Ahmed · Mehdi Rezaghoizadeh · Vahid Partovi Nia · Yang Liu
- 2021 Poster: Demystifying and Generalizing BinaryConnect
  Tim Dockhorn · Yaoliang Yu · Eyyüb Sari · Mahdi Zolnouri · Vahid Partovi Nia
- 2021 Poster: S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks
  Xinlin Li · Bang Liu · Yaoliang Yu · Wulong Liu · Chunjing XU · Vahid Partovi Nia
- 2018 : Poster session
  David Zeng · Marzieh S. Tahaei · Shuai Chen · Felix Meister · Meet Shah · Anant Gupta · Ajil Jalal · Eirini Arvaniti · David Zimmerer · Konstantinos Kamnitsas · Pedro Ballester · Nathaniel Braman · Udaya Kumar · Sil C. van de Leemput · Junaid Qadir · Hoel Kervadec · Mohamed Akrout · Adrian Tousignant · Matthew Ng · Raghav Mehta · Miguel Monteiro · Sumana Basu · Jonas Adler · Adrian Dalca · Jizong Peng · Sungyeob Han · Xiaoxiao Li · Karthik Gopinath · Joseph Cheng · Bogdan Georgescu · Kha Gia Quach · Karthik Sarma · David Van Veen