Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks
Xiao Sun · Jungwook Choi · Chia-Yu Chen · Naigang Wang · Swagath Venkataramani · Vijayalakshmi (Viji) Srinivasan · Xiaodong Cui · Wei Zhang · Kailash Gopalakrishnan

Thu Dec 12 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #159

Reducing the numerical precision of data and computation is extremely effective in accelerating deep learning training workloads. Towards this end, 8-bit floating point representations (FP8) were recently proposed for DNN training. However, their applicability was demonstrated on only a few selected models, and significant degradation is observed when popular networks such as MobileNet and Transformer are trained using FP8. This degradation is due to the inherent difference in precision requirements between the forward and backward passes of DNN training. Using theoretical insights, we propose a hybrid FP8 (HFP8) format and an end-to-end distributed DNN training procedure. We demonstrate, using HFP8, the successful training of deep learning models across a whole spectrum of applications including Image Classification, Object Detection, Language and Speech without accuracy degradation. Finally, we demonstrate that, by using the new 8-bit format, we can directly quantize a pre-trained model down to 8 bits without losing accuracy by simply fine-tuning batch normalization statistics. These novel techniques enable a new generation of 8-bit hardware that is robust for building and deploying neural network models.
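
To make the idea of different 8-bit floating point layouts concrete, here is a minimal sketch that rounds a Python float to a simulated small floating point format. It is not the authors' implementation. The abstract does not state the bit layouts, so the (4 exponent, 3 mantissa) and (5 exponent, 2 mantissa) splits used in the usage note are assumptions chosen for illustration, as are the IEEE-style exponent bias, the saturation on overflow, and the function name `quantize_minifloat`.

```python
import math


def quantize_minifloat(x, exp_bits, man_bits):
    """Round x to the nearest value of a simulated sign/exponent/mantissa format.

    Illustrative assumptions: IEEE-style bias, subnormals kept, top exponent
    code reserved, and out-of-range values saturated to the largest finite value.
    """
    if x == 0.0 or math.isnan(x):
        return x
    bias = 2 ** (exp_bits - 1) - 1
    min_exp = 1 - bias                       # exponent of the smallest normal
    max_exp = (2 ** exp_bits - 2) - bias     # exponent of the largest normal
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** max_exp
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    if math.isinf(mag):
        return sign * max_val
    # frexp gives mag = m * 2**e with 0.5 <= m < 1, so the binade is e - 1.
    _, e = math.frexp(mag)
    e = max(e - 1, min_exp)                  # subnormals share the min_exp spacing
    step = 2.0 ** (e - man_bits)             # gap between neighbours in this binade
    q = round(mag / step) * step             # round-half-to-even on the grid
    return sign * min(q, max_val)
```

Under these assumptions, `quantize_minifloat(0.35, 4, 3)` gives `0.34375` while `quantize_minifloat(0.35, 5, 2)` gives `0.375`, so the 3-bit mantissa is finer near typical weight and activation magnitudes. In the other direction, `quantize_minifloat(1e-4, 4, 3)` flushes to `0.0`, whereas `quantize_minifloat(1e-4, 5, 2)` keeps a nonzero value, which shows how a wider exponent preserves small values such as gradients.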

Author Information

Xiao Sun (IBM Thomas J. Watson Research Center)
Jungwook Choi (Hanyang University)
Chia-Yu Chen (IBM Research)

My research areas focus on: accelerator architecture; compiler design and library development; machine learning and neural networks; VLSI and nano devices.

Naigang Wang (IBM T. J. Watson Research Center)
Swagath Venkataramani (IBM Research)
Vijayalakshmi (Viji) Srinivasan (IBM TJ Watson)
Xiaodong Cui (IBM T. J. Watson Research Center)
Wei Zhang (IBM T.J.Watson Research Center)

BE, Beijing Univ of Technology, 2005; MSc, Technical University of Denmark, 2008; PhD, University of Wisconsin, Madison, 2013 (all in computer science). Published papers in ASPLOS, OOPSLA, OSDI, PLDI, IJCAI, ICDM, NIPS.

Kailash Gopalakrishnan (IBM Research)

More from the Same Authors