Training DNNs with Hybrid Block Floating Point
Mario Drumond · Tao Lin · Martin Jaggi · Babak Falsafi

Tue Dec 04 07:45 AM -- 09:45 AM (PST) @ Room 517 AB #135

The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed full-precision floating-point arithmetic to maximize performance per area. Ongoing research efforts seek to further increase that performance density by replacing floating-point with fixed-point arithmetic. However, a significant roadblock for these attempts has been fixed point's narrow dynamic range, which is insufficient for DNN training convergence. We identify block floating point (BFP) as a promising alternative representation since it exhibits wide dynamic range and enables the majority of DNN operations to be performed with fixed-point logic. Unfortunately, BFP alone introduces several limitations that preclude its direct applicability. In this work, we introduce HBFP, a hybrid BFP-FP approach, which performs all dot products in BFP and other operations in floating point. HBFP delivers the best of both worlds: the high accuracy of floating point at the superior hardware density of fixed point. For a wide variety of models, we show that HBFP matches floating point's accuracy while enabling hardware implementations that deliver up to 8.5x higher throughput.
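To make the idea of a BFP dot product concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function names `to_bfp` and `bfp_dot`, the 8-bit mantissa width, treating a whole vector as one block, and round-to-nearest quantization are all assumptions chosen for illustration. Each block of values shares a single exponent, so the multiply-accumulate loop runs on integers only, and the result is rescaled to floating point once at the end.

```python
import math

def to_bfp(values, mantissa_bits=8):
    """Quantize a block of floats to signed integer mantissas plus one shared exponent.

    Each value is approximated as mantissa * 2**exponent.
    The mantissa width is an illustrative assumption.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 0
    # frexp returns (m, e) with max_abs == m * 2**e and 0.5 <= m < 1.
    _, e = math.frexp(max_abs)
    # Scale so the largest magnitude fits in a signed mantissa_bits-wide integer.
    shift = mantissa_bits - 1 - e
    limit = (1 << (mantissa_bits - 1)) - 1
    mantissas = [
        max(-limit, min(limit, round(math.ldexp(v, shift))))
        for v in values
    ]
    return mantissas, -shift

def bfp_dot(a, b, mantissa_bits=8):
    """Dot product where every multiply-accumulate is done on integers."""
    ma, ea = to_bfp(a, mantissa_bits)
    mb, eb = to_bfp(b, mantissa_bits)
    acc = sum(x * y for x, y in zip(ma, mb))  # fixed-point (integer) arithmetic only
    # A single rescale by the two shared exponents converts back to floating point.
    return math.ldexp(acc, ea + eb)

print(bfp_dot([1.0, 0.5, -0.25], [2.0, 4.0, 8.0]))
```

In a hybrid scheme of the kind the abstract describes, only dot products like this one would use the shared-exponent format, while the remaining operations would stay in ordinary floating point.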

Author Information

Mario Drumond (EPFL)
Tao Lin (EPFL)
Martin Jaggi (EPFL)
Babak Falsafi (EcoCloud, EPFL)

Babak is a Professor in the School of Computer and Communication Sciences and the founding director of the EcoCloud research center at EPFL. He has made numerous contributions to computer system design and evaluation, including multiprocessor architecture for the WildCat/WildFire servers by Sun Microsystems (now Oracle), memory prefetching technologies in IBM BlueGene and ARM cores, and server evaluation methodologies used by AMD, HPE and Google PerfKit. His recent work on workload-optimized server processors lays the foundation for Cavium ThunderX. He is a recipient of a number of distinctions, including a Sloan Research Fellowship. He is a fellow of ACM and IEEE.