

Invited Talk in Workshop on Advancing Neural Network Training (WANT): Computational Efficiency, Scalability, and Resource Optimization

Enabling Efficient Trillion Parameter Scale Training for Deep Learning Models

Olatunji Ruwase

Sat 16 Dec 9 a.m. PST — 9:30 a.m. PST

Abstract:

Deep Learning (DL) is driving unprecedented progress across a wide range of Artificial Intelligence domains, including natural language processing, vision, speech, and multimodal learning. However, sustaining this AI revolution requires practical solutions to the extreme demands that model scaling places on the compute, memory, communication, and storage components of modern computing hardware. To address this challenge, we created a deep learning optimization library called DeepSpeed to make distributed model training and inference efficient, effective, and easy on commodity hardware. This talk will focus on DeepSpeed training optimizations, particularly ZeRO and DeepSpeed-MoE, which help address the memory and compute requirements of extreme model scaling.
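As an illustrative sketch (not part of the talk itself), the snippet below shows how a ZeRO-style memory optimization is typically enabled through a DeepSpeed configuration; the model, batch size, learning rate, and ZeRO stage are placeholder assumptions chosen only for demonstration.

    # Hypothetical example: enabling ZeRO stage 2 via a DeepSpeed config.
    # All numeric settings and the toy model are illustrative placeholders.
    import torch
    import deepspeed

    model = torch.nn.Linear(1024, 1024)  # stand-in for a large transformer

    ds_config = {
        "train_batch_size": 32,
        "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
        "zero_optimization": {"stage": 2},  # partition optimizer states and gradients
    }

    # deepspeed.initialize wraps the model and optimizer for distributed training
    model_engine, optimizer, _, _ = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )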

Speaker's Bio: Olatunji (Tunji) Ruwase is a co-founder and Principal Research Sciences Manager of the DeepSpeed project at Microsoft. His broad industry and research background spans compilers, operating systems, and hardware accelerators. He is currently interested in building systems and convergence optimizations, as well as frameworks, for distributed training and inference of deep learning models. His research results on DL training, inference, and hyperparameter search are used in multiple Microsoft systems and products, such as Bing, Ads, HyperDrive, and Catapult.
