
Efficient Algorithms for Device Placement of DNN Graph Operators
Jakub Tarnawski · Amar Phanishayee · Nikhil Devanur · Divya Mahajan · Fanny Nina Paravecino

Wed Dec 09 09:00 AM -- 11:00 AM (PST) @ Poster Session 3 #1043

Modern machine learning workloads use large models with complex structures that are very expensive to execute. The devices that execute these models are becoming increasingly heterogeneous, with a flourishing of Domain-Specific Architectures (DSAs) offered as hardware accelerators alongside CPUs. These trends necessitate distributing the workload across multiple devices. Recent work has shown that significant gains can be obtained with model parallelism, i.e., partitioning a neural network's computational graph onto multiple devices. In particular, this form of parallelism assumes a pipeline of devices that is fed a stream of samples and yields high throughput for training and inference of DNNs. However, such settings (large models and multiple heterogeneous devices) require automated algorithms and toolchains that can partition the ML workload across devices.

In this paper, we identify and isolate the structured optimization problem at the core of device placement of DNN operators, for both inference and training, especially in modern pipelined settings. We then provide algorithms that solve this problem to optimality. We demonstrate the applicability and efficiency of our approaches using several contemporary DNN computation graphs.
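To make the optimization problem concrete: in a pipelined setting, throughput is limited by the slowest stage, so a natural simplified objective is to split the operators into contiguous stages that minimize the bottleneck stage cost. The sketch below is a minimal illustration of that idea on a linear chain of operators with assumed per-operator costs; it is not the paper's algorithm (which handles general DNN graphs, training, and heterogeneous devices), and all names and costs are hypothetical.

```python
import math

def min_bottleneck_partition(costs, k):
    """Split a chain of operator costs into k contiguous pipeline stages,
    minimizing the cost of the most expensive stage (the pipeline bottleneck).

    Hypothetical illustration only: real device placement must also account
    for communication costs, memory limits, and heterogeneous device speeds.
    """
    n = len(costs)
    # prefix[i] = total cost of the first i operators
    prefix = [0.0] * (n + 1)
    for i, c in enumerate(costs):
        prefix[i + 1] = prefix[i] + c

    # dp[j][i] = best achievable bottleneck when the first i operators
    # are split into j stages
    dp = [[math.inf] * (n + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(1, n + 1):
            # s = number of operators assigned to the first j-1 stages
            for s in range(j - 1, i):
                stage_cost = prefix[i] - prefix[s]
                dp[j][i] = min(dp[j][i], max(dp[j - 1][s], stage_cost))
    return dp[k][n]

# Example: five operators with costs [4, 2, 3, 5, 1] on two devices.
# The best split is [4, 2] | [3, 5, 1] or [4, 2, 3] | [5, 1], bottleneck 9.
print(min_bottleneck_partition([4, 2, 3, 5, 1], 2))
```

This O(k·n²) dynamic program conveys why the problem is "structured": contiguity of stages makes an exhaustive split search tractable, whereas placement over a general DAG with communication costs is what requires the more careful algorithms developed in the paper.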

Author Information

Jakub Tarnawski (Microsoft Research)
Amar Phanishayee (Microsoft Research)
Nikhil Devanur (Amazon)
Divya Mahajan (Microsoft)
Fanny Nina Paravecino (Microsoft)

Dr. Nina-Paravecino is currently a Senior Researcher in the AI and Advanced Architectures group at Microsoft, where she leads efforts to improve the performance of deep learning workloads. Previously, she was a Research Scientist at Intel Corporation, helping push Intel's ground-breaking volumetric reconstruction technology using deep learning. Her past work has contributed to efficiently exploiting GPU architectures and has enabled the identification of bottlenecks in a myriad of applications, including image processing and video analytics. Dr. Nina-Paravecino received her Ph.D. in Computer Engineering from Northeastern University, her M.Sc. in Computer Engineering from the University of Puerto Rico at Mayaguez, and her B.S. in Systems and Informatics Engineering from the University of San Antonio Abad of Cusco, Peru. She has served as a PC member and reviewer for various journals, conferences, and workshops, including IEEE Transactions on Image Processing 2017, JPDC 2017, CF 2018, PPoPP 2018, SC 2018, GPGPU 2018, PARCO 2018, IA^3 2019, SC 2019, DAC 2020, ICCD 2020, and HPCA 2021. Most recently, Dr. Nina-Paravecino was co-chair of the Video Analytics mini-track at HICSS 2020.
