Poster
Bayesian Distributed Stochastic Gradient Descent
Michael Teng · Frank Wood

Tue Dec 04 07:45 AM -- 09:45 AM (PST) @ Room 210 #61

We introduce Bayesian distributed stochastic gradient descent (BDSGD), a high-throughput algorithm for training deep neural networks on parallel clusters. The algorithm uses amortized inference in a deep generative model to perform joint posterior predictive inference of mini-batch gradient computation times in a manner specific to each compute cluster. Specifically, it mitigates the straggler effect in synchronous, gradient-based optimization by choosing an optimal cutoff beyond which mini-batch gradient messages from slow workers are ignored. In our experiments, we show that eagerly discarding the mini-batch gradient computations of stragglers not only increases throughput but also increases the overall rate of convergence as a function of wall-clock time by eliminating idleness. The principal novel contribution and finding of this work goes beyond this: we demonstrate that using run-times predicted by a generative model of cluster worker performance improves substantially over the static-cutoff prior art, leading to reduced deep neural net training times on large computer clusters.
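
The cutoff choice described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that joint samples of per-worker run-times are already available (for example, drawn from some predictive model) and that throughput is measured as gradients received per second of waiting. The function name `choose_cutoff` and the input layout are hypothetical.

```python
def choose_cutoff(runtime_samples):
    """Pick how many worker gradients to wait for in one synchronous step.

    runtime_samples: list of joint samples; each sample is a list of
    predicted per-worker run-times in seconds (assumed strictly positive).
    Returns (cutoff, estimated_throughput), where throughput is
    gradients per second under the assumption stated above.
    """
    n_workers = len(runtime_samples[0])
    n_samples = len(runtime_samples)
    best_c, best_rate = 1, 0.0
    for c in range(1, n_workers + 1):
        # Waiting for c gradients takes as long as the c-th fastest worker,
        # so average the c-th smallest run-time over all samples.
        mean_wait = sum(sorted(s)[c - 1] for s in runtime_samples) / n_samples
        rate = c / mean_wait
        if rate > best_rate:
            best_c, best_rate = c, rate
    return best_c, best_rate


# One predicted step with three fast workers and a single straggler.
samples = [[1.0, 1.0, 1.0, 10.0]]
cutoff, rate = choose_cutoff(samples)
```

In this toy input, waiting for the fourth worker would stretch the step to 10 seconds, so the sketch prefers to stop after three gradients. A static cutoff would fix the same number for every step, whereas recomputing it from fresh predictions lets it track the cluster's current behavior.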

Author Information

Michael Teng (University of Oxford (visiting at University of British Columbia))
Frank Wood (University of British Columbia)
