The communication bottleneck has been identified as a significant issue in the distributed optimization of large-scale learning models. Recently, several approaches to mitigate this problem have been proposed, including different forms of gradient compression, or computing local models and mixing them iteratively. In this paper, we propose the Qsparse-local-SGD algorithm, which combines aggressive sparsification with quantization and local computation, along with error compensation that keeps track of the difference between the true and compressed gradients. We propose both synchronous and asynchronous implementations of Qsparse-local-SGD. We analyze the convergence of Qsparse-local-SGD in the distributed setting for smooth non-convex and convex objective functions. We demonstrate that Qsparse-local-SGD converges at the same rate as vanilla distributed SGD for many important classes of sparsifiers and quantizers. We use Qsparse-local-SGD to train ResNet-50 on ImageNet and show that it yields significant savings over the state of the art in the number of bits transmitted to reach a target accuracy.
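The combination described in the abstract (sparsify, quantize, run local steps, and carry the compression error forward) can be illustrated with a small toy sketch. This is not the paper's implementation. The function names, the choice of top-k sparsification followed by a scaled-sign quantizer, the deterministic quadratic toy loss, and all parameter values are assumptions made only for illustration.

```python
def top_k(vec, k):
    """Keep the k largest-magnitude entries of vec and zero out the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def sign_quantize(vec):
    """Replace each nonzero entry by its sign times the mean nonzero magnitude."""
    magnitudes = [abs(v) for v in vec if v != 0.0]
    if not magnitudes:
        return list(vec)
    scale = sum(magnitudes) / len(magnitudes)
    return [0.0 if v == 0.0 else (scale if v > 0 else -scale) for v in vec]


def compress_with_memory(update, memory, k):
    """Compress (update + memory) and return the message plus the new memory.

    The new memory is the part of the corrected update that the compressed
    message failed to transmit, so it is added back in the next round.
    """
    corrected = [u + m for u, m in zip(update, memory)]
    compressed = sign_quantize(top_k(corrected, k))
    new_memory = [c - q for c, q in zip(corrected, compressed)]
    return compressed, new_memory


def local_update(x, target, lr, local_steps):
    """Run local gradient steps on the toy loss 0.5 * ||x - target||^2.

    Returns the net change of the local model relative to x.
    """
    local = list(x)
    for _ in range(local_steps):
        local = [xi - lr * (xi - ti) for xi, ti in zip(local, target)]
    return [li - xi for li, xi in zip(local, x)]


# Hypothetical two-worker run: each worker has its own toy target.
x = [0.0] * 4
targets = [[1.0, -2.0, 0.5, 4.0], [3.0, 0.0, -0.5, 2.0]]
memories = [[0.0] * 4 for _ in targets]

for _ in range(50):
    messages = []
    for w, target in enumerate(targets):
        delta = local_update(x, target, lr=0.1, local_steps=4)
        message, memories[w] = compress_with_memory(delta, memories[w], k=2)
        messages.append(message)
    # The master averages the compressed messages and updates the global model.
    x = [xi + sum(m[i] for m in messages) / len(messages) for i, xi in enumerate(x)]
```

In this sketch each worker transmits only two scaled-sign coordinates per round, and whatever is dropped stays in that worker's memory until a later round sends it. Real training would replace the exact toy gradient with stochastic gradients.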
Author Information
Debraj Basu (Adobe Inc.)
Deepesh Data (UCLA)
Can Karakus (Amazon Web Services)
Suhas Diggavi (UCLA)
More from the Same Authors
- 2021 Poster: Renyi Differential Privacy of The Subsampled Shuffle Model In Distributed Learning
  Antonious Girgis · Deepesh Data · Suhas Diggavi
- 2021 Poster: QuPeD: Quantized Personalization via Distillation with Applications to Federated Learning
  Kaan Ozkara · Navjot Singh · Deepesh Data · Suhas Diggavi
- 2020: Poster Session 3 (gather.town)
  Denny Wu · Chengrun Yang · Tolga Ergen · sanae lotfi · Charles Guille-Escuret · Boris Ginsburg · Hanbake Lyu · Cong Xie · David Newton · Debraj Basu · Yewen Wang · James Lucas · MAOJIA LI · Lijun Ding · Jose Javier Gonzalez Ortiz · Reyhane Askari Hemmat · Zhiqi Bu · Neal Lawton · Kiran Thekumparampil · Jiaming Liang · Lindon Roberts · Jingyi Zhu · Dongruo Zhou
- 2020: Contributed Talk #5: Shuffled Model of Federated Learning: Privacy, Accuracy, and Communication Trade-offs
  Deepesh Data
- 2017 Poster: Straggler Mitigation in Distributed Optimization Through Data Encoding
  Can Karakus · Yifan Sun · Suhas Diggavi · Wotao Yin
- 2017 Spotlight: Straggler Mitigation in Distributed Optimization Through Data Encoding
  Can Karakus · Yifan Sun · Suhas Diggavi · Wotao Yin
- 2011 Poster: Randomized Algorithms for Comparison-based Search
  Dominique Tschopp · Suhas Diggavi · Payam Delgosha · Soheil Mohajer