

Poster

Double Quantization for Communication-Efficient Distributed Optimization

Yue Yu · Jiaxiang Wu · Longbo Huang

East Exhibition Hall B + C #128

Keywords: [ Optimization -> Convex Optimization; Optimization -> Non-Convex Optimization; Optimization ] [ Stochastic Optimization ] [ Large Scale Learning ] [ Algorithms ]


Abstract:

Modern distributed training of machine learning models often suffers from high communication overhead for synchronizing stochastic gradients and model parameters. In this paper, to reduce the communication complexity, we propose double quantization, a general scheme for quantizing both model parameters and gradients. Three communication-efficient algorithms are proposed based on this general scheme. Specifically, (i) we propose a low-precision algorithm AsyLPG with asynchronous parallelism, (ii) we explore integrating gradient sparsification with double quantization and develop Sparse-AsyLPG, (iii) we show that double quantization can be accelerated by the momentum technique and design accelerated AsyLPG. We establish rigorous performance guarantees for the algorithms, and conduct experiments on a multi-server test-bed with real-world datasets to demonstrate that our algorithms can effectively save transmitted bits without performance degradation, and significantly outperform existing methods with either model parameter or gradient quantization.
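
To make the idea of quantizing both directions of communication concrete, the following is a minimal single-process sketch, not the authors' AsyLPG. It assumes an unbiased stochastic uniform quantizer and a server that keeps a full-precision copy of the parameters; both are illustrative assumptions, since the abstract does not specify the quantizer. The names `stochastic_quantize` and `double_quantized_step` are hypothetical, and asynchrony, sparsification, and momentum are all omitted.

```python
import math
import random


def stochastic_quantize(values, bits, rng):
    """Round each value onto a uniform grid of 2**bits levels over [-scale, scale].

    Rounding up happens with probability equal to the fractional position
    between two grid points, which makes the quantizer unbiased.
    (Assumed quantizer; the abstract does not specify one.)
    """
    scale = max((abs(v) for v in values), default=0.0)
    if scale == 0.0:
        return [0.0] * len(values)
    levels = 2 ** bits - 1          # number of grid intervals
    step = 2.0 * scale / levels     # spacing between grid points
    out = []
    for v in values:
        pos = (v + scale) / step    # position on the grid, in [0, levels]
        low = math.floor(pos)
        if rng.random() < pos - low:
            low += 1
        low = min(low, levels)      # guard against floating-point overshoot
        out.append(low * step - scale)
    return out


def double_quantized_step(x, grad_fn, lr, param_bits, grad_bits, rng):
    """One synchronous update in which both messages are quantized."""
    x_q = stochastic_quantize(x, param_bits, rng)   # server -> worker message
    g = grad_fn(x_q)                                # worker uses low-precision parameters
    g_q = stochastic_quantize(g, grad_bits, rng)    # worker -> server message
    # Assumption: the server applies the step to its full-precision copy.
    return [xi - lr * gi for xi, gi in zip(x, g_q)]


if __name__ == "__main__":
    rng = random.Random(0)
    target = [1.0, -2.0, 0.5]

    def grad(p):
        # Gradient of 0.5 * ||p - target||^2
        return [a - b for a, b in zip(p, target)]

    x = [0.0, 0.0, 0.0]
    for _ in range(200):
        x = double_quantized_step(x, grad, 0.1, 8, 8, rng)
    print(x)
```

In this sketch each message carries `bits` bits per coordinate plus one scale value, instead of a full-precision float per coordinate; that saving is the sense in which the scheme reduces transmitted bits.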
