NeurIPS 2019 Schedule

Poster

Thu Dec 12 05:00 PM -- 07:00 PM (PST) @ East Exhibition Hall B + C #203

PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization

In Deep Learning -- Optimization for Deep Networks

Thijs Vogels · Sai Praneeth Karimireddy · Martin Jaggi

We study gradient compression methods to alleviate the communication bottleneck in data-parallel distributed optimization. Despite the significant attention received, current compression schemes either do not scale well, or fail to achieve the target test accuracy. We propose a low-rank gradient compressor that can i) compress gradients rapidly, ii) efficiently aggregate the compressed gradients using all-reduce, and iii) achieve test performance on par with SGD. The proposed algorithm is the only method evaluated that achieves consistent wall-clock speedups when benchmarked against regular SGD with an optimized communication backend. We demonstrate reduced training times for convolutional networks as well as LSTMs on common datasets.
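
The abstract does not spell out how the low-rank factors are computed, so the following is only a minimal sketch of the general idea, assuming a single power-iteration step with QR orthogonalization. The function names, matrix sizes, and the use of `np.mean` to stand in for an all-reduce across workers are illustrative assumptions, not the authors' implementation. The point it illustrates is that both factors are linear in the gradient, so averaging the small factors across workers gives the same result as compressing the averaged gradient.

```python
import numpy as np

def orthonormalize(mat):
    # Orthonormalize the columns via reduced QR (an assumed choice for this sketch).
    q_factor, _ = np.linalg.qr(mat)
    return q_factor

def low_rank_allreduce(grads, q):
    """Rank-r approximation of the mean of per-worker gradient matrices.

    grads: list of (n, m) arrays, one per simulated worker.
    q:     (m, r) matrix shared by all workers.
    np.mean over the worker list stands in for an all-reduce (hypothetical setup).
    """
    p = np.mean([g @ q for g in grads], axis=0)        # "all-reduce" of (n, r) factors
    p = orthonormalize(p)
    q_new = np.mean([g.T @ p for g in grads], axis=0)  # "all-reduce" of (m, r) factors
    return p @ q_new.T, q_new

rng = np.random.default_rng(0)
n, m, r, workers = 64, 32, 2, 4

# Synthetic gradients: a shared rank-2 signal plus small per-worker noise.
signal = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))
grads = [signal + 1e-3 * rng.standard_normal((n, m)) for _ in range(workers)]

q = rng.standard_normal((m, r))
approx, q = low_rank_allreduce(grads, q)

mean_grad = np.mean(grads, axis=0)
rel_err = np.linalg.norm(approx - mean_grad) / np.linalg.norm(mean_grad)

floats_sent = r * (n + m)  # two small factors per round
floats_full = n * m        # uncompressed gradient matrix
print(f"relative error: {rel_err:.2e}")
print(f"floats communicated: {floats_sent} vs {floats_full}")
```

In this toy setting each worker communicates `r * (n + m)` numbers instead of `n * m`, and because the synthetic gradients are close to rank 2, a rank-2 approximation recovers the mean gradient closely. Real gradients are generally not exactly low rank, so the approximation quality here should not be read as a claim about the paper's results.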