
Variance Reduced ProxSkip: Algorithm, Theory and Application to Federated Learning
Grigory Malinovsky · Kai Yi · Peter Richtarik

Thu Dec 01 02:00 PM -- 04:00 PM (PST) @ Hall J #804
We study distributed optimization methods based on the {\em local training (LT)} paradigm, i.e., methods which achieve communication efficiency by performing richer local gradient-based training on the clients before (expensive) parameter averaging is allowed to take place. While these methods were first proposed about a decade ago, and form the algorithmic backbone of federated learning, there is an enormous gap between their practical performance and our theoretical understanding. Looking back at the progress of the field, we {\em identify 5 generations of LT methods}: 1) heuristic, 2) homogeneous, 3) sublinear, 4) linear, and 5) accelerated. The 5${}^{\rm th}$ generation was initiated by the ProxSkip method of Mishchenko et al. (2022), whose analysis provided the first theoretical confirmation that LT is a communication acceleration mechanism. Inspired by this recent progress, we contribute to the 5${}^{\rm th}$ generation of LT methods by showing that it is possible to enhance ProxSkip further using {\em variance reduction}. While all previous theoretical results for LT methods ignore the cost of local work altogether, and are framed purely in terms of the number of communication rounds, we construct a method that can be substantially faster in terms of the {\em total training time} than the state-of-the-art method ProxSkip, in theory and in practice, in the regime where local computation is sufficiently expensive. We characterize this threshold theoretically, and confirm our theoretical predictions with empirical results. Our treatment of variance reduction is generic and can work with a large number of variance reduction techniques, which may lead to further applications in the future. Finally, we corroborate our theoretical results with carefully engineered proof-of-concept experiments.
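To make the communication-skipping idea behind ProxSkip concrete, here is a minimal sketch of its federated (consensus) form on a toy problem. The quadratic client losses, step size, and all variable names are illustrative assumptions, not taken from the paper, and this sketches the base ProxSkip method only, without the variance reduction the paper adds: each client takes control-variate-shifted local gradient steps, and the expensive averaging (the prox of the consensus constraint) is performed only with probability p.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy federated problem (an assumption for illustration):
# client i holds f_i(x) = 0.5 * ||x - b_i||^2,
# so the consensus minimizer is the average of the b_i.
n_clients, dim = 5, 3
b = rng.normal(size=(n_clients, dim))

gamma = 0.1   # local step size
p = 0.2       # probability of communicating (averaging) in a given step
T = 2000      # total local steps

x = np.zeros((n_clients, dim))  # local models, one row per client
h = np.zeros((n_clients, dim))  # control variates, one row per client

for t in range(T):
    grad = x - b                        # grad of f_i at x_i for the quadratic
    x_hat = x - gamma * (grad - h)      # shifted local gradient step
    if rng.random() < p:
        # Communicate: the prox of the consensus constraint is averaging.
        # (With h initialized to zero, the h-terms sum to zero, so the
        # extra shift inside the prox averages out and can be dropped.)
        x_new = np.tile(x_hat.mean(axis=0), (n_clients, 1))
    else:
        x_new = x_hat                   # skip the (expensive) averaging
    h = h + (p / gamma) * (x_new - x_hat)  # control-variate update
    x = x_new

# After enough steps, all local models approach the consensus minimizer.
print(np.abs(x - b.mean(axis=0)).max())
```

Note the key design point: the control variates h_i learn the local gradients at the solution, so local steps stop drifting toward each client's own minimizer, which is what allows averaging to be skipped most of the time.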

Author Information

Grigory Malinovsky (King Abdullah University of Science and Technology)
Kai Yi (KAUST)

I’m a PhD student under the supervision of Prof. Peter Richtárik. Before that, I received my Master’s degree from KAUST in Dec. 2021 and my B.Eng. with honors from Xi’an Jiaotong University in June 2019. I have interned at Tencent AI Lab, CMU Xulab, the NUS CVML Group, and SenseTime.

Peter Richtarik (KAUST)
