Timezone: »

Poster
Breaking the centralized barrier for cross-device federated learning
Sai Praneeth Karimireddy · Martin Jaggi · Satyen Kale · Mehryar Mohri · Sashank Reddi · Sebastian Stich · Ananda Theertha Suresh

Thu Dec 09 08:30 AM -- 10:00 AM (PST) @ Virtual

Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients which gives rise to the client drift phenomenon. In fact, obtaining an algorithm for FL which is uniformly better than simple centralized training has been a major open problem thus far. In this work, we propose a general algorithmic framework, Mime, which i) mitigates client drift and ii) adapts arbitrary centralized optimization algorithms such as momentum and Adam to the cross-device federated learning setting. Mime uses a combination of control-variates and server-level statistics (e.g. momentum) at every client-update step to ensure that each local update mimics that of the centralized method run on iid data. We prove a reduction result showing that Mime can translate the convergence of a generic algorithm in the centralized setting into convergence in the federated setting. Further, we show that when combined with momentum based variance reduction, Mime is provably faster than any centralized method--the first such result. We also perform a thorough experimental exploration of Mime's performance on real world datasets.

#### Author Information

##### Sai Praneeth Karimireddy (EPFL)

I am a second year PhD student working in convex and non-convex optimization with Prof. Martin Jaggi. My focus is on designing faster and more scalable optimization algorithms for machine learning. Some of my preliminary results and problems I am currently working on- 1. Robust accelerated algorithms - Nesterov acceleration modified to be robust to noise. 2. Faster algorithms which take second order information about the function into account. 3. A $O(1/t^2)$ rate *affine invariant* algorithm for constrained optimization. 4. Frank-Wolfe algorithm for non-smooth functions using 'noisy-smoothing'