Decentralized Learning with Random Walks and Communication-Efficient Adaptive Optimization
Aleksei Triastcyn · Matthias Reisser · Christos Louizos

We tackle the problem of federated learning (FL) in a peer-to-peer fashion without a central server. While prior work mainly considered gossip-style protocols for learning, our solution is based on random walks. This allows each node to communicate with only a single peer at a time, thereby reducing the total communication and enabling asynchronous execution. To improve convergence and reduce the need for extensive tuning, we consider an adaptive optimization method, Adam. Two extensions reduce its communication costs: state compression and multiple local updates on each client. We theoretically analyse the convergence behaviour of the proposed algorithm and its modifications in the non-convex setting. We show that our method can achieve performance comparable to centralized FL without communication overhead. Empirical results are reported on a variety of tasks (vision, text), neural network architectures and large-scale federations (up to $\sim342$k clients).
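As a rough illustration of the idea summarized above, the following is a minimal sketch of random-walk learning with Adam, not the authors' algorithm. It assumes a toy setting that does not appear in the abstract: a scalar model, one quadratic loss per client, and a fixed neighbor graph. The function name `random_walk_adam` and all of its parameters are hypothetical. The model and the Adam moments travel together from the current client to one randomly chosen neighbor, and each client runs a few local Adam steps before handing off. State compression is omitted here.

```python
import math
import random


def random_walk_adam(targets, neighbors, num_hops=2000, local_steps=2,
                     lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Toy random-walk training with Adam on scalar quadratic losses.

    Client i holds the loss f_i(w) = 0.5 * (w - targets[i]) ** 2 (an
    assumption made for illustration). `neighbors[i]` lists the peers
    client i can send the model to.
    """
    rng = random.Random(seed)
    w = 0.0          # model parameter carried along the walk
    m, v = 0.0, 0.0  # Adam first and second moments, carried with the model
    t = 0            # global Adam step counter
    client = 0       # the walk starts at client 0

    for _ in range(num_hops):
        # Several local updates on the client currently holding the model.
        for _ in range(local_steps):
            grad = w - targets[client]  # gradient of the local quadratic loss
            t += 1
            m = beta1 * m + (1.0 - beta1) * grad
            v = beta2 * v + (1.0 - beta2) * grad * grad
            m_hat = m / (1.0 - beta1 ** t)  # bias-corrected moments
            v_hat = v / (1.0 - beta2 ** t)
            w -= lr * m_hat / (math.sqrt(v_hat) + eps)
        # Hand the model and optimizer state to a single random neighbor.
        client = rng.choice(neighbors[client])
    return w
```

For example, with four clients on a ring whose targets are 1, 2, 3 and 4, the returned parameter should settle near the minimizer of the average loss, which is the mean of the targets. Only one peer-to-peer transfer happens per hop, which is the communication pattern the abstract contrasts with gossip-style protocols.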