Timezone: »

An Improved Analysis of Gradient Tracking for Decentralized Machine Learning
Anastasiia Koloskova · Tao Lin · Sebastian Stich

Tue Dec 07 04:30 PM -- 06:00 PM (PST) @
We consider decentralized machine learning over a network where the training data is distributed across $n$ agents, each of which can compute stochastic model updates on their local data. The agent's common goal is to find a model that minimizes the average of all local loss functions. While gradient tracking (GT) algorithms can overcome a key challenge, namely accounting for differences between workers' local data distributions, the known convergence rates for GT algorithms are not optimal with respect to their dependence on the mixing parameter $p$ (related to the spectral gap of the connectivity matrix).We provide a tighter analysis of the GT method in the stochastic strongly convex, convex and non-convex settings. We improve the dependency on $p$ from $\mathcal{O}(p^{-2})$ to $\mathcal{O}(p^{-1}c^{-1})$ in the noiseless case and from $\mathcal{O}(p^{-3/2})$ to $\mathcal{O}(p^{-1/2}c^{-1})$ in the general stochastic case, where $c \geq p$ is related to the negative eigenvalues of the connectivity matrix (and is a constant in most practical applications). This improvement was possible due to a new proof technique which could be of independent interest.

Author Information

Anastasiia Koloskova (EPFL)
Tao Lin (EPFL)
Sebastian Stich (CISPA)

Dr. [Sebastian U. Stich](https://sstich.ch/) is a faculty at the CISPA Helmholtz Center for Information Security. Research interests: - *methods for machine learning and statistics*—at the interface of theory and practice - *collaborative learning* (distributed, federated and decentralized methods) - *optimization for machine learning* (adaptive stochastic methods and generalization performance)

More from the Same Authors