Accelerating Adaptive Federated Optimization with Local Gossip Communications
Yujia Wang · Pei Fang · Jinghui Chen
Event URL: https://openreview.net/forum?id=wwXb1qmcBuD
Recently, adaptive federated optimization methods, such as FedAdam and FedAMSGrad, have gained increasing attention for their fast convergence and stable performance, especially when training models with heavy-tailed stochastic gradient distributions. However, the implementation of such methods still faces several bottlenecks, such as the large client-to-server communication overhead and high sensitivity to heterogeneous data. More importantly, the two objectives may conflict with each other, i.e., the convergence rate gets worse as the number of local steps increases in the partial participation setting, making it challenging to further improve the efficiency of adaptive federated optimization. We refer to this problem as the \textit{dilemma of local steps}. In this paper, we propose a novel hybrid adaptive federated optimization method (HA-Fed) in which the clients are partitioned into disjoint clusters, within which they can communicate via fast client-to-client links. We show that HA-Fed resolves the \textit{dilemma of local steps} in prior adaptive federated optimization methods, i.e., it achieves a faster convergence rate as the number of local steps increases, while reducing the client-to-server communication overhead under non-i.i.d. settings. Specifically, HA-Fed improves the convergence rate from $\mathcal{O}(\sqrt{\tau}/\sqrt{TM})$ in FedAMSGrad to $\mathcal{O}(1/\sqrt{T\tau M})$ in partial participation scenarios under the nonconvex stochastic setting. Extensive experiments and ablation studies demonstrate the effectiveness and broad applicability of our proposed method.
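To make the described structure concrete, below is a minimal sketch of one hybrid round in Python/NumPy. The abstract only specifies the high-level idea (clusters with fast client-to-client links, local steps, and an adaptive server update in the FedAMSGrad family), so all details here are assumptions: the toy least-squares objective, the modeling of gossip as exact in-cluster averaging, the AMSGrad-style server step, and every function and variable name (`local_grad`, `hybrid_round`, `tau`, etc.) are hypothetical illustrations rather than the authors' actual algorithm.

```python
import numpy as np

# Hedged sketch of one hybrid adaptive federated round, assuming:
# - each client runs tau local SGD steps on its own (heterogeneous) data,
# - clients inside a cluster then average their model deltas over fast
#   client-to-client links (gossip modeled here as exact averaging),
# - one representative per cluster uploads the averaged delta to the server,
# - the server applies an AMSGrad-style adaptive update on the pseudo-gradient.
rng = np.random.default_rng(0)
dim, tau, local_lr, server_lr = 10, 5, 0.05, 0.1
beta1, beta2, eps = 0.9, 0.99, 1e-8

def local_grad(w, client_data):
    """Mini-batch gradient of a toy least-squares objective on one client."""
    X, y = client_data
    idx = rng.integers(0, len(y), size=8)
    return X[idx].T @ (X[idx] @ w - y[idx]) / 8

def hybrid_round(w, clusters):
    """One round: tau local steps, intra-cluster averaging, return cluster deltas."""
    cluster_deltas = []
    for cluster in clusters:
        deltas = []
        for data in cluster:                      # each client starts from the global model
            w_local = w.copy()
            for _ in range(tau):                  # tau local SGD steps
                w_local -= local_lr * local_grad(w_local, data)
            deltas.append(w_local - w)
        cluster_deltas.append(np.mean(deltas, axis=0))  # in-cluster gossip average
    return np.mean(cluster_deltas, axis=0)        # aggregate participating clusters

# Toy non-i.i.d. data: 2 clusters x 3 clients, each cluster with its own label shift.
clusters = [[(rng.normal(size=(64, dim)), rng.normal(c, 1.0, size=64))
             for _ in range(3)] for c in (-1.0, 1.0)]

w = np.zeros(dim)
m, v, v_hat = np.zeros(dim), np.zeros(dim), np.zeros(dim)
for t in range(50):
    delta = hybrid_round(w, clusters)             # descent-direction pseudo-gradient
    m = beta1 * m + (1 - beta1) * delta           # first moment
    v = beta2 * v + (1 - beta2) * delta ** 2      # second moment
    v_hat = np.maximum(v_hat, v)                  # AMSGrad max correction
    w += server_lr * m / (np.sqrt(v_hat) + eps)   # adaptive server step
print("final ||w|| =", np.linalg.norm(w))
```

In this sketch the intra-cluster averaging stands in for the local gossip communications; in a real deployment it would be replaced by a few gossip iterations over the cluster's communication graph, so only one (averaged) update per sampled cluster reaches the server.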
Author Information
Yujia Wang (Penn State University)
Pei Fang
Jinghui Chen (Penn State University)
More from the Same Authors
- 2022 : On the Vulnerability of Backdoor Defenses for Federated Learning
  Pei Fang · Jinghui Chen
- 2022 : Spectrum Guided Topology Augmentation for Graph Contrastive Learning
  Lu Lin · Jinghui Chen · Hongning Wang
- 2022 : How Powerful is Implicit Denoising in Graph Neural Networks
  Songtao Liu · Rex Ying · Hanze Dong · Lu Lin · Jinghui Chen · Dinghao Wu
- 2022 Poster: One-shot Neural Backdoor Erasing via Adversarial Weight Masking
  Shuwen Chai · Jinghui Chen