Timezone: »

SAGDA: Achieving $\mathcal{O}(\epsilon^{-2})$ Communication Complexity in Federated Min-Max Learning
Haibo Yang · Zhuqing Liu · Xin Zhang · Jia Liu

Tue Nov 29 09:00 AM -- 11:00 AM (PST) @ Hall J #835
Federated min-max learning has received increasing attention in recent years thanks to its wide range of applications in various learning paradigms. Similar to the conventional federated learning for empirical risk minimization problems, communication complexity also emerges as one of the most critical concerns that affects the future prospect of federated min-max learning. To lower the communication complexity of federated min-max learning, a natural approach is to utilize the idea of infrequent communications (through multiple local updates) same as in conventional federated learning. However, due to the more complicated inter-outer problem structure in federated min-max learning, theoretical understandings of communication complexity for federated min-max learning with infrequent communications remain very limited in the literature. This is particularly true for settings with non-i.i.d. datasets and partial client participation. To address this challenge, in this paper, we propose a new algorithmic framework called \ul{s}tochastic \ul{s}ampling \ul{a}veraging \ul{g}radient \ul{d}escent \ul{a}scent ($\mathsf{SAGDA}$), which i) assembles stochastic gradient estimators from randomly sampled clients as control variates and ii) leverages two learning rates on both server and client sides. We show that $\mathsf{SAGDA}$ achieves a linear speedup in terms of both the number of clients and local update steps, which yields an $\mathcal{O}(\epsilon^{-2})$ communication complexity that is orders of magnitude lower than the state of the art. Interestingly, by noting that the standard federated stochastic gradient descent ascent (FSGDA) is in fact a control-variate-free special version of $\mathsf{SAGDA}$, we immediately arrive at an $\mathcal{O}(\epsilon^{-2})$ communication complexity result for FSGDA. Therefore, through the lens of $\mathsf{SAGDA}$, we also advance the current understanding on communication complexity of the standard FSGDA method for federated min-max learning.

Author Information

Haibo Yang (Ohio State University)
Zhuqing Liu (Ohio State University)
Xin Zhang (Facebook)
Jia Liu (The Ohio State University)
Jia Liu

Jia (Kevin) Liu is an Assistant Professor in the Dept. of Electrical and Computer Engineering at The Ohio State University and an Amazon Visiting Academics (AVA). He received his Ph.D. degree from the Dept. of Electrical and Computer Engineering at Virginia Tech in 2010. From Aug. 2017 to Aug. 2020, he was an Assistant Professor in the Dept. of Computer Science at Iowa State University. His research areas include theoretical machine learning, stochastic network optimization and control, and performance analysis for data analytics infrastructure and cyber-physical systems. Dr. Liu is a senior member of IEEE and a member of ACM. He has received numerous awards at top venues, including IEEE INFOCOM'19 Best Paper Award, IEEE INFOCOM'16 Best Paper Award, IEEE INFOCOM'13 Best Paper Runner-up Award, IEEE INFOCOM'11 Best Paper Runner-up Award, IEEE ICC'08 Best Paper Award, and honors of long/spotlight presentations at ICML, NeurIPS, and ICLR. He is an NSF CAREER Award recipient in 2020 and a winner of the Google Faculty Research Award in 2020. He received the LAS Award for Early Achievement in Research at Iowa State University in 2020, and the Bell Labs President Gold Award. His research is supported by NSF, AFOSR, AFRL, and ONR.

More from the Same Authors