A main challenge in federated learning is the high communication cost of exchanging weight updates from clients to the server at each round. While prior work has made great progress in compressing these updates with gradient compression methods, we propose a radically different approach that does not update the weights at all. Instead, our method freezes the weights at their initial random values and learns how to sparsify the random network for the best performance. To this end, the clients collaborate in training a \emph{stochastic} binary mask to find the optimal sparse random subnetwork within the original dense one. At the end of training, the final model is a randomly weighted sparse network, i.e., a subnetwork of the initial random dense network. We show improvements in accuracy, communication bitrate (less than $1$ bit per parameter (bpp)), convergence speed, and final model size (less than $1$ bpp) over relevant baselines on the MNIST, EMNIST, CIFAR-10, and CIFAR-100 datasets in the low-bitrate regime, under various system configurations.
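As a rough illustration of the idea described in the abstract (a minimal sketch, not the authors' implementation), the PyTorch-style layer below keeps its weights frozen at their random initialization and trains one real-valued score per weight; a Bernoulli mask sampled from the sigmoid of the scores selects the active subnetwork, with a straight-through estimator passing gradients to the scores. The class name, initialization scale, and the federated-round comment are illustrative assumptions.

    # Minimal sketch, assuming a PyTorch setup; MaskedLinear and all names are illustrative.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MaskedLinear(nn.Module):
        def __init__(self, in_features, out_features):
            super().__init__()
            # Weights stay frozen at their random initialization and are never updated.
            self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1,
                                       requires_grad=False)
            # Trainable scores; sigmoid(score) is the per-weight keep probability.
            self.score = nn.Parameter(torch.zeros(out_features, in_features))

        def forward(self, x):
            prob = torch.sigmoid(self.score)
            mask = torch.bernoulli(prob)          # stochastic binary mask
            mask = mask + prob - prob.detach()    # straight-through gradient to the scores
            return F.linear(x, self.weight * mask)

    # In a federated round, each client would update only `score` locally and upload
    # a sampled binary mask (roughly 1 bit per parameter); the server aggregates the
    # masks to refresh the global keep probabilities.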
Author Information
Francesco Pase (University of Padova)
Berivan Isik (Stanford University)
I am a fourth-year PhD student in the Electrical Engineering Department at Stanford University, advised by Tsachy Weissman. My research interests are machine learning, information theory, and data compression. Recently, I have been working on model compression, federated learning, learned data compression, and compression for privacy, robustness, and fairness in machine learning. My research is supported by a Stanford Graduate Fellowship (2019-2023). I received my MS degree from Stanford University in June 2021 and my BS degree from Middle East Technical University in June 2019, both in Electrical Engineering. Previously, I interned at Stanford in the summer of 2018 as an undergraduate researcher under the supervision of Ayfer Ozgur. In the summer of 2021, I worked at Google as a research intern hosted by Philip Chou. In February 2022, I returned to Google as a student researcher and worked on learned video compression until October 2022. I have been working at Amazon as an applied scientist intern since October 2022.
Deniz Gunduz (Imperial College London)
Tsachy Weissman (Stanford University)
Michele Zorzi (University of Padua)
More from the Same Authors
- 2022: ColRel: Collaborative Relaying for Federated Learning over Intermittently Connected Networks
  Rajarshi Saha · Michal Yemini · Emre Ozfatura · Deniz Gunduz · Andrea Goldsmith
- 2023 Poster: Exact Optimality of Communication-Privacy-Utility Tradeoffs in Distributed Mean Estimation
  Berivan Isik · Wei-Ning Chen · Ayfer Ozgur · Tsachy Weissman · Albert No
- 2022 Spotlight: Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions
  Wei Zhang · Yanjun Han · Zhengyuan Zhou · Aaron Flores · Tsachy Weissman
- 2022 Spotlight: Lightning Talks 3B-1
  Tianying Ji · Tongda Xu · Giulia Denevi · Aibek Alanov · Martin Wistuba · Wei Zhang · Yuesong Shen · Massimiliano Pontil · Vadim Titov · Yan Wang · Yu Luo · Daniel Cremers · Yanjun Han · Arlind Kadra · Dailan He · Josif Grabocka · Zhengyuan Zhou · Fuchun Sun · Carlo Ciliberto · Dmitry Vetrov · Mingxuan Jing · Chenjian Gao · Aaron Flores · Tsachy Weissman · Han Gao · Fengxiang He · Kunzan Liu · Wenbing Huang · Hongwei Qin
- 2022 Poster: Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions
  Wei Zhang · Yanjun Han · Zhengyuan Zhou · Aaron Flores · Tsachy Weissman
- 2019 Workshop: Information Theory and Machine Learning
  Shengjia Zhao · Jiaming Song · Yanjun Han · Kristy Choi · Pratyusha Kalluri · Ben Poole · Alex Dimakis · Jiantao Jiao · Tsachy Weissman · Stefano Ermon
- 2018 Poster: Entropy Rate Estimation for Markov Chains with Large State Space
  Yanjun Han · Jiantao Jiao · Chuan-Zheng Lee · Tsachy Weissman · Yihong Wu · Tiancheng Yu
- 2018 Spotlight: Entropy Rate Estimation for Markov Chains with Large State Space
  Yanjun Han · Jiantao Jiao · Chuan-Zheng Lee · Tsachy Weissman · Yihong Wu · Tiancheng Yu