Poster
Gradient Sparsification for Communication-Efficient Distributed Optimization
Jianqiao Wangni · Jialei Wang · Ji Liu · Tong Zhang
Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computational architectures. A key bottleneck is the communication overhead of exchanging information, such as stochastic gradients, among different workers. In this paper, to reduce the communication cost, we propose a convex optimization formulation that minimizes the coding length of stochastic gradients. The key idea is to randomly drop coordinates of the stochastic gradient vectors and amplify the remaining coordinates appropriately so that the sparsified gradient remains unbiased. To solve the optimal sparsification problem efficiently, we propose several simple and fast algorithms that compute an approximate solution with a theoretical guarantee on sparsity. Experiments on $\ell_2$-regularized logistic regression, support vector machines, and convolutional neural networks validate our sparsification approaches.
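The drop-and-amplify idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `scale` parameter, and the rule that keeps coordinate i with probability p_i = min(1, scale * |g_i|) are assumptions made for illustration. What the sketch does show is the unbiasedness property itself: a coordinate kept with probability p_i is divided by p_i, so its expected value equals the original g_i.

```python
import random


def keep_probabilities(grad, scale):
    """Assumed rule (hypothetical): keep coordinate i with p_i = min(1, scale * |g_i|)."""
    return [min(1.0, scale * abs(g)) for g in grad]


def sparsify(grad, probs, rng):
    """Drop coordinate i with probability 1 - p_i and rescale survivors by 1 / p_i.

    E[output_i] = p_i * (g_i / p_i) = g_i, so the sparsified vector is unbiased.
    A coordinate with p_i == 0 is always dropped; under the rule above this
    only happens when g_i == 0, so unbiasedness is preserved.
    """
    out = []
    for g, p in zip(grad, probs):
        if p > 0.0 and rng.random() < p:
            out.append(g / p)
        else:
            out.append(0.0)
    return out


def empirical_mean(grad, probs, trials, rng):
    """Average many independent sparsified copies of grad."""
    mean = [0.0] * len(grad)
    for _ in range(trials):
        sparse = sparsify(grad, probs, rng)
        mean = [m + x / trials for m, x in zip(mean, sparse)]
    return mean


if __name__ == "__main__":
    rng = random.Random(0)
    grad = [0.9, -0.05, 0.0, 0.3, -0.01]
    probs = keep_probabilities(grad, scale=2.0)
    print("keep probabilities:", probs)
    print("one sparsified sample:", sparsify(grad, probs, rng))
    print("empirical mean:", empirical_mean(grad, probs, 20000, rng))
```

Averaging many sparsified samples should approach the original vector, while each individual sample has fewer nonzero entries to transmit. A larger `scale` keeps more coordinates and lowers variance; a smaller one gives sparser but noisier messages.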
Author Information
Jianqiao Wangni (University of Pennsylvania)
Jialei Wang (Two Sigma Investments, University of Chicago)
Ji Liu (University of Rochester, Tencent AI Lab)
Tong Zhang (Tencent AI Lab)
More from the Same Authors
2022 : A Neural Tangent Kernel Perspective on Function-Space Regularization in Neural Networks »
Zonghao Chen · Xupeng Shi · Tim G. J. Rudner · Qixuan Feng · Weizhong Zhang · Tong Zhang -
2022 : Particle-based Variational Inference with Preconditioned Functional Gradient Flow »
Hanze Dong · Xi Wang · Yong Lin · Tong Zhang -
2022 : Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint »
Hao Liu · Minshuo Chen · Siawpeng Er · Wenjing Liao · Tong Zhang · Tuo Zhao -
2022 Poster: Improving Certified Robustness via Statistical Learning with Logical Reasoning »
Zhuolin Yang · Zhikuan Zhao · Boxin Wang · Jiawei Zhang · Linyi Li · Hengzhi Pei · Bojan Karlaš · Ji Liu · Heng Guo · Ce Zhang · Bo Li -
2022 Poster: When is the Convergence Time of Langevin Algorithms Dimension Independent? A Composite Optimization Viewpoint »
Yoav S Freund · Yi-An Ma · Tong Zhang -
2022 Poster: Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity »
Alekh Agarwal · Tong Zhang -
2022 Poster: Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions »
Jiafan He · Dongruo Zhou · Tong Zhang · Quanquan Gu -
2021 : HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning »
Ziniu Li · Yingru Li · Yushun Zhang · Tong Zhang · Zhiquan Luo -
2021 Poster: A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning »
Christoph Dann · Mehryar Mohri · Tong Zhang · Julian Zimmert -
2021 Poster: ErrorCompensatedX: error compensation for variance reduced algorithms »
Hanlin Tang · Yao Li · Ji Liu · Ming Yan -
2021 Poster: Efficient Neural Network Training via Forward and Backward Propagation Sparsification »
Xiao Zhou · Weizhong Zhang · Zonghao Chen · Shizhe Diao · Tong Zhang -
2021 Poster: Error Compensated Distributed SGD Can Be Accelerated »
Xun Qian · Peter Richtarik · Tong Zhang -
2021 Poster: TNASP: A Transformer-based NAS Predictor with a Self-evolution Framework »
Shun Lu · Jixiang Li · Jianchao Tan · Sen Yang · Ji Liu -
2021 Poster: Shifted Chunk Transformer for Spatio-Temporal Representational Learning »
Xuefan Zha · Wentao Zhu · Lv Xun · Sen Yang · Ji Liu -
2020 : Invited speaker: The Convexity of Learning Infinite-width Deep Neural Networks, Tong Zhang »
Tong Zhang -
2020 Poster: Model Rubik’s Cube: Twisting Resolution, Depth and Width for TinyNets »
Kai Han · Yunhe Wang · Qiulin Zhang · Wei Zhang · Chunjing Xu · Tong Zhang -
2020 Poster: A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks »
Zixiang Chen · Yuan Cao · Quanquan Gu · Tong Zhang -
2020 Poster: Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts »
Guilin Li · Junlei Zhang · Yunhe Wang · Chuanjian Liu · Matthias Tan · Yunfeng Lin · Wei Zhang · Jiashi Feng · Tong Zhang -
2020 Poster: Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems »
Luo Luo · Haishan Ye · Zhichao Huang · Tong Zhang -
2020 Poster: Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS »
Han Shi · Renjie Pi · Hang Xu · Zhenguo Li · James Kwok · Tong Zhang -
2020 Poster: Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free »
Haotao Wang · Tianlong Chen · Shupeng Gui · TingKuei Hu · Ji Liu · Zhangyang Wang -
2020 Poster: Decentralized Accelerated Proximal Gradient Descent »
Haishan Ye · Ziang Zhou · Luo Luo · Tong Zhang -
2020 Poster: How to Characterize The Landscape of Overparameterized Convolutional Neural Networks »
Yihong Gu · Weizhong Zhang · Cong Fang · Jason Lee · Tong Zhang -
2019 Poster: Efficient Smooth Non-Convex Stochastic Compositional Optimization via Stochastic Recursive Gradient Descent »
Wenqing Hu · Chris Junchi Li · Xiangru Lian · Ji Liu · Angela Yuan -
2019 Poster: Divergence-Augmented Policy Optimization »
Qing Wang · Yingru Li · Jiechao Xiong · Tong Zhang -
2019 Poster: Global Sparse Momentum SGD for Pruning Very Deep Neural Networks »
Xiaohan Ding · Guiguang Ding · Xiangxin Zhou · Yuchen Guo · Jungong Han · Ji Liu -
2019 Poster: LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning »
Yali Du · Lei Han · Meng Fang · Ji Liu · Tianhong Dai · Dacheng Tao -
2019 Poster: Model Compression with Adversarial Robustness: A Unified Optimization Framework »
Shupeng Gui · Haotao Wang · Haichuan Yang · Chen Yu · Zhangyang Wang · Ji Liu -
2018 Poster: Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization »
Blake Woodworth · Jialei Wang · Adam Smith · Brendan McMahan · Nati Srebro -
2018 Spotlight: Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization »
Blake Woodworth · Jialei Wang · Adam Smith · Brendan McMahan · Nati Srebro -
2018 Poster: Communication Compression for Decentralized Training »
Hanlin Tang · Shaoduo Gan · Ce Zhang · Tong Zhang · Ji Liu -
2018 Poster: SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator »
Cong Fang · Chris Junchi Li · Zhouchen Lin · Tong Zhang -
2018 Spotlight: SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator »
Cong Fang · Chris Junchi Li · Zhouchen Lin · Tong Zhang -
2018 Poster: Stochastic Primal-Dual Method for Empirical Risk Minimization with O(1) Per-Iteration Complexity »
Conghui Tan · Tong Zhang · Shiqian Ma · Ji Liu -
2018 Poster: Exponentially Weighted Imitation Learning for Batched Historical Data »
Qing Wang · Jiechao Xiong · Lei Han · Peng Sun · Han Liu · Tong Zhang -
2017 Poster: Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent »
Xiangru Lian · Ce Zhang · Huan Zhang · Cho-Jui Hsieh · Wei Zhang · Ji Liu -
2017 Oral: Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent »
Xiangru Lian · Ce Zhang · Huan Zhang · Cho-Jui Hsieh · Wei Zhang · Ji Liu -
2017 Poster: Diffusion Approximations for Online Principal Component Estimation and Global Convergence »
Chris Junchi Li · Mengdi Wang · Tong Zhang -
2017 Oral: Diffusion Approximations for Online Principal Component Estimation and Global Convergence »
Chris Junchi Li · Mengdi Wang · Tong Zhang -
2017 Poster: On Quadratic Convergence of DC Proximal Newton Algorithm in Nonconvex Sparse Learning »
Xingguo Li · Lin Yang · Jason Ge · Jarvis Haupt · Tong Zhang · Tuo Zhao -
2016 Poster: Asynchronous Parallel Greedy Coordinate Descent »
Yang You · Xiangru Lian · Ji Liu · Hsiang-Fu Yu · Inderjit Dhillon · James Demmel · Cho-Jui Hsieh -
2016 Poster: Exact Recovery of Hard Thresholding Pursuit »
Xiaotong Yuan · Ping Li · Tong Zhang -
2016 Poster: Efficient Globally Convergent Stochastic Optimization for Canonical Correlation Analysis »
Weiran Wang · Jialei Wang · Dan Garber · Nati Srebro -
2016 Poster: Accelerating Stochastic Composition Optimization »
Mengdi Wang · Ji Liu · Ethan Fang -
2016 Poster: Learning Additive Exponential Family Graphical Models via $\ell_{2,1}$-norm Regularized M-Estimation »
Xiaotong Yuan · Ping Li · Tong Zhang · Qingshan Liu · Guangcan Liu -
2016 Poster: A Comprehensive Linear Speedup Analysis for Asynchronous Stochastic Parallel Optimization from Zeroth-Order to First-Order »
Xiangru Lian · Huan Zhang · Cho-Jui Hsieh · Yijun Huang · Ji Liu -
2015 Poster: Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling »
Zheng Qu · Peter Richtarik · Tong Zhang -
2015 Poster: Local Smoothness in Variance Reduced Optimization »
Daniel Vainsencher · Han Liu · Tong Zhang -
2015 Poster: Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding »
Rie Johnson · Tong Zhang -
2015 Spotlight: Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding »
Rie Johnson · Tong Zhang -
2015 Poster: Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization »
Xiangru Lian · Yijun Huang · Yuncheng Li · Ji Liu -
2015 Spotlight: Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization »
Xiangru Lian · Yijun Huang · Yuncheng Li · Ji Liu -
2014 Poster: Exclusive Feature Learning on Arbitrary Structures via $\ell_{1,2}$-norm »
Deguang Kong · Ryohei Fujimaki · Ji Liu · Feiping Nie · Chris Ding -
2013 Poster: Accelerating Stochastic Gradient Descent using Predictive Variance Reduction »
Rie Johnson · Tong Zhang -
2013 Poster: Accelerated Mini-Batch Stochastic Dual Coordinate Ascent »
Shai Shalev-Shwartz · Tong Zhang -
2013 Poster: An Approximate, Efficient LP Solver for LP Rounding »
Srikrishna Sridhar · Stephen Wright · Christopher Re · Ji Liu · Victor Bittorf · Ce Zhang -
2012 Workshop: Modern Nonparametric Methods in Machine Learning »
Sivaraman Balakrishnan · Arthur Gretton · Mladen Kolar · John Lafferty · Han Liu · Tong Zhang -
2012 Poster: Selective Labeling via Error Bound Minimization »
Quanquan Gu · Tong Zhang · Chris Ding · Jiawei Han -
2012 Poster: Regularized Off-Policy TD-Learning »
Bo Liu · Sridhar Mahadevan · Ji Liu -
2012 Spotlight: Regularized Off-Policy TD-Learning »
Bo Liu · Sridhar Mahadevan · Ji Liu -
2011 Poster: Learning to Search Efficiently in High Dimensions »
Zhen Li · Huazhong Ning · Liangliang Cao · Tong Zhang · Yihong Gong · Thomas S Huang -
2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure »
Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang -
2011 Poster: Greedy Model Averaging »
Dong Dai · Tong Zhang -
2010 Poster: Deep Coding Network »
Yuanqing Lin · Tong Zhang · Shenghuo Zhu · Kai Yu -
2010 Poster: Agnostic Active Learning Without Constraints »
Alina Beygelzimer · Daniel Hsu · John Langford · Tong Zhang -
2010 Poster: Multi-Stage Dantzig Selector »
Ji Liu · Peter Wonka · Jieping Ye -
2009 Poster: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2009 Poster: Nonlinear Learning using Local Coordinate Coding »
Kai Yu · Tong Zhang · Yihong Gong -
2009 Oral: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2008 Poster: Adaptive Forward-Backward Greedy Algorithm for Sparse Learning with Linear Models »
Tong Zhang -
2008 Oral: Adaptive Forward-Backward Greedy Algorithm for Sparse Learning with Linear Models »
Tong Zhang -
2008 Poster: Sparse Online Learning via Truncated Gradient »
John Langford · Lihong Li · Tong Zhang -
2008 Spotlight: Sparse Online Learning via Truncated Gradient »
John Langford · Lihong Li · Tong Zhang -
2008 Poster: Multi-stage Convex Relaxation for Learning with Sparse Regularization »
Tong Zhang -
2007 Poster: A General Boosting Method and its Application to Learning Ranking Functions for Web Search »
Zhaohui Zheng · Hongyuan Zha · Tong Zhang · Olivier Chapelle · Keke Chen · Gordon Sun -
2007 Poster: The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information »
John Langford · Tong Zhang -
2006 Poster: Learning on Graph with Laplacian Regularization »
Rie Ando · Tong Zhang