Poster
Theoretical Limits of Pipeline Parallel Optimization and Application to Distributed Deep Learning
Igor Colin · Ludovic Dos Santos · Kevin Scaman
Thu Dec 12 05:00 PM -- 07:00 PM (PST) @ East Exhibition Hall B + C #206
We investigate the theoretical limits of pipeline parallel learning of deep learning architectures, a distributed setup in which the computation is distributed per layer instead of per example. For smooth convex and non-convex objective functions, we provide matching lower and upper complexity bounds and show that a naive pipeline parallelization of Nesterov's accelerated gradient descent is optimal. For non-smooth convex functions, we provide a novel algorithm coined Pipeline Parallel Random Smoothing (PPRS) that is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension. While PPRS still exhibits the slow $\varepsilon^{-2}$ convergence rate of non-smooth optimization, its depth-dependent part is accelerated, resulting in a near-linear speed-up and a convergence time that depends only mildly on the depth of the architecture. Finally, we perform an empirical analysis of the non-smooth non-convex case and show that, for difficult and highly non-smooth problems, PPRS outperforms more traditional optimization algorithms such as gradient descent and Nesterov's accelerated gradient descent when the sample size is limited, as in few-shot or adversarial learning.
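To make the smoothing idea concrete, below is a minimal sketch of the randomized (Gaussian) smoothing gradient estimator that underlies PPRS: the non-smooth objective $f$ is replaced by the smooth surrogate $f_\gamma(x) = \mathbb{E}[f(x + \gamma Z)]$ with $Z \sim \mathcal{N}(0, I)$, whose gradient is estimated by averaging (sub)gradients at perturbed points. The names `smoothed_grad`, `gamma`, and `num_samples` are illustrative choices of ours, not from the paper, and the sketch omits the pipeline-parallel scheduling and Nesterov acceleration that PPRS adds on top of the estimator.

```python
# Illustrative sketch (not the authors' implementation) of a Monte Carlo
# gradient estimator for the Gaussian-smoothed objective
#   f_gamma(x) = E[f(x + gamma * Z)],  Z ~ N(0, I).
import numpy as np

def smoothed_grad(grad_f, x, gamma=0.1, num_samples=16, rng=None):
    """Estimate the gradient of the Gaussian-smoothed objective at x.

    grad_f: (sub)gradient oracle of the original, possibly non-smooth objective.
    gamma:  smoothing radius; larger values give a smoother surrogate but a
            larger bias with respect to the original objective.
    """
    rng = rng or np.random.default_rng()
    estimate = np.zeros_like(x)
    # Average (sub)gradients at Gaussian perturbations of x.
    for _ in range(num_samples):
        z = rng.standard_normal(x.shape)
        estimate += grad_f(x + gamma * z)
    return estimate / num_samples

if __name__ == "__main__":
    # Example on the non-smooth objective f(x) = ||x||_1, with subgradient sign(x).
    x = np.array([1.0, -2.0, 0.5])
    g = smoothed_grad(np.sign, x, gamma=0.05, num_samples=64)
    print(g)  # close to sign(x) away from the kink at 0
```

Averaging gradients at randomly perturbed points trades a small bias, controlled by `gamma`, for smoothness of the surrogate; it is this recovered smoothness that lets the depth-dependent part of the rate be accelerated in the pipeline-parallel setting.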
Author Information
Igor Colin (Huawei)
Ludovic Dos Santos (Huawei)
Kevin Scaman (Huawei Noah's Ark Lab)
More from the Same Authors
- 2022 Poster: An $\alpha$-No-Regret Algorithm For Graphical Bilinear Bandits »
  Geovani Rizk · Igor Colin · Albert Thomas · Rida Laraki · Yann Chevaleyre
- 2021 Poster: Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize »
  Alain Durmus · Eric Moulines · Alexey Naumov · Sergey Samsonov · Kevin Scaman · Hoi-To Wai
- 2020 Poster: A Simple and Efficient Smoothing Method for Faster Optimization and Local Exploration »
  Kevin Scaman · Ludovic Dos Santos · Merwan Barlier · Igor Colin
- 2018 Poster: Optimal Algorithms for Non-Smooth Distributed Optimization in Networks »
  Kevin Scaman · Francis Bach · Sebastien Bubeck · Laurent Massoulié · Yin Tat Lee
- 2018 Oral: Optimal Algorithms for Non-Smooth Distributed Optimization in Networks »
  Kevin Scaman · Francis Bach · Sebastien Bubeck · Laurent Massoulié · Yin Tat Lee
- 2018 Poster: Lipschitz regularity of deep neural networks: analysis and efficient estimation »
  Aladin Virmaux · Kevin Scaman
- 2015 Poster: Extending Gossip Algorithms to Distributed Estimation of U-statistics »
  Igor Colin · Aurélien Bellet · Joseph Salmon · Stéphan Clémençon
- 2015 Spotlight: Extending Gossip Algorithms to Distributed Estimation of U-statistics »
  Igor Colin · Aurélien Bellet · Joseph Salmon · Stéphan Clémençon