This paper introduces channel gating, a dynamic, fine-grained, and hardware-efficient pruning scheme that reduces the computation cost of convolutional neural networks (CNNs). Channel gating identifies regions in the features that contribute less to the classification result and skips the computation on a subset of the input channels for these ineffective regions. Unlike static network pruning, channel gating optimizes CNN inference at run-time by exploiting input-specific characteristics, which substantially reduces the compute cost with almost no accuracy loss. We experimentally show that applying channel gating to state-of-the-art networks achieves a 2.7-8.0x reduction in floating-point operations (FLOPs) and a 2.0-4.4x reduction in off-chip memory accesses with minimal accuracy loss on CIFAR-10. Combining our method with knowledge distillation reduces the compute cost of ResNet-18 by 2.6x without an accuracy drop on ImageNet. We further demonstrate that channel gating can be realized efficiently in hardware. Our approach exhibits sparsity patterns that are well suited to dense systolic arrays and requires minimal additional hardware. We have designed an accelerator for channel gating networks that can be implemented on either FPGAs or ASICs. Running a quantized ResNet-18 model on ImageNet, our accelerator achieves an encouraging speedup of 2.4x on average, with a theoretical FLOP reduction of 2.8x.
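The abstract describes the mechanism at a high level: a partial result computed from a subset of the input channels is used to decide, per spatial location, whether the remaining channels need to be processed. The sketch below is a minimal, illustrative PyTorch rendering of that idea under stated assumptions, not the authors' released implementation; the class and parameter names (ChannelGatingConv2d, base_fraction, gate_threshold) are hypothetical, and the hard threshold shown here would need a differentiable relaxation during training.

```python
import torch
import torch.nn as nn


class ChannelGatingConv2d(nn.Module):
    """Minimal sketch of a channel-gated convolution (illustrative only)."""

    def __init__(self, in_channels, out_channels, kernel_size=3,
                 base_fraction=0.25, gate_threshold=0.0, padding=1):
        super().__init__()
        self.c_base = max(1, int(in_channels * base_fraction))
        # Base path: always evaluated, uses only the first subset of input channels.
        self.conv_base = nn.Conv2d(self.c_base, out_channels, kernel_size,
                                   padding=padding, bias=False)
        # Conditional path: covers the remaining channels, evaluated only where the gate fires.
        self.conv_cond = nn.Conv2d(in_channels - self.c_base, out_channels, kernel_size,
                                   padding=padding, bias=False)
        # Normalizes the partial sums so a single threshold can act as the gate.
        self.bn_gate = nn.BatchNorm2d(out_channels)
        self.gate_threshold = gate_threshold

    def forward(self, x):
        x_base, x_cond = x[:, :self.c_base], x[:, self.c_base:]
        partial = self.conv_base(x_base)  # partial sums from the base channels

        # Per-location gate: where the normalized partial sum stays below the
        # threshold, the output is predicted to be ineffective (e.g., clipped by
        # a following ReLU), so the conditional path is skipped there.
        gate = (self.bn_gate(partial) > self.gate_threshold).float()

        # Dense-tensor sketch: the conditional path is computed everywhere and then
        # masked. A real accelerator would skip the masked multiply-accumulates.
        return partial + gate * self.conv_cond(x_cond)


# Usage: roughly a (1 - base_fraction) fraction of the convolution work on
# gated-off locations becomes skippable.
layer = ChannelGatingConv2d(64, 64)
y = layer(torch.randn(1, 64, 32, 32))
print(y.shape)  # torch.Size([1, 64, 32, 32])
```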
Author Information
Weizhe Hua (Cornell University)
Yuan Zhou (Cornell University)
Christopher De Sa (Cornell University)
Zhiru Zhang (Cornell University)
G. Edward Suh (Cornell University)
More from the Same Authors
- 2022 Poster: Understanding Hyperdimensional Computing for Parallel Single-Pass Learning
  Tao Yu · Yichi Zhang · Zhiru Zhang · Christopher De Sa
- 2021 Poster: Representing Hyperbolic Space Accurately using Multi-Component Floats
  Tao Yu · Christopher De Sa
- 2021 Poster: Hyperparameter Optimization Is Deceiving Us, and How to Stop It
  A. Feder Cooper · Yucheng Lu · Jessica Forde · Christopher De Sa
- 2021 Poster: Equivariant Manifold Flows
  Isay Katsman · Aaron Lou · Derek Lim · Qingxuan Jiang · Ser Nam Lim · Christopher De Sa
- 2021 Poster: BulletTrain: Accelerating Robust Neural Network Training via Boundary Example Mining
  Weizhe Hua · Yichi Zhang · Chuan Guo · Zhiru Zhang · G. Edward Suh
- 2020 Workshop: Differential Geometry meets Deep Learning (DiffGeo4DL)
  Joey Bose · Emile Mathieu · Charline Le Lan · Ines Chami · Frederic Sala · Christopher De Sa · Maximilian Nickel · Christopher Ré · Will Hamilton
- 2020 Poster: Random Reshuffling is Not Always Better
  Christopher De Sa
- 2020 Poster: Asymptotically Optimal Exact Minibatch Metropolis-Hastings
  Ruqi Zhang · A. Feder Cooper · Christopher De Sa
- 2020 Spotlight: Asymptotically Optimal Exact Minibatch Metropolis-Hastings
  Ruqi Zhang · A. Feder Cooper · Christopher De Sa
- 2020 Spotlight: Random Reshuffling is Not Always Better
  Christopher De Sa
- 2020 Poster: Neural Manifold Ordinary Differential Equations
  Aaron Lou · Derek Lim · Isay Katsman · Leo Huang · Qingxuan Jiang · Ser Nam Lim · Christopher De Sa
- 2019: Algorithm-Accelerator Co-Design for Neural Network Specialization
  Zhiru Zhang
- 2019 Poster: Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models
  Tao Yu · Christopher De Sa
- 2019 Spotlight: Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models
  Tao Yu · Christopher De Sa
- 2019 Poster: Dimension-Free Bounds for Low-Precision Training
  Zheng Li · Christopher De Sa
- 2019 Poster: Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees
  Ruqi Zhang · Christopher De Sa
- 2019 Spotlight: Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees
  Ruqi Zhang · Christopher De Sa