Poster
SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds
Yanyu Li · Huan Wang · Qing Jin · Ju Hu · Pavlo Chemerys · Yun Fu · Yanzhi Wang · Sergey Tulyakov · Jian Ren
Event URL: https://snap-research.github.io/SnapFusion/
Text-to-image diffusion models can create stunning images from natural language descriptions that rival the work of professional artists and photographers. However, these models are large, with complex network architectures and tens of denoising iterations, making them computationally expensive and slow to run. As a result, high-end GPUs and cloud-based inference are required to run diffusion models at scale. This is costly and has privacy implications, especially when user data is sent to a third party. To overcome these challenges, we present a generic approach that, for the first time, unlocks running text-to-image diffusion models on mobile devices in **less than 2 seconds**. We achieve this by introducing an efficient network architecture and improving step distillation. Specifically, we propose an efficient UNet by identifying the redundancy of the original model, and we reduce the computation of the image decoder via data distillation. Further, we enhance the step distillation by exploring training strategies and introducing regularization from classifier-free guidance. Our extensive experiments on MS-COCO show that our model with $8$ denoising steps achieves better FID and CLIP scores than Stable Diffusion v$1.5$ with $50$ steps. Our work democratizes content creation by bringing powerful text-to-image diffusion models into the hands of users.
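For readers unfamiliar with classifier-free guidance, which the abstract names as the source of the added regularization, the commonly used guided noise prediction combines a conditional and an unconditional prediction. The notation here is assumed for illustration ($\epsilon_\theta$ for the denoising network, $z_t$ for the noisy latent at step $t$, $c$ for the text condition, $\varnothing$ for the empty prompt, $w$ for the guidance scale); this is the standard formulation, not necessarily the exact objective used in SnapFusion:

$$
\hat{\epsilon}_\theta(z_t, c) = \epsilon_\theta(z_t, \varnothing) + w \left( \epsilon_\theta(z_t, c) - \epsilon_\theta(z_t, \varnothing) \right)
$$

With $w = 1$ this reduces to the plain conditional prediction $\epsilon_\theta(z_t, c)$, and larger $w$ pushes samples further toward the prompt.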
Author Information
Yanyu Li (Northeastern University)
Huan Wang
Qing Jin (Northeastern University)
Ju Hu (Snap Inc.)
Pavlo Chemerys (Snap Inc.)
Yun Fu (Northeastern University)
Yanzhi Wang (Northeastern University)
Sergey Tulyakov (Snap Inc.)
Jian Ren (Snap Inc.)
More from the Same Authors
- 2020: Paper 20: YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design
  YUXUAN CAI · Wei Niu · Yanzhi Wang
- 2021 Spotlight: MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge
  Geng Yuan · Xiaolong Ma · Wei Niu · Zhengang Li · Zhenglun Kong · Ning Liu · Yifan Gong · Zheng Zhan · Chaoyang He · Qing Jin · Siyue Wang · Minghai Qin · Bin Ren · Yanzhi Wang · Sijia Liu · Xue Lin
- 2021 Spotlight: Aligned Structured Sparsity Learning for Efficient Image Super-Resolution
  Yulun Zhang · Huan Wang · Can Qin · Yun Fu
- 2023: Selective Prediction For Open-Ended Question Answering in Black-Box Vision-Language Models
  Zaid Khan · Yun Fu
- 2023 Poster: UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
  Can Qin · Shu Zhang · Ning Yu · Yihao Feng · Xinyi Yang · Yingbo Zhou · Huan Wang · Juan Carlos Niebles · Caiming Xiong · Silvio Savarese · Stefano Ermon · Yun Fu · Ran Xu
- 2023 Poster: Latent Graph Inference with Limited Supervision
  Jianglin Lu · Yi Xu · Huan Wang · Yue Bai · Yun Fu
- 2023 Poster: LightSpeed: Light and Fast Neural Light Fields on Mobile Devices
  Aarush Gupta · Junli Cao · Chaoyang Wang · Ju Hu · Sergey Tulyakov · Jian Ren · László Jeni
- 2023 Poster: Autodecoding Latent 3D Diffusion Models
  Evangelos Ntavelis · Aliaksandr Siarohin · Kyle Olszewski · Chaoyang Wang · Luc V Gool · Sergey Tulyakov
- 2023 Poster: Exploring Question Decomposition for Zero-Shot VQA
  Zaid Khan · Vijay Kumar B G · Samuel Schulter · Manmohan Chandraker · Yun Fu
- 2023 Poster: HotBEV: Hardware-oriented Transformer-based Multi-View 3D Detector for BEV Perception
  Peiyan Dong · Zhenglun Kong · Xin Meng · Pinrui Yu · Yifan Gong · Geng Yuan · Hao Tang · Yanzhi Wang
- 2023 Poster: PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile
  Peiyan Dong · LEI LU · Chao Wu · Cheng Lyu · Geng Yuan · Hao Tang · Yanzhi Wang
- 2022 Spotlight: EpiGRAF: Rethinking training of 3D GANs
  Ivan Skorokhodov · Sergey Tulyakov · Yiqun Wang · Peter Wonka
- 2022 Spotlight: Lightning Talks 5B-1
  Devansh Arpit · Xiaojun Xu · Zifan Shi · Ivan Skorokhodov · Shayan Shekarforoush · Zhan Tong · Yiqun Wang · Shichong Peng · Linyi Li · Ivan Skorokhodov · Huan Wang · Yibing Song · David Lindell · Yinghao Xu · Seyed Alireza Moazenipourasil · Sergey Tulyakov · Peter Wonka · Yiqun Wang · Ke Li · David Fleet · Yujun Shen · Yingbo Zhou · Bo Li · Jue Wang · Peter Wonka · Marcus Brubaker · Caiming Xiong · Limin Wang · Deli Zhao · Qifeng Chen · Dit-Yan Yeung
- 2022 Poster: EpiGRAF: Rethinking training of 3D GANs
  Ivan Skorokhodov · Sergey Tulyakov · Yiqun Wang · Peter Wonka
- 2022 Poster: Look More but Care Less in Video Recognition
  Yitian Zhang · Yue Bai · Huan Wang · Yi Xu · Yun Fu
- 2022 Poster: Parameter-Efficient Masking Networks
  Yue Bai · Huan Wang · Xu Ma · Yitian Zhang · Zhiqiang Tao · Yun Fu
- 2022 Poster: SparCL: Sparse Continual Learning on the Edge
  Zifeng Wang · Zheng Zhan · Yifan Gong · Geng Yuan · Wei Niu · Tong Jian · Bin Ren · Stratis Ioannidis · Yanzhi Wang · Jennifer Dy
- 2022 Poster: Advancing Model Pruning via Bi-level Optimization
  Yihua Zhang · Yuguang Yao · Parikshit Ram · Pu Zhao · Tianlong Chen · Mingyi Hong · Yanzhi Wang · Sijia Liu
- 2022 Poster: Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training
  Geng Yuan · Yanyu Li · Sheng Li · Zhenglun Kong · Sergey Tulyakov · Xulong Tang · Yanzhi Wang · Jian Ren
- 2022 Poster: EfficientFormer: Vision Transformers at MobileNet Speed
  Yanyu Li · Geng Yuan · Yang Wen · Ju Hu · Georgios Evangelidis · Sergey Tulyakov · Yanzhi Wang · Jian Ren
- 2022 Poster: What Makes a "Good" Data Augmentation in Knowledge Distillation - A Statistical Perspective
  Huan Wang · Suhas Lohit · Michael Jones · Yun Fu
- 2021 Poster: Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation
  Can Qin · Handong Zhao · Lichen Wang · Huan Wang · Yulun Zhang · Yun Fu
- 2021 Poster: Aligned Structured Sparsity Learning for Efficient Image Super-Resolution
  Yulun Zhang · Huan Wang · Can Qin · Yun Fu
- 2021 Poster: ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers
  Husheng Han · Kaidi Xu · Xing Hu · Xiaobing Chen · LING LIANG · Zidong Du · Qi Guo · Yanzhi Wang · Yunji Chen
- 2021 Poster: Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?
  Xiaolong Ma · Geng Yuan · Xuan Shen · Tianlong Chen · Xuxi Chen · Xiaohan Chen · Ning Liu · Minghai Qin · Sijia Liu · Zhangyang Wang · Yanzhi Wang
- 2021 Poster: MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge
  Geng Yuan · Xiaolong Ma · Wei Niu · Zhengang Li · Zhenglun Kong · Ning Liu · Yifan Gong · Zheng Zhan · Chaoyang He · Qing Jin · Siyue Wang · Minghai Qin · Bin Ren · Yanzhi Wang · Sijia Liu · Xue Lin
- 2020 Workshop: International Workshop on Scalability, Privacy, and Security in Federated Learning (SpicyFL 2020)
  Xiaolin Andy Li · Dejing Dou · Ameet Talwalkar · Hongyu Li · Jianzong Wang · Yanzhi Wang
- 2020 Poster: Learning to Mutate with Hypergradient Guided Population
  Zhiqiang Tao · Yaliang Li · Bolin Ding · Ce Zhang · Jingren Zhou · Yun Fu
- 2020 Poster: Neural Sparse Representation for Image Restoration
  Yuchen Fan · Jiahui Yu · Yiqun Mei · Yulun Zhang · Yun Fu · Ding Liu · Thomas Huang
- 2019 Poster: PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation
  Can Qin · Haoxuan You · Lichen Wang · C.-C. Jay Kuo · Yun Fu
- 2019 Poster: First Order Motion Model for Image Animation
  Aliaksandr Siarohin · Stéphane Lathuilière · Sergey Tulyakov · Elisa Ricci · Nicu Sebe
- 2017 Poster: Matching on Balanced Nonlinear Representations for Treatment Effects Estimation
  Sheng Li · Yun Fu
- 2012 Poster: Fast Resampling Weighted v-Statistics
  Chunxiao Zhou · jiseong Park · Yun Fu
- 2012 Spotlight: Fast Resampling Weighted v-Statistics
  Chunxiao Zhou · jiseong Park · Yun Fu