
Training-free Generation Acceleration

📢 A collection of awesome training-free generation acceleration resources.

📚 Contents

  • Training-free Stable Diffusion Acceleration
  • Training-free Diffusion Transformer Acceleration
  • Training-free Auto-Regressive Generation Acceleration

Training-free Stable Diffusion Acceleration

Base models: Stable Diffusion, Stable Video Diffusion, and Text2Video-Zero.

  • [1] Token Merging for Fast Stable Diffusion, CVPRW 2023.

    Bolya, Daniel and Hoffman, Judy.

    [Paper] [Code]

  • [2] AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration, ICCV 2023.

    Li, Lijiang and Li, Huixia and Zheng, Xiawu and Wu, Jie and Xiao, Xuefeng and Wang, Rui and Zheng, Min and Pan, Xin and Chao, Fei and Ji, Rongrong.

    [Paper] [Code]

  • [3] Structural Pruning for Diffusion Models, NeurIPS 2023.

    Fang, Gongfan and Ma, Xinyin and Wang, Xinchao.

    [Paper] [Code]

  • [4] Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models, NeurIPS 2024.

    Li, Senmao and Hu, Taihang and Khan, Fahad Shahbaz and Li, Linxuan and Yang, Shiqi and Wang, Yaxing and Cheng, Ming-Ming and Yang, Jian.

    [Paper] [Code]

  • [5] DeepCache: Accelerating Diffusion Models for Free, CVPR 2024.

    Ma, Xinyin and Fang, Gongfan and Wang, Xinchao.

    [Paper] [Code]

  • [6] Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models, CVPR 2024.

    Wang, Hongjie and Liu, Difan and Kang, Yan and Li, Yijun and Lin, Zhe and Jha, Niraj K and Liu, Yuchen.

    [Paper] [Code]

  • [7] PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future, arXiv 2024.

    Guangyi Wang and Yuren Cai and Lijiang Li and Wei Peng and Songzhi Su.

    [Paper] [Code]

  • [8] Token Fusion: Bridging the Gap between Token Pruning and Token Merging, WACV 2024.

    Kim, Minchul and Gao, Shangqian and Hsu, Yen-Chang and Shen, Yilin and Jin, Hongxia.

    [Paper] [Code]

  • [9] Agent Attention: On the Integration of Softmax and Linear Attention, ECCV 2024.

    Han, Dongchen and Ye, Tianzhu and Han, Yizeng and Xia, Zhuofan and Song, Shiji and Huang, Gao.

    [Paper] [Code]

  • [10] T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free!, arXiv 2024.

    Zhang, Wentian and Liu, Haozhe and Xie, Jinheng and Faccio, Francesco and Shou, Mike Zheng and Schmidhuber, Jürgen.

    [Paper] [Code]

  • [11] Faster Diffusion via Temporal Attention Decomposition, arXiv 2024.

    Liu, Haozhe and Zhang, Wentian and Xie, Jinheng and Faccio, Francesco and Xu, Mengmeng and Xiang, Tao and Shou, Mike Zheng and Perez-Rua, Juan-Manuel and Schmidhuber, Jürgen.

    [Paper] [Code]

  • [12] ToDo: Token Downsampling for Efficient Generation of High-Resolution Images, IJCAI Workshop 2024.

    Smith, Ethan and Saxena, Nayan and Saha, Aninda.

    [Paper] [Code]

  • [13] Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Models, ECCV 2024.

    Ju, Chen and Wang, Haicheng and Li, Zeqian and Chen, Xu and Zhai, Zhonghua and Huang, Weilin and Xiao, Shuai.

    [Paper] [Code]

  • [14] F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis, AAAI 2024.

    Su, Sitong and Liu, Jianzhi and Gao, Lianli and Song, Jingkuan.

    [Paper] [Code]

  • [15] Fast and Memory-Efficient Video Diffusion Using Streamlined Inference, NeurIPS 2024.

    Zheng Zhan and Yushu Wu and Yifan Gong and Zichong Meng and Zhenglun Kong and Changdi Yang and Geng Yuan and Pu Zhao and Wei Niu and Yanzhi Wang.

    [Paper] [Code]

  • [16] Importance-based Token Merging for Diffusion Models, arXiv 2024.

    Wu, Haoyu and Xu, Jingyi and Le, Hieu and Samaras, Dimitris.

    [Paper] [Code]

  • [17] ToMA: Token Merging with Attention For Diffusion Models, OpenReview 2024.

    [Paper] [Code]

  • [18] Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis, NeurIPS 2024.

    Taihang Hu and Linxuan Li and Joost van de Weijer and Hongcheng Gao and Fahad Khan and Jian Yang and Ming-Ming Cheng and Kai Wang and Yaxing Wang.

    [Paper] [Code]

  • [19] Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model, arXiv 2024.

    Omid Saghatchian and Atiyeh Gh. Moghadam and Ahmad Nickabadi.

    [Paper] [Code]

  • [20] Token Pruning for Caching Better: 9 Times Acceleration on Stable Diffusion for Free, arXiv 2024.

    Evelyn Zhang and Bang Xiao and Jiayi Tang and Qianli Ma and Chang Zou and Xuefei Ning and Xuming Hu and Linfeng Zhang.

    [Paper] [Code]

  • [21] Noise Prediction Can Be Adaptively Skipped for Different Prompts Without Training, NeurIPS 2024.

    Hancheng Ye and Jiakang Yuan and Renqiu Xia and Xiangchao Yan and Tao Chen and Junchi Yan and Botian Shi and Bo Zhang.

    [Paper] [Code]

  • [22] Negative Token Merging: Image-based Adversarial Feature Guidance, arXiv 2024.

    Jaskirat Singh and Lindsey Li and Weijia Shi and Ranjay Krishna and Yejin Choi and Pang Wei Koh and Michael F. Cohen and Stephen Gould and Liang Zheng and Luke Zettlemoyer.

    [Paper] [Code]

Training-free Diffusion Transformer Acceleration

Base models: DiT-XL for Image Generation, PIXART-α for Text2Image, Open-Sora and Open-Sora-Plan for Text2Video.

  • [1] ∆-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers, arXiv 2024.

    Chen, Pengtao and Shen, Mingzhu and Ye, Peng and Cao, Jianjian and Tu, Chongjun and Bouganis, Christos-Savvas and Zhao, Yiren and Chen, Tao.

    [Paper] [Code]

  • [2] FORA: Fast-Forward Caching in Diffusion Transformer Acceleration, arXiv 2024.

    Selvaraju, Pratheba and Ding, Tianyu and Chen, Tianyi and Zharkov, Ilya and Liang, Luming.

    [Paper] [Code]

  • [3] DiTFastAttn: Attention Compression for Diffusion Transformer Models, NeurIPS 2024.

    Yuan, Zhihang and Lu, Pu and Zhang, Hanling and Ning, Xuefei and Zhang, Linfeng and Zhao, Tianchen and Yan, Shengen and Dai, Guohao and Wang, Yu.

    [Paper] [Code]

  • [4] Real-Time Video Generation with Pyramid Attention Broadcast, arXiv 2024.

    Xuanlei Zhao and Xiaolong Jin and Kai Wang and Yang You.

    [Paper] [Code]

  • [5] Accelerating Diffusion Transformers with Token-wise Feature Caching, arXiv 2024.

    Chang Zou and Xuyang Liu and Ting Liu and Siteng Huang and Linfeng Zhang.

    [Paper] [Code]

  • [6] FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality, arXiv 2024.

    Zhengyao Lv and Chenyang Si and Junhao Song and Zhenyu Yang and Yu Qiao and Ziwei Liu and Kwan-Yee K. Wong.

    [Paper] [Code]

  • [7] Adaptive Caching for Faster Video Generation with Diffusion Transformers, arXiv 2024.

    Kumara Kahatapitiya and Haozhe Liu and Sen He and Ding Liu and Menglin Jia and Michael S. Ryoo and Tian Xie.

    [Paper] [Code]

  • [8] HeadRouter: A Training-free Image Editing Framework for MM-DiTs by Adaptively Routing Attention Heads, arXiv 2024.

    Yu Xu and Fan Tang and Juan Cao and Yuxin Zhang and Xiaoyu Kong and Jintao Li and Oliver Deussen and Tong-Yee Lee.

    [Paper] [Code]

  • [9] Stable Flow: Vital Layers for Training-Free Image Editing, arXiv 2024.

    Omri Avrahami and Or Patashnik and Ohad Fried and Egor Nemchinov and Kfir Aberman and Dani Lischinski and Daniel Cohen-Or.

    [Paper] [Code]

  • [10] MD-DiT: Step-aware Mixture-of-Depths for Efficient Diffusion Transformers, NeurIPS Workshop 2024.

    Mingzhu Shen and Pengtao Chen and Peng Ye and Guoxuan Xia and Tao Chen and Christos-Savvas Bouganis and Yiren Zhao.

    [Paper] [Code]

  • [11] ToMA: Token Merging with Attention For Diffusion Models, OpenReview 2024.

    [Paper] [Code]

  • [12] Accelerating Diffusion Transformers with Dual Feature Caching, arXiv 2024.

    Chang Zou and Evelyn Zhang and Runlin Guo and Haohang Xu and Conghui He and Xuming Hu and Linfeng Zhang.

    [Paper] [Code]

  • [13] Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model, arXiv 2024.

    Liu, Feng and Zhang, Shiwei and Wang, Xiaofeng and Wei, Yujie and Qiu, Haonan and Zhao, Yuzhong and Zhang, Yingya and Ye, Qixiang and Wan, Fang.

    [Paper] [Code]

Training-free Auto-Regressive Generation Acceleration

Base models: Anole, Lumina-mGPT, and VAR for Text2Image.

  • [1] Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding, arXiv 2024.

    Yao Teng and Han Shi and Xian Liu and Xuefei Ning and Guohao Dai and Yu Wang and Zhenguo Li and Xihui Liu.

    [Paper] [Code]

  • [2] ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality, arXiv 2024.

    Yefei He and Feng Chen and Yuanyu He and Shaoxuan He and Hong Zhou and Kaipeng Zhang and Bohan Zhuang.

    [Paper] [Code]

  • [3] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient, arXiv 2024.

    Zigeng Chen and Xinyin Ma and Gongfan Fang and Xinchao Wang.

    [Paper] [Code]