🎨 Awesome Generation Acceleration 🚀

👐👐 If you would like to contribute to this repository, feel free to email me at liuxuyang@stu.scu.edu.cn! 👐👐

🔥 News

  • 2025/02/22 💥💥 Our work ToCa has been accepted by ICLR 2025! Congratulations to all collaborators!

  • 2024/12/24 🤗🤗 We release an open-source repo "Awesome-Token-Reduction-for-Model-Compression", which collects recent awesome token reduction papers! Feel free to contribute your suggestions!

  • 2024/10/12 🚀🚀 We release our work ToCa about accelerating DiT models for FREE, which achieves nearly lossless acceleration of 1.51× on FLUX, 1.93× on PixArt-α, and 2.36× on OpenSora! Code is now available!

  • 2024/07/15 🤗🤗 We release an open-source repo "Awesome-Generation-Acceleration", which collects recent awesome generation acceleration papers! Feel free to contribute your suggestions!

📚 Contents

💬 Keywords

📝 Papers

Fast Sampling

  • [1] Denoising Diffusion Implicit Models, ICLR 2021.

    Song, Jiaming and Meng, Chenlin and Ermon, Stefano.

    [Paper] [Code]

  • [2] DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps, NeurIPS 2022.

    Lu, Cheng and Zhou, Yuhao and Bao, Fan and Chen, Jianfei and Li, Chongxuan and Zhu, Jun.

    [Paper] [Code]

  • [3] DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models, arXiv 2022.

    Lu, Cheng and Zhou, Yuhao and Bao, Fan and Chen, Jianfei and Li, Chongxuan and Zhu, Jun.

    [Paper] [Code]

  • [4] Jump Your Steps: Optimizing Sampling Schedule of Discrete Diffusion Models, arXiv 2024.

    Yong-Hyun Park and Chieh-Hsin Lai and Satoshi Hayakawa and Yuhta Takida and Yuki Mitsufuji.

    [Paper] [Code]

  • [5] AdaDiff: Adaptive Step Selection for Fast Diffusion, arXiv 2023.

    Hui Zhang, Zuxuan Wu, Zhen Xing, Jie Shao, Yu-Gang Jiang.

    [Paper] [Code]

  • [6] DuoDiff: Accelerating Diffusion Models with a Dual-Backbone Approach, arXiv 2024.

    Daniel Gallo Fernández, Rǎzvan-Andrei Matişan, Alejandro Monroy Muñoz, Ana-Maria Vasilcoiu, Janusz Partyka, Tin Hadži Veljković, Metod Jazbec.

    [Paper] [Code]

  • [7] SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion, arXiv 2024.

    Trong-Tung Nguyen and Quang Nguyen and Khoi Nguyen and Anh Tran and Cuong Pham.

    [Paper] [Code]

  • [8] Accelerate High-Quality Diffusion Models with Inner Loop Feedback, arXiv 2025.

    Matthew Gwilliam and Han Cai and Di Wu and Abhinav Shrivastava and Zhiyu Cheng.

    [Paper] [Code]
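
A minimal usage sketch of the fast samplers listed above, using the Hugging Face diffusers library: DDIM, DPM-Solver, and DPM-Solver++ are exposed as drop-in schedulers, so cutting the step count is mostly a configuration change. The checkpoint id and step count below are illustrative assumptions, not recommendations from any listed paper.

```python
# Hedged sketch: swap the default scheduler for DPM-Solver++ and sample in ~20 steps.
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # assumption: any Stable Diffusion checkpoint works here
    torch_dtype=torch.float16,
).to("cuda")

# DPM-Solver++ reaches comparable quality in far fewer steps than the default scheduler.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

image = pipe("an astronaut riding a horse", num_inference_steps=20).images[0]
image.save("astronaut.png")
```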

Pruning

  • [1] Token Merging for Fast Stable Diffusion, CVPRW 2023.

    Bolya, Daniel and Hoffman, Judy.

    [Paper] [Code]

  • [2] Structural Pruning for Diffusion Models, NeurIPS 2023.

    Fang, Gongfan and Ma, Xinyin and Wang, Xinchao.

    [Paper] [Code]

  • [3] Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models, CVPR 2024.

    Wang, Hongjie and Liu, Difan and Kang, Yan and Li, Yijun and Lin, Zhe and Jha, Niraj K and Liu, Yuchen.

    [Paper] [Code]

  • [4] LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models, arXiv 2024.

    Zhang, Dingkun and Li, Sijia and Chen, Chen and Xie, Qingsong and Lu, Haonan.

    [Paper] [Code]

  • [5] SparseDM: Toward Sparse Efficient Diffusion Models, arXiv 2024.

    Wang, Kafeng and Chen, Jianfei and Li, He and Mi, Zhenpeng and Zhu, Jun.

    [Paper] [Code]

  • [6] Token Fusion: Bridging the Gap between Token Pruning and Token Merging, WACV 2024.

    Kim, Minchul and Gao, Shangqian and Hsu, Yen-Chang and Shen, Yilin and Jin, Hongxia.

    [Paper] [Code]

  • [7] Token Caching for Diffusion Transformer Acceleration, arXiv 2024.

    Jinming Lou and Wenyang Luo and Yufan Liu and Bing Li and Xinmiao Ding and Weiming Hu and Jiajiong Cao and Yuming Li and Chenguang Ma.

    [Paper] [Code]

  • [8] Dynamic Diffusion Transformer, ICLR 2025.

    Wangbo Zhao and Yizeng Han and Jiasheng Tang and Kai Wang and Yibing Song and Gao Huang and Fan Wang and Yang You.

    [Paper] [Code]

  • [9] ToDo: Token Downsampling for Efficient Generation of High-Resolution Images, IJCAIw 2024.

    Smith, Ethan and Saxena, Nayan and Saha, Aninda.

    [Paper] [Code]

  • [10] Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Models, ECCV 2024.

    Ju, Chen and Wang, Haicheng and Li, Zeqian and Chen, Xu and Zhai, Zhonghua and Huang, Weilin and Xiao, Shuai.

    [Paper] [Code]

  • [11] F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis, AAAI 2024.

    Su, Sitong and Liu, Jianzhi and Gao, Lianli and Song, Jingkuan.

    [Paper] [Code]

  • [12] DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization, NeurIPS 2024.

    Zhu, Haowei and Tang, Dehua and Liu, Ji and Lu, Mingjie and Zheng, Jintu and Peng, Jinzhang and Li, Dong and Wang, Yu and Jiang, Fan and Tian, Lu and others.

    [Paper] [Code]

  • [13] U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers, NeurIPS 2024.

    Yuchuan Tian and Zhijun Tu and Hanting Chen and Jie Hu and Chao Xu and Yunhe Wang.

    [Paper] [Code]

  • [14] Importance-based Token Merging for Diffusion Models, arXiv 2024.

    Wu, Haoyu and Xu, Jingyi and Le, Hieu and Samaras, Dimitris.

    [Paper] [Code]

  • [15] HeadRouter: A Training-free Image Editing Framework for MM-DiTs by Adaptively Routing Attention Heads, arXiv 2024.

    Yu Xu and Fan Tang and Juan Cao and Yuxin Zhang and Xiaoyu Kong and Jintao Li and Oliver Deussen and Tong-Yee Lee.

    [Paper] [Code]

  • [16] Stable Flow: Vital Layers for Training-Free Image Editing, arXiv 2024.

    Omri Avrahami and Or Patashnik and Ohad Fried and Egor Nemchinov and Kfir Aberman and Dani Lischinski and Daniel Cohen-Or.

    [Paper] [Code]

  • [17] TinyFusion: Diffusion Transformers Learned Shallow, arXiv 2024.

    Fang, Gongfan and Li, Kunjun and Ma, Xinyin and Wang, Xinchao.

    [Paper] [Code]

  • [18] Diffusion Model Compression for Image-to-Image Translation, ACCV 2024.

    Kim, Geonung and Kim, Beomsu and Park, Eunhyeok and Cho, Sunghyun.

    [Paper] [Code]

  • [19] FlexDiT: Dynamic Token Density Control for Diffusion Transformer, arXiv 2024.

    Shuning Chang and Pichao Wang and Jiasheng Tang and Yi Yang.

    [Paper] [Code]

  • [20] AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration, arXiv 2024.

    [Paper] [Code]

  • [21] ToMA: Token Merging with Attention For Diffusion Models, OpenReview 2024.

    [Paper] [Code]

  • [22] KOALA: Empirical Lessons Toward Memory-Efficient and Fast Diffusion Models for Text-to-Image Synthesis, NeurIPS 2024.

    Youngwan Lee, Kwanyong Park, Yoorhim Cho, Yong-Ju Lee, Sung Ju Hwang.

    [Paper] [Code]

  • [23] Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis, NeurIPS 2024.

    Taihang Hu and Linxuan Li and Joost van de Weijer and Hongcheng Gao and Fahad Khan and Jian Yang and Ming-Ming Cheng and Kai Wang and Yaxing Wang.

    [Paper] [Code]

  • [24] Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers, arXiv 2024.

    Haoran You and Connelly Barnes and Yuqian Zhou and Yan Kang and Zhenbang Du and Wei Zhou and Lingzhi Zhang and Yotam Nitzan and Xiaoyang Liu and Zhe Lin and Eli Shechtman and Sohrab Amirghodsi and Yingyan Celine Lin.

    [Paper] [Code]

  • [25] Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model, arXiv 2024.

    Omid Saghatchian and Atiyeh Gh. Moghadam and Ahmad Nickabadi.

    [Paper] [Code]

  • [26] Token Pruning for Caching Better: 9 Times Acceleration on Stable Diffusion for Free, arXiv 2024.

    Evelyn Zhang and Bang Xiao and Jiayi Tang and Qianli Ma and Chang Zou and Xuefei Ning and Xuming Hu and Linfeng Zhang.

    [Paper] [Code]

  • [27] Pruning for Sparse Diffusion Models Based on Gradient Flow, ICASSP 2025.

    Ben Wan and Tianyi Zheng and Zhaoyu Chen and Yuxiao Wang and Jia Wang.

    [Paper] [Code]

  • [28] Negative Token Merging: Image-based Adversarial Feature Guidance, arXiv 2024.

    Jaskirat Singh and Lindsey Li and Weijia Shi and Ranjay Krishna and Yejin Choi and Pang Wei Koh and Michael F. Cohen and Stephen Gould and Liang Zheng and Luke Zettlemoyer.

    [Paper] [Code]
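
Many of the entries above revolve around token merging or pruning inside attention blocks. The toy PyTorch sketch below shows one bipartite merging step: tokens are split into two sets, each "source" token is matched to its most similar "destination" token, and the r most redundant sources are averaged into their matches. The alternating split, the helper name merge_tokens, and the mean-merge rule are simplifications for illustration, not any paper's released code.

```python
import torch
import torch.nn.functional as F

def merge_tokens(x: torch.Tensor, r: int) -> torch.Tensor:
    """x: [B, N, D] token features; r: number of tokens to remove (toy sketch)."""
    B, N, D = x.shape
    src, dst = x[:, ::2], x[:, 1::2]                      # alternate tokens into two sets
    sim = F.normalize(src, dim=-1) @ F.normalize(dst, dim=-1).transpose(1, 2)
    score, match = sim.max(dim=-1)                        # best destination per source token
    order = score.argsort(dim=-1)
    keep_idx, merge_idx = order[:, :-r], order[:, -r:]    # keep distinct, merge redundant

    kept_src = torch.gather(src, 1, keep_idx.unsqueeze(-1).expand(-1, -1, D))
    merged_src = torch.gather(src, 1, merge_idx.unsqueeze(-1).expand(-1, -1, D))
    merged_dst_idx = torch.gather(match, 1, merge_idx)

    # Average each merged source token into its matched destination token.
    dst = dst.scatter_reduce(1, merged_dst_idx.unsqueeze(-1).expand(-1, -1, D),
                             merged_src, reduce="mean", include_self=True)
    return torch.cat([kept_src, dst], dim=1)              # [B, N - r, D]

tokens = torch.randn(2, 256, 64)
print(merge_tokens(tokens, r=32).shape)                   # torch.Size([2, 224, 64])
```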

Quantization

  • [1] Post-training Quantization on Diffusion Models, CVPR 2023.

    Shang, Yuzhang and Yuan, Zhihang and Xie, Bin and Wu, Bingzhe and Yan, Yan.

    [Paper] [Code]

  • [2] Temporal Dynamic Quantization for Diffusion Models, NeurIPS 2023.

    So, Junhyuk and Lee, Jungwon and Ahn, Daehyun and Kim, Hyungjun and Park, Eunhyeok.

    [Paper] [Code]

  • [3] QVD: Post-training Quantization for Video Diffusion Models, arXiv 2024.

    Tian, Shilong and Chen, Hong and Lv, Chengtao and Liu, Yu and Guo, Jinyang and Liu, Xianglong and Li, Shengxi and Yang, Hao and Xie, Tao.

    [Paper] [Code]

  • [4] VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers, arXiv 2024.

    Deng, Juncan and Li, Shuaiting and Wang, Zeyu and Gu, Hong and Xu, Kedong and Huang, Kejie.

    [Paper] [Code]

  • [5] DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing, arXiv 2024.

    Dong, Zhenyuan and Zhang, Sai Qian.

    [Paper] [Code]

  • [6] Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers, arXiv 2024.

    Chen, Lei and Meng, Yuan and Tang, Chen and Ma, Xinzhu and Jiang, Jingyan and Wang, Xin and Wang, Zhi and Zhu, Wenwu.

    [Paper] [Code]

  • [7] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models, arXiv 2024.

    Muyang Li and Yujun Lin and Zhekai Zhang and Tianle Cai and Xiuyu Li and Junxian Guo and Enze Xie and Chenlin Meng and Jun-Yan Zhu and Song Han.

    [Paper] [Code]

  • [8] SQ-DM: Accelerating Diffusion Models with Aggressive Quantization and Temporal Sparsity, arXiv 2025.

    Zichen Fan and Steve Dai and Rangharajan Venkatesan and Dennis Sylvester and Brucek Khailany.

    [Paper] [Code]
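
As a rough picture of what post-training quantization does to the weights, here is a toy symmetric per-output-channel INT8 round-trip in plain PyTorch. It is a generic sketch, not the calibration or smoothing recipe of any method listed above (those also handle activations, outliers, and timestep dependence).

```python
import torch

def quantize_per_channel(w: torch.Tensor, n_bits: int = 8):
    """w: [out_features, in_features] linear weight -> (int8 weights, per-channel scales)."""
    qmax = 2 ** (n_bits - 1) - 1                                   # 127 for INT8
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(512, 512)
q, s = quantize_per_channel(w)
print((w - dequantize(q, s)).abs().max())                          # worst-case rounding error
```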

Distillation

  • [1] Progressive Distillation for Fast Sampling of Diffusion Models, ICLR 2022.

    Salimans, Tim and Ho, Jonathan.

    [Paper] [Code]

  • [2] SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds, NeurIPS 2023.

    Li, Yanyu and Wang, Huan and Jin, Qing and Hu, Ju and Chemerys, Pavlo and Fu, Yun and Wang, Yanzhi and Tulyakov, Sergey and Ren, Jian.

    [Paper] [Code]

  • [3] BK-SDM: A Lightweight, Fast, and Cheap Version of Stable Diffusion, ECCV 2024.

    Kim, Bo-Kyeong and Song, Hyoung-Kyu and Castells, Thibault and Choi, Shinkook.

    [Paper] [Code]

  • [4] Accelerating Diffusion Models with One-to-Many Knowledge Distillation, arXiv 2024.

    Linfeng Zhang and Kaisheng Ma.

    [Paper] [Code]

  • [5] Relational Diffusion Distillation for Efficient Image Generation, ACM MM 2024.

    Weilun Feng and Chuanguang Yang and Zhulin An and Libo Huang and Boyu Diao and Fei Wang and Yongjun Xu.

    [Paper] [Code]

  • [6] Accelerating Video Diffusion Models via Distribution Matching, arXiv 2024.

    Yuanzhi Zhu and Hanshu Yan and Huan Yang and Kai Zhang and Junnan Li.

    [Paper] [Code]

  • [7] From Slow Bidirectional to Fast Causal Video Generators, arXiv 2024.

    Yin, Tianwei and Zhang, Qiang and Zhang, Richard and Freeman, William T and Durand, Fredo and Shechtman, Eli and Huang, Xun.

    [Paper] [Code]

  • [8] SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training, arXiv 2024.

    Dongting Hu and Jierun Chen and Xijie Huang and Huseyin Coskun and Arpit Sahni and Aarush Gupta and Anujraaj Goyal and Dishani Lahiri and Rajesh Singh and Yerlan Idelbayev and Junli Cao and Yanyu Li and Kwang-Ting Cheng and S. -H. Gary Chan and Mingming Gong and Sergey Tulyakov and Anil Kag and Yanwu Xu and Jian Ren.

    [Paper] [Code]

  • [9] Inference-Time Diffusion Model Distillation, arXiv 2024.

    Geon Yeong Park and Sang Wan Lee and Jong Chul Ye.

    [Paper] [Code]
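
Step distillation generally trains a student so that one of its denoising steps reproduces several consecutive steps of a frozen teacher. The sketch below is only conceptual: teacher, student, and ddim_step are hypothetical callables standing in for the teacher model, the trainable student, and a deterministic sampler update, and the dummy lambdas at the end merely show the call pattern.

```python
import torch
import torch.nn.functional as F

def distill_loss(student, teacher, ddim_step, x_t, t, t_mid, t_next, cond):
    """Progressive-distillation-style target: the student's single step t -> t_next
    should match two consecutive teacher steps t -> t_mid -> t_next."""
    with torch.no_grad():
        x_mid = ddim_step(teacher(x_t, t, cond), x_t, t, t_mid)
        x_target = ddim_step(teacher(x_mid, t_mid, cond), x_mid, t_mid, t_next)
    x_pred = ddim_step(student(x_t, t, cond), x_t, t, t_next)
    return F.mse_loss(x_pred, x_target)

# Dummy stand-ins, only to show the call pattern (not real networks or samplers).
net = lambda x, t, cond: 0.9 * x
step = lambda eps, x, t, t_next: x - 0.1 * eps
print(distill_loss(net, net, step, torch.randn(2, 4, 8, 8), 1.0, 0.5, 0.0, None))
```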

Cache Mechanism

  • [1] Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models, NeurIPS 2024.

    Li, Senmao and Hu, Taihang and Khan, Fahad Shahbaz and Li, Linxuan and Yang, Shiqi and Wang, Yaxing and Cheng, Ming-Ming and Yang, Jian.

    [Paper] [Code]

  • [2] DeepCache: Accelerating Diffusion Models for Free, CVPR 2024.

    Ma, Xinyin and Fang, Gongfan and Wang, Xinchao.

    [Paper] [Code]

  • [3] ∆-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers, arXiv 2024.

    Chen, Pengtao and Shen, Mingzhu and Ye, Peng and Cao, Jianjian and Tu, Chongjun and Bouganis, Christos-Savvas and Zhao, Yiren and Chen, Tao.

    [Paper] [Code]

  • [4] FORA: Fast-Forward Caching in Diffusion Transformer Acceleration, arXiv 2024.

    Selvaraju, Pratheba and Ding, Tianyu and Chen, Tianyi and Zharkov, Ilya and Liang, Luming.

    [Paper] [Code]

  • [5] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching, NeurIPS 2024.

    Ma, Xinyin and Fang, Gongfan and Mi, Michael Bi and Wang, Xinchao.

    [Paper] [Code]

  • [6] Cache Me if You Can: Accelerating Diffusion Models through Block Caching, CVPR 2024.

    Wimbauer, Felix and Wu, Bichen and Schoenfeld, Edgar and Dai, Xiaoliang and Hou, Ji and He, Zijian and Sanakoyeu, Artsiom and Zhang, Peizhao and Tsai, Sam and Kohler, Jonas and others.

    [Paper] [Code]

  • [7] Token Caching for Diffusion Transformer Acceleration, arXiv 2024.

    Jinming Lou and Wenyang Luo and Yufan Liu and Bing Li and Xinmiao Ding and Weiming Hu and Jiajiong Cao and Yuming Li and Chenguang Ma.

    [Paper] [Code]

  • [8] HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration, arXiv 2024.

    Yushi Huang and Zining Wang and Ruihao Gong and Jing Liu and Xinjie Zhang and Jun Zhang.

    [Paper] [Code]

  • [9] Accelerating Diffusion Transformers with Token-wise Feature Caching, arXiv 2024.

    Chang Zou and Xuyang Liu and Ting Liu and Siteng Huang and Linfeng Zhang.

    [Paper] [Code]

  • [10] FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality, arXiv 2024.

    Zhengyao Lv and Chenyang Si and Junhao Song and Zhenyu Yang and Yu Qiao and Ziwei Liu and Kwan-Yee K. Wong.

    [Paper] [Code]

  • [11] Adaptive Caching for Faster Video Generation with Diffusion Transformers, arXiv 2024.

    Kumara Kahatapitiya and Haozhe Liu and Sen He and Ding Liu and Menglin Jia and Michael S. Ryoo and Tian Xie.

    [Paper] [Code]

  • [12] Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing, arXiv 2024.

    Kaifeng Gao and Jiaxin Shi and Hanwang Zhang and Chunping Wang and Jun Xiao and Long Chen.

    [Paper] [Code]

  • [13] Accelerating Vision Diffusion Transformers with Skip Branches, arXiv 2024.

    Guanjie Chen and Xinyu Zhao and Yucheng Zhou and Tianlong Chen and Cheng Yu.

    [Paper] [Code]

  • [14] MD-DiT: Step-aware Mixture-of-Depths for Efficient Diffusion Transformers, NeurIPSw 2024.

    Mingzhu Shen, Pengtao Chen, Peng Ye, Guoxuan Xia, Tao Chen, Christos-Savvas Bouganis, Yiren Zhao.

    [Paper] [Code]

  • [15] E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modelings, arXiv 2024.

    Zhihang Yuan and Yuzhang Shang and Hanling Zhang and Tongcheng Fang and Rui Xie and Bingxin Xu and Yan Yan and Shengen Yan and Guohao Dai and Yu Wang.

    [Paper] [Code]

  • [16] Accelerating Diffusion Transformers with Dual Feature Caching, arXiv 2024.

    Chang Zou and Evelyn Zhang and Runlin Guo and Haohang Xu and Conghui He and Xuming Hu and Linfeng Zhang.

    [Paper] [Code]

  • [17] Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model, arXiv 2024.

    Omid Saghatchian and Atiyeh Gh. Moghadam and Ahmad Nickabadi.

    [Paper] [Code]

  • [18] Token Pruning for Caching Better: 9 Times Acceleration on Stable Diffusion for Free, arXiv 2024.

    Evelyn Zhang and Bang Xiao and Jiayi Tang and Qianli Ma and Chang Zou and Xuefei Ning and Xuming Hu and Linfeng Zhang.

    [Paper] [Code]

  • [19] Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model, arXiv 2024.

    Liu, Feng and Zhang, Shiwei and Wang, Xiaofeng and Wei, Yujie and Qiu, Haonan and Zhao, Yuzhong and Zhang, Yingya and Ye, Qixiang and Wan, Fang.

    [Paper] [Code]

  • [20] Noise Prediction Can Be Adaptively Skipped for Different Prompts Without Training, NeurIPS 2024.

    Hancheng Ye and Jiakang Yuan and Renqiu Xia and Xiangchao Yan and Tao Chen and Junchi Yan and Botian Shi and Bo Zhang.

    [Paper] [Code]
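
The caching methods above share one observation: intermediate features change slowly across adjacent denoising steps, so heavy blocks can be recomputed only occasionally and reused in between. The wrapper below is a toy illustration of that idea with a hypothetical refresh_every interval; real methods choose what to cache (residuals, tokens, attention maps) and when to refresh far more carefully.

```python
import torch
import torch.nn as nn

class CachedBlock(nn.Module):
    """Recompute the wrapped block only every `refresh_every` denoising steps."""
    def __init__(self, block: nn.Module, refresh_every: int = 2):
        super().__init__()
        self.block = block
        self.refresh_every = refresh_every
        self._cache = None

    def forward(self, x: torch.Tensor, step: int) -> torch.Tensor:
        if self._cache is None or step % self.refresh_every == 0:
            self._cache = self.block(x)        # full computation, result is cached
        return self._cache                     # reused on the skipped steps

block = CachedBlock(nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 64)))
x = torch.randn(1, 16, 64)
for step in range(4):
    y = block(x, step)                         # recomputed at steps 0 and 2, reused at 1 and 3
```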

Dynamic Neural Networks

  • [1] Dynamic Diffusion Transformer, ICLR 2025.

    Wangbo Zhao and Yizeng Han and Jiasheng Tang and Kai Wang and Yibing Song and Gao Huang and Fan Wang and Yang You.

    [Paper] [Code]

  • [2] HeadRouter: A Training-free Image Editing Framework for MM-DiTs by Adaptively Routing Attention Heads, arXiv 2024.

    Yu Xu and Fan Tang and Juan Cao and Yuxin Zhang and Xiaoyu Kong and Jintao Li and Oliver Deussen and Tong-Yee Lee.

    [Paper] [Code]

  • [3] DiffiT: Diffusion Vision Transformers for Image Generation, ECCV 2024.

    Hatamizadeh, Ali and Song, Jiaming and Liu, Guilin and Kautz, Jan and Vahdat, Arash.

    [Paper] [Code]

  • [4] Mixture of Efficient Diffusion Experts Through Automatic Interval and Sub-Network Selection, ECCV 2024.

    Alireza Ganjdanesh and Yan Kang and Yuchen Liu and Richard Zhang and Zhe Lin and Heng Huang.

    [Paper] [Code]
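
A recurring ingredient in these works is routing: at each step only a subset of tokens, heads, or layers receives full computation. The sketch below is a toy top-k token router in PyTorch; the keep_ratio, the linear scoring rule, and the class name RoutedBlock are illustrative assumptions rather than any paper's design.

```python
import torch
import torch.nn as nn

class RoutedBlock(nn.Module):
    """Route only the top-k scored tokens through the expensive sub-block."""
    def __init__(self, dim: int, keep_ratio: float = 0.5):
        super().__init__()
        self.router = nn.Linear(dim, 1)
        self.block = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: [B, N, D]
        B, N, D = x.shape
        k = max(1, int(N * self.keep_ratio))
        scores = self.router(x).squeeze(-1)                # [B, N] routing scores
        idx = scores.topk(k, dim=1).indices                # tokens that get full compute
        gather_idx = idx.unsqueeze(-1).expand(-1, -1, D)
        selected = torch.gather(x, 1, gather_idx)
        updated = selected + self.block(selected)          # residual update for routed tokens
        return x.scatter(1, gather_idx, updated)           # untouched tokens pass through

x = torch.randn(2, 128, 64)
print(RoutedBlock(64)(x).shape)                            # torch.Size([2, 128, 64])
```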

Deployment Optimization

  • [1] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models, CVPR 2024 Highlight.

    Li, Muyang and Cai, Tianle and Cao, Jiaxin and Zhang, Qinsheng and Cai, Han and Bai, Junjie and Jia, Yangqing and Li, Kai and Han, Song.

    [Paper] [Code]

  • [2] PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models, arXiv 2024.

    Wang, Jiannan and Fang, Jiarui and Li, Aoyu and Yang, PengCheng.

    [Paper] [Code]

  • [3] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising, NeurIPS 2024.

    Chen, Zigeng and Ma, Xinyin and Fang, Gongfan and Tan, Zhenxiong and Wang, Xinchao.

    [Paper] [Code]

  • [4] Fast and Memory-Efficient Video Diffusion Using Streamlined Inference, NeurIPS 2024.

    Zheng Zhan and Yushu Wu and Yifan Gong and Zichong Meng and Zhenglun Kong and Changdi Yang and Geng Yuan and Pu Zhao and Wei Niu and Yanzhi Wang.

    [Paper] [Code]

  • [5] Partially Conditioned Patch Parallelism for Accelerated Diffusion Model Inference, arXiv 2024.

    XiuYu Zhang and Zening Luo and Michelle E. Lu.

    [Paper] [Code]
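
Patch-parallel methods such as DistriFusion split the latent spatially so that each device denoises one patch at every step. The single-process sketch below only shows the split/denoise/reassemble data flow; denoise_patch is a placeholder, and real systems additionally exchange boundary context between devices and overlap that communication with computation.

```python
import torch

def denoise_patch(patch: torch.Tensor) -> torch.Tensor:
    # Placeholder for one device's diffusion-model call on its patch.
    return patch

latent = torch.randn(1, 4, 64, 64)                  # [B, C, H, W]
patches = torch.chunk(latent, chunks=4, dim=2)      # split along height
outputs = [denoise_patch(p) for p in patches]       # in practice, one patch per GPU
latent = torch.cat(outputs, dim=2)                  # reassemble the full latent
print(latent.shape)                                 # torch.Size([1, 4, 64, 64])
```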

📌 Citation

Please consider giving a star ⭐ and citing 📝 this repository if you find it useful.

@misc{xuyang2024acceleration,
  author = {Xuyang Liu},
  title = {Awesome-Generation-Acceleration},
  year = {2024},
  url = {https://github.com/xuyang-liu16/Awesome-Generation-Acceleration},
}

💻 Related Works

  • Awesome Token Reduction for Model Compression: An open-source repository that curates a collection of recent awesome papers on token reduction for model compression.
  • ToCa: A training-free acceleration method achieving nearly lossless acceleration of 1.51× on FLUX, 1.93× on PixArt-α, and 2.36× on OpenSora!
  • GlobalCom2: A "global-to-local" approach for training-free acceleration of high-resolution MLLMs with AnyRes strategy.
  • FiCoCo: A systematic study that proposes a unified "filter-correlate-compress" paradigm for training-free token reduction in MLLMs.

Stars Trends

Star History Chart

🧑‍💻 Contributors

👏 Thanks to these contributors for their excellent work!