llm-compressor is a Transformers-compatible library for compressing large language models, applying current research in both training-aware and post-training compression techniques to produce models optimized for deployment with vLLM. Built on top of PyTorch and Hugging Face Transformers, it is designed to be flexible and easy to use, enabling quick experimentation.
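As a rough illustration of the post-training workflow, the sketch below applies one-shot GPTQ quantization to a small model. It assumes the `oneshot` entry point and the `GPTQModifier` recipe class from recent versions of the library; exact module paths, parameter names, and supported schemes may differ across releases, so treat this as a sketch rather than canonical usage.

```python
# A minimal post-training (one-shot) quantization sketch.
# Assumes llm-compressor's `oneshot` entry point and GPTQModifier recipe;
# module paths and parameters may vary between library versions.
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot

# Recipe: quantize the weights of all Linear layers to 4 bits (W4A16),
# leaving the output head in full precision.
recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # any HF causal LM
    dataset="open_platypus",                     # calibration data
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
    output_dir="TinyLlama-1.1B-W4A16",
)
```

The resulting directory holds a compressed checkpoint in a Transformers-compatible format, which vLLM can load directly for serving.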