🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.
Updated May 14, 2025 - Python
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Amazon SageMaker Llama 2 Inference via Response Streaming
This repo introduces MagicData-CLAM, a Chinese SFT dataset, and provides the community with two related models that we fine-tuned. Contact business@magicdatatech.com for more information.
GUI version of text-generation-inference
This project demonstrates fine-tuning the Qwen2.5-3B-Instruct model with GRPO (Group Relative Policy Optimization) on the GSM8K dataset.
Python-only RisuAI backend. TextGen works; memory-related updates are still needed.
Deploy the Defog sqlcoder2 LLM on Modal (https://modal.com) using Hugging Face Text Generation Inference (TGI)
Serve the AI Singapore SEA-LION model ⚛ with TGI
Text Generation Inference example on Windows (Docker and WSL are required)