Popular repositories

  1. rmbg-1.4 Public template

    State-of-the-art background removal model, designed to effectively separate foreground from background. <metadata> gpu: T4 | collections: ["HF Transformers"] </metadata>

    Python 19 10

  2. triton-co-pilot Public

    Generate glue code in seconds to simplify your NVIDIA Triton Inference Server deployments.

    Python 19 3

  3. Smaug-72B Public

    Smaug-72B topped the Hugging Face LLM leaderboard and is the first model with an average score of 80, making it the world’s best open-source foundation model.

    Python 16 5

  4. whisper-large-v3 Public template

    State‑of‑the‑art speech recognition model for English, delivering accurate transcription across diverse audio scenarios. <metadata> gpu: T4 | collections: ["CTranslate2"] </metadata>

    Python 15 11

  5. TensorRT-LLM Public

    9

  6. Facebook-bart-cnn Public

    BART model pre-trained on English language, and fine-tuned on CNN Daily Mail. It was introduced in the paper BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Trans…

    Python 8 2

Repositories

Showing 10 of 149 repositories
  • stable-diffusion-xl-turbo Public template

    A distilled and cost-effective variant of SDXL that delivers high-quality text-to-image generation with accelerated inference speed. <metadata> gpu: T4 | collections: ["Diffusers"] </metadata>

    inferless/stable-diffusion-xl-turbo’s past year of commit activity
    Python 3 10 0 0 Updated Feb 21, 2025
  • DeciLM-7B Public
    Python 0 1 0 0 Updated Feb 21, 2025
  • qwq-32b-preview Public template

    A 32B experimental reasoning model for advanced text generation and robust instruction following. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>

    Python 2 2 0 0 Updated Feb 20, 2025
  • whisper-large-v3-turbo Public template

    A turbocharged variant of Whisper large‑v3 for English speech recognition, optimized for lower latency. <metadata> gpu: T4 | collections: ["HF Transformers","Complex Outputs"] </metadata>

    Python 0 4 0 0 Updated Feb 20, 2025
  • mistral-small-24b-instruct Public template

    24B instruction-tuned model, delivering context-aware, reliable responses optimized for performance and efficiency. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>

    Python 0 0 0 0 Updated Feb 19, 2025
  • llama-3.2-3b-instruct Public template

    A compact 3B instruction-tuned model that generates detailed responses across a range of tasks. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>

    Python 0 2 0 0 Updated Feb 18, 2025
  • qwen2.5-vl-7b-instruct Public template

    Vision-Language model that integrates advanced image, video, and text understanding. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>

    Python 0 0 0 0 Updated Feb 15, 2025
  • mistral-7b-instruct-v0.3 Public template

    7B model fine-tuned for precise instruction following and robust contextual understanding. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>

    Python 0 0 0 0 Updated Feb 15, 2025
  • llama-2-7b-gptq Public template

    A 7B conversational model fine-tuned with RLHF, deployable efficiently via vLLM for low-latency serving. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>

    Python 0 12 0 0 Updated Feb 15, 2025
  • llama-2-7b-hf Public template

    A 7B-parameter model fine-tuned for dialogue using supervised learning and RLHF, with support for a context length of up to 4,000 tokens. <metadata> gpu: A10 | collections: ["HF Transformers"] </metadata>

    Python 1 3 0 0 Updated Feb 15, 2025
