    Repositories list

    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      5.2k7025 Updated Jan 15, 2025
    • A safetensors extension to efficiently store sparse quantized tensors on disk
      Python
      Apache License 2.0
      66328 Updated Jan 14, 2025
    • General Information, model certifications, and benchmarks for nm-vllm enterprise distributions
      1910 Updated Jan 14, 2025
    • Fast and memory-efficient exact attention
      C++
      BSD 3-Clause "New" or "Revised" License
      1.4k000 Updated Jan 13, 2025
    • Pytest plugin used by the Release Engineering team
      Python
      Apache License 2.0
      0000 Updated Jan 8, 2025
    • yolov5

      Public
      YOLOv5 in PyTorch > ONNX > CoreML > TFLite
      Python
      GNU General Public License v3.0
      17k2002 Updated Dec 23, 2024
    • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
      Python
      Apache License 2.0
      28k101 Updated Dec 23, 2024
    • axolotl

      Public
      Go ahead and axolotl questions
      Python
      Apache License 2.0
      918001 Updated Dec 20, 2024
    • Neural Magic GHA
      Python
      Apache License 2.0
      0002 Updated Dec 18, 2024
    • guidellm

      Public
      Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
      Python
      Apache License 2.0
      15183109 Updated Dec 11, 2024
    • Benchmarking code for running quantized kernels from vLLM and other libraries
      Python
      0510 Updated Dec 3, 2024
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      2k301 Updated Nov 27, 2024
    • docs

      Public
      Top-level directory for documentation and general content
      MDX
      712004 Updated Nov 25, 2024
    • Fast and memory-efficient exact attention
      C++
      BSD 3-Clause "New" or "Revised" License
      1.4k000 Updated Nov 23, 2024
    • Python
      4000 Updated Nov 21, 2024
    • evalplus

      Public
      NeuralMagic fork of EvalPlus (Rigorous evaluation of LLM-synthesized code - NeurIPS 2023)
      Python
      Apache License 2.0
      116000 Updated Nov 21, 2024
    • graphs

      Public
      Apache License 2.0
      0000 Updated Nov 15, 2024
    • An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
      Jupyter Notebook
      Apache License 2.0
      248000 Updated Nov 12, 2024
    • LLM training code for MosaicML foundation models
      Python
      Apache License 2.0
      538000 Updated Oct 24, 2024
    • nm-vllm

      Public archive
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Other
      5.2k25700 Updated Oct 11, 2024
    • mteb

      Public
      MTEB: Massive Text Embedding Benchmark
      Jupyter Notebook
      Apache License 2.0
      298001 Updated Oct 2, 2024
    • 🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
      Python
      Apache License 2.0
      28k9013 Updated Oct 1, 2024
    • AutoFP8

      Public
      Python
      Apache License 2.0
      24167103 Updated Oct 1, 2024
    • OmniQuant

      Public
      [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
      Python
      MIT License
      58001 Updated Sep 27, 2024
    • An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
      Python
      MIT License
      494000 Updated Sep 16, 2024
    • Supercharge Your Model Training
      Python
      Apache License 2.0
      429000 Updated Aug 27, 2024
    • MixEval

      Public
      NM fork of MixEval compatible with SparseAutoModel.
      Python
      39001 Updated Aug 20, 2024
    • mamba

      Public
      Mamba SSM architecture
      Python
      Apache License 2.0
      1.2k000 Updated Aug 12, 2024
    • Causal depthwise conv1d in CUDA, with a PyTorch interface
      Cuda
      BSD 3-Clause "New" or "Revised" License
      70000 Updated Aug 8, 2024
    • sparseml

      Public
      Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
      Python
      Apache License 2.0
      1492.1k760 Updated Aug 1, 2024