@pallas-inference

Pallas Inference Server

Pallas is an LLM inference server.

Repositories

Showing 3 of 3 repositories

  • vllm (Public, forked from vllm-project/vllm)

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 0 stars · Apache-2.0 · 5,252 forks · Updated Nov 3, 2024

  • triton-inference-server (Public, forked from triton-inference-server/server)

    The Triton Inference Server provides an optimized cloud and edge inferencing solution.

    Python · 0 stars · BSD-3-Clause · 1,530 forks · Updated Nov 2, 2024

  • unilm-yoco (Public, forked from microsoft/unilm)

    Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

    Python · 0 stars · MIT · 2,640 forks · Updated Oct 30, 2024
