job queues for real-time AI inference


SaladTechnologies/rq

Real-Time Queue on SaladCloud

This repository provides resources to build a custom real-time queue for applications on SaladCloud.

It supports the following use cases:

  • Real-time AI inference for workloads such as LLM text generation, transcription, and image generation.
  • Node-to-node communication within the same container group or across different groups.
  • Performance and resource monitoring for Salad Container Engine (SCE) workloads.

Several customers have successfully implemented similar solutions on SaladCloud, demonstrating the following advantages:

  • Supports both asynchronous and synchronous calls, with results provided in streaming or non-streaming modes.
  • Enables regional deployment to ensure local access and minimize latency.
  • Improves resilience to burst traffic, node failures, and variability in AI inference times.
  • Remains flexible, customizable, and platform-independent.
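
The difference between asynchronous and synchronous calls, and between streaming and non-streaming results, can be illustrated with a minimal in-process sketch. It is not the code from this repository: it assumes a Python standard-library `queue.Queue` as a stand-in for the shared queue (the linked guide builds on Redis), and every name in it (`submit`, `stream`, `call`, `worker`, `fake_infer`) is hypothetical.

```python
import queue
import threading
import uuid

# In-process stand-ins (assumption): a real deployment would use a shared
# store such as Redis so that clients and workers can run on different nodes.
jobs = queue.Queue()              # pending jobs
results = {}                      # job_id -> queue of result chunks
results_lock = threading.Lock()
_DONE = object()                  # sentinel marking the end of a result stream


def submit(prompt):
    """Asynchronous call: enqueue a job and return its id immediately."""
    job_id = uuid.uuid4().hex
    with results_lock:
        results[job_id] = queue.Queue()
    jobs.put((job_id, prompt))
    return job_id


def stream(job_id, timeout=5.0):
    """Streaming mode: yield result chunks as the worker produces them."""
    chunks = results[job_id]
    while True:
        chunk = chunks.get(timeout=timeout)
        if chunk is _DONE:
            break
        yield chunk
    with results_lock:
        results.pop(job_id, None)  # clean up once the stream is consumed


def call(prompt, timeout=5.0):
    """Synchronous, non-streaming call: block until the full result is ready."""
    return "".join(stream(submit(prompt), timeout))


def worker(infer):
    """Worker loop: pull jobs, run inference, publish chunks, then the sentinel."""
    while True:
        item = jobs.get()
        if item is None:           # shutdown signal
            break
        job_id, prompt = item
        out = results[job_id]
        for chunk in infer(prompt):
            out.put(chunk)
        out.put(_DONE)


def fake_infer(prompt):
    """Hypothetical model: echoes the prompt word by word, upper-cased."""
    for word in prompt.split():
        yield word.upper() + " "


# Start one background worker; more workers can be added to absorb bursts.
threading.Thread(target=worker, args=(fake_infer,), daemon=True).start()
```

In this sketch, `call("hello salad")` blocks and returns the whole result, while `submit(...)` followed by `stream(job_id)` returns immediately and lets the caller consume chunks as they arrive. Because the client only talks to the queue, a slow or failed worker delays a job rather than breaking the client connection.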


Please refer to [this guide](https://docs.salad.com/guides/real-time-inference/build-redis-queue) for more details.
