kubernetes-sigs/gateway-api-inference-extension

Gateway API Inference Extension

The Gateway API Inference Extension came out of wg-serving and is sponsored by SIG Network. This repo contains the load balancing algorithm, ext-proc code, CRDs, and controllers of the extension.

This extension is intended to provide value to multiplexed LLM services on a shared pool of compute. See the proposal for more info.

Status

This project is currently in development.

For quicker testing, our proof of concept (PoC) is in the ./examples/ directory.

Getting Started

Install the CRDs into the cluster:

make install

Delete the APIs (CRDs) from the cluster:

make uninstall

Deploying the ext-proc image

Refer to this README for instructions on deploying the ext-proc image.

Contributing

Our community meeting is held weekly on Thursdays at 10 AM PDT; the Zoom link is here.

We currently use the #wg-serving Slack channel for communication.

Contributions are welcome. Thanks for joining us!

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.