Stable Diffusion model template for Banana's serverless GPU platform.
This is a template for deploying Stable Diffusion models on banana.dev, a serverless platform that lets you run GPU inference without worrying about the infrastructure and deploy your models in a few clicks. You can learn more about Banana here.
For those new to serverless, here is how it fits my use case. I wanted to let users of my site generate images using Stable Diffusion models, but keeping a GPU server running 24/7 is expensive, so I decided to use a serverless platform. This way I only pay for the time my model is actually running, and I don't have to worry about the infrastructure; I can just focus on the model. For reference, my total bill for 800 image generations over a month was only $5.00. Compare that with platforms like rundiffusion, which charges $0.50/hr at its lowest tier, and you can see why I chose a serverless platform.
As mentioned above, this is a model template that you will deploy on Banana. While you can go through their docs for a more detailed explanation, I will provide a brief overview of the steps here.
- Create an account on banana.dev
- Link your payment method to your account (although they do offer free credits for new users)
- Once ready, move on to the usage section below
- Fork or clone this repository
- Go back to the Banana dashboard and click on the "Team" tab on the left sidebar
- You should see an option to manage your GitHub account. Click on it and follow the instructions to link your GitHub account and give Banana permission to access the repository you just forked/cloned
- Click on the "Deploy" tab on the left sidebar. Choose "Deploy from GitHub" and select the repository you just forked/cloned
- That's it! If everything goes according to plan, you should be redirected to the model page, where you can keep track of its build progress. I've noticed that the first deploy always takes a while, around 45 minutes.
- Once the model is deployed, you can go to the "Settings" tab and add build arguments. Do this if you want to change the MODEL_ID - the HuggingFace identifier of the model you want to serve, which you will find on the HuggingFace model page - since this template downloads the model weights from HuggingFace (see the sketch below)
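For context, here is a minimal sketch of how a template like this typically consumes the MODEL_ID build argument at build time. The file name, environment-variable handling, and default model ID below are assumptions for illustration; check the repository's own download script for the exact details.

# download.py (illustrative) - runs at build time so the model weights are baked into the image
import os

import torch
from diffusers import StableDiffusionPipeline

def download_model():
    # MODEL_ID is assumed to be passed in as a build argument / environment variable,
    # e.g. "runwayml/stable-diffusion-v1-5" as shown on the HuggingFace model page
    model_id = os.environ.get("MODEL_ID", "runwayml/stable-diffusion-v1-5")
    # Downloading here caches the weights inside the image, so cold starts
    # don't have to pull several GB from HuggingFace on every boot
    StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)

if __name__ == "__main__":
    download_model()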
Here are some optimizations that I've found to be useful:
- Turn on Turboboot
- Once your model is deployed, go to the "Settings" tab and turn on Turboboot
- This will allow your model to start up much faster, reducing cold start times
- Change other settings as you see fit, but I've found the default parameters to be balanced enough for my use case
After deploying the model, you can use Banana's API to start the inference engine, i.e. the generation and download of images. It's a simple REST API, so feel free to use the code in test.py, which includes some default code (modify the input arguments as you see fit), or the provided Postman collection to test your model.
You will notice that you'll need your banana.dev API Key and Model Key, both of which you will find on the Banana dashboard. Another thing to note is that there are essentially two ways to call the API:
- Get the image result back in the same call you use to start the inference engine
- This is the default behavior, where you will get the image result back in the same call you use to start the inference engine. This means that you will have to wait a while to get a response back (around 45 seconds on average for a 512x512 image with no other generation in queue).
- I find this method difficult to handle exceptions for, especially at scale when multiple users are in the queue waiting for their generations. The call times out after 60 seconds, so you may get the image back, or you may get back a Call ID if the inference engine hasn't finished processing the image yet.
- Example payload:
POST https://api.banana.dev/start/v4/
{ "apiKey": "{{apiKey}}", "modelKey": "{{modelKey}}", "startOnly": false, "modelInputs": { "prompt": "nightclub", ...other model inputs } }
- Example Output:
Inference engine still running:
{ "id": "de874913-cf15-43ba-84be-f8b1121b62b0", "message": "", "created": 1684444779, "apiVersion": "January 11, 2023", "callID": "call_0d82c386-cb9f-4da6-89b4-236c4b69bb47", "finished": false, "modelOutputs": null }
Inference complete:
{ "id": "3b0fc375-6b0c-4552-bacc-5340a61e66d5", "message": "success", "created": 1684445192, "apiVersion": "January 11, 2023", "modelOutputs": [ { "image_base64": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAASwAAACWCAYAA" } ] }
- Get the image result back in a separate call
- This is the method I prefer, where you get back a Call ID in the same call you use to start the inference engine. You can then use this Call ID to get the image result back in a separate call.
- This way, you can handle exceptions much better, since you can just keep polling the API until you get the image result back (see the Python sketch after the examples below).
- Example request/response:
Start inference engine
POST https://api.banana.dev/start/v4/
{ "apiKey": "{{apiKey}}", "modelKey": "{{modelKey}}", "startOnly": true, "modelInputs": { "prompt": "nightclub", ...other model inputs } }
Example Output:
{ "id": "a2394a1a-003c-4d99-a7f6-e8bb9fcb3fe3", "message": "", "created": 1684444926, "apiVersion": "January 11, 2023", "callID": "call_3612a41d-f033-4a3a-866e-f5eb4fed60e3", "finished": false, "modelOutputs": null }
Get image result
POST https://api.banana.dev/check/v4
{ "apiKey": "{{apiKey}}", "callID": "{{callID}}", "longPoll": false }
Example Output (Inference still running):
{ "id": "63fd9e3c-9903-47af-921a-edd93d10897b", "message": "running", "created": 1684445127, "apiVersion": "January 11, 2023", "modelOutputs": null }
Example Output (Inference complete):
{ "id": "3b0fc375-6b0c-4552-bacc-5340a61e66d5", "message": "success", "created": 1684445192, "apiVersion": "January 11, 2023", "modelOutputs": [ { "image_base64": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAASwAAACWCAYAA" } ] }
- Comprehensive logging
- Img2Img models
- ControlNet
See the open issues for a full list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE.txt for more information.
Ehfaz Rezwan - @siliconinjax - ehfaz.rezwan@gmail.com
Project Link: https://github.com/ehfazrezwan/sd-serverless-template