TT-Studio enables rapid deployment of TT Inference servers locally and is optimized for Tenstorrent hardware. This guide explains how to set up and use TT-Studio in both standard and development modes.
- [Prerequisites](#prerequisites)
- [Overview](#overview)
- [Quick Start](#quick-start)
  - [For General Users](#for-general-users)
    - [Clone the Repository](#clone-the-repository)
    - [Set Up the Model Weights](#set-up-the-model-weights)
    - [Run the App via `startup.sh`](#run-the-app-via-startupsh)
  - [For Developers](#for-developers)
- [Using `startup.sh`](#using-startupsh)
- [Documentation](#documentation)
  - [Frontend Documentation](#frontend-documentation)
  - [Backend API Documentation](#backend-api-documentation)
  - [Running vLLM Models in TT-Studio](#running-vllm-models-in-tt-studio)
## Prerequisites

- **Docker**: Ensure that Docker is installed on your machine. You can refer to the official [Docker installation guide](https://docs.docker.com/engine/install/).
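If you want to verify the installation first, here is a minimal check (it assumes the Compose v2 plugin, which the `docker compose` commands in this guide rely on):

```bash
# Verify Docker and the Compose v2 plugin are available
docker --version
docker compose version
```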
## Quick Start

### For General Users

To set up TT-Studio:

1. **Clone the Repository:**

   ```bash
   git clone https://github.com/tenstorrent/tt-studio.git
   cd tt-studio
   ```

2. **Choose and Set Up the Model:**

   Select your desired model and configure its corresponding weights by following the instructions in [HowToRun_vLLM_Models.md](HowToRun_vLLM_Models.md).

3. **Run the Startup Script:**

   ```bash
   ./startup.sh
   ```

   See [Using `startup.sh`](#using-startupsh) for more information on the command-line arguments available within the startup script.

4. **Access the Application:**

   The app will be available at [http://localhost:3000](http://localhost:3000).
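   For a quick sanity check from the terminal (an optional step, assuming `curl` is installed):

   ```bash
   # A 200 response indicates the frontend is serving
   curl -I http://localhost:3000
   ```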
5. **Cleanup:**

   To stop and remove Docker services, run:

   ```bash
   ./startup.sh --cleanup
   ```
6. **Running on a Remote Machine:**

   To forward traffic between your local machine and a remote server, enabling you to access the frontend application in your local browser, use the following SSH command to port forward the frontend port:

   ```bash
   # Port forward frontend (3000) to allow local access from the remote server
   ssh -L 3000:localhost:3000 <username>@<remote_server>
   ```
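   If you also need the backend API reachable locally, you can forward it in the same session (a sketch, assuming the backend listens on port 8000 as in the Django development server command shown in the developer section):

   ```bash
   # Forward both the frontend (3000) and the backend API (8000)
   ssh -L 3000:localhost:3000 -L 8000:localhost:8000 <username>@<remote_server>
   ```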
> ⚠️ **Note:** To use Tenstorrent hardware, select "yes" when the `startup.sh` script prompts you to mount hardware. This will automatically configure the necessary settings, eliminating manual edits to `docker-compose.yml`.
### For Developers

Developers can control and run the app directly via `docker compose`. Keeping it running in a terminal allows for hot reload of the frontend app.
1. **Start the Application:**

   Navigate to the project directory and start the application:

   ```bash
   cd tt-studio/app
   docker compose up --build
   ```

   Alternatively, run the backend and frontend servers interactively:

   ```bash
   docker compose up
   ```

   To force a rebuild of Docker images:

   ```bash
   docker compose up --build
   ```
2. **Hot Reload & Debugging:**

   - The frontend supports hot reloading when running inside the `docker compose` environment. Ensure that the required lines (71-73) in `docker-compose.yml` are uncommented.
   - Local files in `./api` are mounted to `/api` within the container for development.
   - Code changes trigger an automatic rebuild and redeployment of the Django server.
   - To manually start the Django development server:

     ```bash
     ./manage.py runserver 0.0.0.0:8000
     ```
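   While iterating, it can help to follow the container logs to watch the automatic rebuild and redeploy happen (an optional step using the standard Compose CLI):

   ```bash
   # Stream logs from all services; append a service name to narrow the output
   docker compose logs -f
   ```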
3. **Stopping the Services:**

   To shut down the application and remove running containers:

   ```bash
   docker compose down
   ```
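   If you also want to remove the volumes Compose created (an optional, more thorough cleanup that deletes any state stored in them):

   ```bash
   # Remove containers, networks, and named volumes
   docker compose down --volumes
   ```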
4. **Using the Mock vLLM Model:**

   - For local testing, you can use the Mock vLLM model, which generates a random set of characters as output.
   - Instructions to run it are available in [HowToRun_vLLM_Models.md](HowToRun_vLLM_Models.md).
5. **Running on a Machine with Tenstorrent Hardware:**

   To run TT-Studio on a device with Tenstorrent hardware, you need to uncomment specific lines in the `app/docker-compose.yml` file. Follow these steps:

   1. Navigate to the `app` directory:

      ```bash
      cd app/
      ```

   2. Open the `docker-compose.yml` file in an editor (e.g., `vim`, or a code editor like VS Code):

      ```bash
      vim docker-compose.yml
      # or
      code docker-compose.yml
      ```

   3. Uncomment the lines marked with a `#!` flag to enable Tenstorrent hardware support:

      ```yaml
      #* DEV: Uncomment devices to use Tenstorrent hardware
      #! devices:
      #*   mounts all Tenstorrent devices to the backend container
      #!   - /dev/tenstorrent:/dev/tenstorrent
      ```

   By uncommenting these lines, Docker will mount the Tenstorrent device (`/dev/tenstorrent`) to the backend container, allowing the container to utilize the Tenstorrent hardware for running machine learning models directly on the card.
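   After restarting with `docker compose up`, you can confirm the device node is visible inside the container (a quick check; `backend` is an assumed service name here, so consult `docker compose ps` for the actual one):

   ```bash
   # List the mounted Tenstorrent device from inside the backend container
   docker compose exec backend ls -l /dev/tenstorrent
   ```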
## Using `startup.sh`

The `startup.sh` script automates the TT-Studio setup process. It can be run with or without Docker, depending on your usage scenario.

To use the startup script, run:

```bash
./startup.sh [options]
```

| Option      | Description                                                   |
|-------------|---------------------------------------------------------------|
| `--help`    | Display help message with usage details.                      |
| `--setup`   | Run the `setup.sh` script with sudo privileges for all steps. |
| `--cleanup` | Stop and remove all Docker services.                          |

To display the same help section in the terminal, run:

```bash
./startup.sh --help
```

If a Tenstorrent device (`/dev/tenstorrent`) is detected, the script will prompt you to mount it.
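To see in advance whether the script will detect hardware, you can check for the device node yourself (an optional check):

```bash
# A Tenstorrent device is present if this node exists
ls -l /dev/tenstorrent
```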
## Documentation

- **Frontend Documentation**: [app/frontend/README.md](app/frontend/README.md)
  Detailed documentation about the frontend of TT-Studio, including setup, development, and customization guides.
- **Backend API Documentation**: [app/api/README.md](app/api/README.md)
  Information on the backend API, powered by Django Rest Framework, including available endpoints and integration details.
- **Running vLLM Model(s) and Mock vLLM Model in TT-Studio**: [HowToRun_vLLM_Models.md](HowToRun_vLLM_Models.md)
  Step-by-step instructions on how to configure and run the vLLM model(s) using TT-Studio.
- **Contribution Guide**: [CONTRIBUTING.md](CONTRIBUTING.md)
  If you're interested in contributing to the project, please refer to our contribution guidelines. This includes setting up a development environment, code standards, and the process for submitting pull requests.
- **Frequently Asked Questions (FAQ)**: [FAQ.md](FAQ.md)
  A compilation of frequently asked questions to help users quickly solve common issues and understand key features of TT-Studio.