rc v1.2.0 #174
Conversation
* remove HF token from .env in tt-studio
* startup.sh creates HOST_PERSISTENT_STORAGE_VOLUME if it doesn't exist
* startup.sh uses the safety setting `set -euo pipefail`
* remove HF_TOKEN from app/docker-compose.yml
* remove VLLM_LLAMA31_ENV_FILE, now redundant
* add Llama 3.x integration using the new setup.sh and LLM code base
* support multiple models using the same container; adds support for the MODEL_ID environment variable in tt-inference-server
* update volume initialization for the new file-permissions strategy
* add SetupTypes to handle different first-run and validation behaviour
* hf_model_id is used to define model_id and model_name if provided (rename hf_model_path to hf_model_id)
* change /home/user/cache_root to /home/container_app_user/cache_root
* fix get_devices_mounts, add mapping
* use MODEL_ID, if present in container env_vars, to map to the impl model config
* set defaults for ModelImpl
* add configs for Llama 3.x models
* remove HF_TOKEN from the tt-studio .env for ease of setup
* add environment file processing
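The startup.sh hardening described above (fail-fast flags plus on-demand creation of the persistent storage volume) can be sketched as follows. This is an illustrative sketch, not the actual script: the default path and the final echo are assumptions, while the `HOST_PERSISTENT_STORAGE_VOLUME` variable name and the `set -euo pipefail` flags come from the changelog.

```shell
#!/usr/bin/env bash
# Exit on any command failure (-e), on use of an unset variable (-u),
# and propagate failures through pipelines (pipefail).
set -euo pipefail

# HOST_PERSISTENT_STORAGE_VOLUME normally comes from the environment or .env;
# the fallback path here is purely illustrative.
HOST_PERSISTENT_STORAGE_VOLUME="${HOST_PERSISTENT_STORAGE_VOLUME:-./tt_studio_persistent_volume}"

# Create the persistent storage volume if it does not already exist.
# mkdir -p is idempotent, so re-running startup.sh is safe.
if [[ ! -d "${HOST_PERSISTENT_STORAGE_VOLUME}" ]]; then
  mkdir -p "${HOST_PERSISTENT_STORAGE_VOLUME}"
fi

echo "persistent volume ready at: ${HOST_PERSISTENT_STORAGE_VOLUME}"
```

Because the script only creates the directory when it is missing, existing volume contents (model weights, caches) survive restarts unchanged.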
tstescoTT reviewed on Feb 6, 2025
tstescoTT reviewed on Feb 6, 2025
bgoelTT approved these changes on Feb 6, 2025
tstescoTT approved these changes on Feb 6, 2025
Merge pending: cherry-picking changes from this PR (#189)
This was linked to issues on Feb 24, 2025
* update readme to reflect the new flow
* fix readme issues
* add Supported models tab pointing to the tt-inference-server readme
* docs: update main readme: add a better quick-start guide and better notes for running in development mode
* docs: re-add Mock model steps
* docs: fix links
* docs: fix vLLM docs
* Update HowToRun_vLLM_Models.md (Co-authored-by: Benjamin Goel <bgoel@tenstorrent.com>)
* Update HowToRun_vLLM_Models.md
Changelog