
rc v1.2.0 #174

Merged: 4 commits into main from rc-v1.2.0 on Feb 24, 2025
Conversation

@anirudTT (Contributor) commented on Feb 5, 2025

Changelog

  • remove HF token from .env in tt-studio
  • startup.sh creates HOST_PERSISTENT_STORAGE_VOLUME if it doesn't exist
  • startup.sh uses the safety flags set -euo pipefail
  • remove HF_TOKEN from app/docker-compose.yml
  • remove VLLM_LLAMA31_ENV_FILE, now redundant
  • add Llama 3.x integration using the new setup.sh and LLM code base
  • support multiple models using the same container; adds support for the MODEL_ID environment variable in tt-inference-server
  • update volume initialization for the new file-permissions strategy
  • add SetupTypes to handle different first-run and validation behaviour
  • hf_model_id is used to define model_id and model_name if provided (renames hf_model_path to hf_model_id)
  • /home/user/cache_root changed to /home/container_app_user/cache_root
  • fix get_devices_mounts, add mapping
  • use MODEL_ID, if present in the container env_vars, to map to the impl model config (see the first sketch after this list)
  • set defaults for ModelImpl
  • add configs for Llama 3.x models
  • remove HF_TOKEN from the tt-studio .env for ease of setup
  • add environment file processing (see the second sketch after this list)
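To make the model-selection items above concrete, here is a minimal Python sketch of how a MODEL_ID set in the container environment could map to an impl model config, with SetupTypes distinguishing first-run from validation behaviour and ModelImpl defaults derived from hf_model_id. The class names are taken from the changelog, but everything else here (the config table, the resolver, the example model entries) is an illustrative assumption, not the actual tt-inference-server code.

```python
# Illustrative sketch only -- the bodies of SetupTypes, ModelImpl, and the
# MODEL_CONFIGS table are hypothetical stand-ins, not the real implementation.
import os
from dataclasses import dataclass
from enum import Enum


class SetupTypes(Enum):
    FIRST_RUN = "first_run"    # first run: download weights, build caches
    VALIDATION = "validation"  # validate an existing persistent volume


@dataclass
class ModelImpl:
    hf_model_id: str
    model_id: str = ""    # defaulted from hf_model_id below
    model_name: str = ""  # defaulted from hf_model_id below
    cache_root: str = "/home/container_app_user/cache_root"

    def __post_init__(self) -> None:
        # hf_model_id defines model_id and model_name when not given explicitly
        self.model_id = self.model_id or self.hf_model_id
        self.model_name = self.model_name or self.hf_model_id.split("/")[-1]


# One container image can serve several models; MODEL_ID selects the config.
MODEL_CONFIGS = {
    impl.model_id: impl
    for impl in (
        ModelImpl(hf_model_id="meta-llama/Llama-3.1-8B-Instruct"),
        ModelImpl(hf_model_id="meta-llama/Llama-3.2-1B-Instruct"),
    )
}


def resolve_model_config(env_vars=os.environ) -> ModelImpl:
    """Use MODEL_ID from the container env_vars to pick the impl config."""
    model_id = env_vars.get("MODEL_ID")
    if model_id not in MODEL_CONFIGS:
        raise RuntimeError(f"No impl model config for MODEL_ID={model_id!r}")
    return MODEL_CONFIGS[model_id]
```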
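And for the environment-file-processing item, a minimal sketch assuming simple KEY=VALUE semantics with # comments; process_env_file is a hypothetical helper name, not the shipped parser.

```python
# Hypothetical sketch of environment-file processing: parse KEY=VALUE lines,
# skipping blanks and comments. Not the shipped parser.
from pathlib import Path


def process_env_file(path: str) -> dict[str, str]:
    env_vars: dict[str, str] = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:  # ignore malformed lines with no '='
            env_vars[key.strip()] = value.strip().strip("'\"")
    return env_vars
```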

tstescoTT and others added 2 commits February 5, 2025 12:24
@anirudTT changed the title from Rc v1.2.0 to rc v1.2.0 on Feb 5, 2025
@anirudTT (Contributor, Author) commented:

Merge pending: cherry-picking changes from PR #189

* update readme to reflect new flow

* fix readme issues

* add Supported Models tab pointing to the tt-inference-server readme

* docs: update main readme
  - add a better quick-start guide
  - add better notes for running in development mode

* docs: re-add Mock model steps

* docs: fix links

* docs: fix vLLM docs

* Update HowToRun_vLLM_Models.md

* Update HowToRun_vLLM_Models.md

Co-authored-by: Benjamin Goel <bgoel@tenstorrent.com>
@anirudTT merged commit 4ff3029 into main on Feb 24, 2025 (2 checks passed)
@anirudTT deleted the rc-v1.2.0 branch on February 24, 2025 at 21:49

Successfully merging this pull request may close these issues.

* Update readme for vllm models
* Merge in PR to support llama models
* update HowToRun_vLLM_Models.md