Releases: tenstorrent/tt-metal
v0.56.0-rc3
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, rather than the documentation on the main branch. There may be differences between the latest main and the previous release.
The changelog follows, showing the changes since the last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/13064647165
📦 Uncategorized
- #0: Add a way to specify custom dispatch topology
- PR: #17102
- Initialize work_executor_ and set WorkExecutorMode::SYNCHRONOUS in MeshDevice constructor
- PR: #17120
- [UMD] Switching to new coord API
- PR: #17003
- #0: Increase create heads test coverage for Llama shapes
- PR: #16980
- #16503: Optimize semaphore and CB writes
- PR: #16944
- Quiet down the CMake output from dependencies
- PR: #17008
- #16847: update to address the unaligned noc_async_copy from DRAM to L1
- PR: #17125
- remove ND failing big shape in transpose failures that is already tracked and disabled in test_transpose_2d
- PR: #17145
- #14898: pass in pad value to transpose in reduce
- PR: #17142
- rm -rf build.yaml
- PR: #17150
- #16982: Fixing program cache issues with reshape
- PR: #17140
- Delete convd_host_weights and update all tests using conv2d
- PR: #16264
- Update CODEOWNERS for the public API
- PR: #17149
- Aliu/bug fix
- PR: #17151
- #0: Move distributed headers into the public API directory
- PR: #17161
- [Llama3.2-11b-vision] Add support for text-only inference through generator api
- PR: #17105
- Remove references to ARCH_NAME in programming example
- PR: #17182
- Enable PR Gate
- PR: #17098
- #16806: Fixed watcher assert on reshape in debug mode
- PR: #17152
- #0: Fix doc links and make them point to the new location
- PR: #17181
- Remove ARCH_NAME references in prog_examples
- PR: #17185
- Prefer MOLD over LLD over LD
- PR: #17154
- Restore build-wrapper.yaml with updated method
- PR: #17197
- [TT-Train] Fix text generation
- PR: #17195
- LightMetal - Add Flatbuffers into cmake infra/build as cpm package (#17039)
- PR: #17157
- Format broken Kernel APIs Tables
- PR: #17000
- #0: Use MeshBuffer to store MeshWorkload kernel binaries
- PR: #17113
- Increase rms_norm and layernorm coverage for Llama shapes
- PR: #17180
- #17213: update fused and matmul trace sweep tests
- PR: #17214
- Add support for reading from / writing to partial buffer regions that are page size aligned for sharded buffers
- PR: #17089
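The partial-region support above comes with an alignment precondition: both the region offset and its size must be multiples of the buffer page size. Below is a minimal Python sketch of that check; the function name and error handling are illustrative, not the tt-metal API.

```python
# Hypothetical sketch of the page-alignment precondition for partial
# buffer reads/writes. Names here are illustrative, not the tt-metal API.

def validate_partial_region(offset_bytes: int, size_bytes: int, page_size: int) -> None:
    """Raise if a partial buffer region is not page-size aligned."""
    if offset_bytes % page_size != 0:
        raise ValueError(f"region offset {offset_bytes} is not a multiple of page size {page_size}")
    if size_bytes % page_size != 0:
        raise ValueError(f"region size {size_bytes} is not a multiple of page size {page_size}")

# Example: with 2048-byte pages, a region starting at page 3 spanning 4 pages is valid.
validate_partial_region(offset_bytes=3 * 2048, size_bytes=4 * 2048, page_size=2048)
```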
- #0: Fix clang-format for dataflow_api.h
- PR: #17234
- Kkabilar tt single card perf
- PR: #17231
- [FABRIC] ASYNC_WR_ATOMIC_INC
- PR: #17072
- #9945: Enable and fix SD device perf test
- PR: #17025
- Check context switch pointer for eth cores before resetting
- PR: #17212
- Pull llrt.hpp out of public interface
- PR: #17196
- Update perf and latest features for llm models (Jan 27)
- PR: #17188
- #0: (MINOR) Bump to generate RCs for v0.57.0
- PR: #17252
- Remove dead includes of host_api.hpp from ttnn
- PR: #17220
- Prevent UNet Shallow perf report entry from being overwritten
- PR: #17235
- Fix setup.py for Anaconda
- PR: #17111
- Do not run PR Gate on Draft PRs
- PR: #17272
- Add a timeout for docker image building
- PR: #17285
- LightMetal - New APIs LightMetalBeginCapture() and LightMetalEndCapture() and docs (#17039)
- PR: #17262
- #0: Update distributed tests build to account for arch
- PR: #17287
- #17227: Make dispatch core order match for single chip 2 CQ and multichip 2 CQ topologies
- PR: #17274
- #17215: Add explicit dealloc for mesh buffer
- PR: #17265
- #0: Add validation test for dispatched remote circular buffer config to device
- PR: #17233
- Remove get_completion_queue_reader_core() API from Device
- PR: #17263
- Add resharding to post all gather layernorm/ rms norm op
- PR: #17156
- #0: Fix ttnn shared libs build
- PR: #17127
- #0: Schedule runs for single card new models tests
- PR: #17141
- Implement JointAttention
- PR: #17079
- Revert "Add resharding to post all gather layernorm/ rms norm op (#17156)
- PR: #17304
- Update memory config when using view op with height sharded tensors
- PR: #17266
- #16812: Reordering cbs in reduce_init_delta
- PR: #16981
- #17083: Add support for watcher printing phys coords
- PR: #17244
- #16945: Add auto retries to post commit on branches
- PR: #16946
- Remove CommandQueue redirecting usages straight to HWCQ
- PR: #17219
- 1D support for tilize/reshape ops
- PR: #17238
- #16138: W-broadcasting for sharded tensors
- PR: #17101
- #0: Add PR Gate to data pipeline
- PR: #17325
- #15174: Re-enable mistral7b demo test after fw upgrade
- PR: #17305
- LightMetal - Add LoadTrace() API and move TraceDescriptor out of detail namespace (#17039)
- PR: #17313
- #15974: Create device tensors table in report database
- PR: #17293
- Privatize dprint_server.hpp
- PR: #17298
- Uplift Allocator to be its own class + migrate calls to Allocator APIs
- PR: #17268
- Bump CMake in the Docker image
- PR: #17273
- Add perf reporting for ccl async mode
- PR: #16658
- Fix debug checks for bank assignments when initializing the allocator
- PR: #17357
- Add perf report for reduce scatter async
- PR: #17223
- #0: expand halo documentation and fix images
- PR: #16802
- Fix shard and physical height mismatch in ttnn.convert_to_chw tests
- PR: #17258
- #17134: Remove unused components
- PR: #17301
- #0: Fix retry comparison which causes endless retries until pass
- PR: #17367
- Make runtime_args_ptr const ref to solve clang-tidy error due to 58f9654
- PR: #17364
- Use padded shape to construct output memory config in ttnn.view
- PR: #17366
- #17322 Remove transpose cpp unit tests
- PR: #17326
- #17215: Initial MeshBuffer integration with TTNN
- PR: #17259
- fix sub-height height sharded WH transpose by setting output memory config to width sharded
- PR: #17147
- #0: Only produce cicd data on workflow runs that are success/failure/cancelled (ignore skipped runs)
- PR: #17371
- #0: Use posted writes for profiler.
- PR: #17261
- #16149: Add DeviceCommandCalculator to calculate command size
- PR: #17260
- #15889: Fix handling of mantissa rounding to respect ties round to even
- PR: #16997
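For context on the rounding fix above, a minimal Python sketch of round-half-to-even on truncated mantissa bits follows. It illustrates the technique only; the actual fix in #16997 applies to packed hardware number formats.

```python
# Minimal sketch of ties-to-even rounding when dropping low mantissa bits
# (e.g. float32 -> bfloat16 keeps the top bits of the mantissa).

def round_mantissa_ties_to_even(mantissa: int, drop_bits: int) -> int:
    """Round an integer mantissa by dropping `drop_bits` low bits, ties to even."""
    if drop_bits == 0:
        return mantissa
    kept = mantissa >> drop_bits
    remainder = mantissa & ((1 << drop_bits) - 1)
    half = 1 << (drop_bits - 1)
    if remainder > half or (remainder == half and (kept & 1)):
        kept += 1  # round up; on an exact tie, only if the kept value is odd
    return kept

assert round_mantissa_ties_to_even(0b1010_1000, 4) == 0b1010  # tie, kept value even: unchanged
assert round_mantissa_ties_to_even(0b1011_1000, 4) == 0b1100  # tie, kept value odd: rounds up
```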
- #17312: Fix type error when saving report config to json file
- PR: #17350
- Implement ttnn.sampling op for top-k, top-p sampling
- PR: #17136
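As a reference for the sampling op above, the following NumPy sketch shows combined top-k / top-p (nucleus) sampling on host. It demonstrates the technique, not the ttnn.sampling signature, which may differ.

```python
# Reference sketch of top-k / top-p sampling; illustrative only.
import numpy as np

def sample_top_k_top_p(logits: np.ndarray, k: int, p: float, rng: np.random.Generator) -> int:
    # Keep the k highest logits and form a softmax over just that set.
    top_k_idx = np.argsort(logits)[-k:]
    probs = np.zeros_like(logits)
    exp = np.exp(logits[top_k_idx] - logits[top_k_idx].max())
    probs[top_k_idx] = exp / exp.sum()
    # Within the top-k set, keep the smallest prefix whose mass reaches p.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1
    keep = order[:cutoff]
    renorm = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=renorm))

rng = np.random.default_rng(0)
token = sample_top_k_top_p(np.array([2.0, 1.0, 0.5, -1.0]), k=3, p=0.9, rng=rng)
```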
- Test for a bad state before building
- PR: #17379
- #0: Fix incorrect assertion introduced in #17259
- PR: #17386
- #0: Add missing include for work_split.hpp
- PR: #17390
- Replace ttnn::Shape/LegacyShape with SimpleShape in Python
- PR: #17341
- #0: Correcting bad dim check in CCL tests
- PR: #17392
- [tt-train] Add scatter workaround while proper version is in development
- PR: #17384
- [skip-ci] Bump timeout
- PR: #17397
- #15414: Read annotation data to determine job-level failure signature and reason
- PR: #17308
- #0: TT-Mesh bug fix MeshCQ/MeshWorkload on device indexing
- PR: #17333
- #17374: Add concurrency group for _produce-data.yaml
- PR: #17402
- Revert "#17374: Add concurrency group for _produce-data.yaml"
- PR: #17404
- #0: Fix broken link for "Programming Mesh of Devices" tech report
- PR: #17400
- Revert "#15414: Read annotation data to determine job-level failure signature and reason"
- PR: #17409
- Debug strange error
- PR: #17381
v0.56.0-rc2
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, rather than the documentation on the main branch. There may be differences between the latest main and the previous release.
The changelog follows, showing the changes since the last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/13043899749
📦 Uncategorized
- #0: Add a way to specify custom dispatch topology
- PR: #17102
- Initialize work_executor_ and set WorkExecutorMode::SYNCHRONOUS in MeshDevice constructor
- PR: #17120
- [UMD] Switching to new coord API
- PR: #17003
- #0: Increase create heads test coverage for Llama shapes
- PR: #16980
- #16503: Optimize semaphore and CB writes
- PR: #16944
- Quiet down the CMake output from dependencies
- PR: #17008
- #16847: update to address the unaligned noc_async_copy from DRAM to L1
- PR: #17125
- remove ND failing big shape in transpose failures that is already tracked and disabled in test_transpose_2d
- PR: #17145
- #14898: pass in pad value to transpose in reduce
- PR: #17142
- rm -rf build.yaml
- PR: #17150
- #16982: Fixing program cache issues with reshape
- PR: #17140
- Delete convd_host_weights and update all tests using conv2d
- PR: #16264
- Update CODEOWNERS for the public API
- PR: #17149
- Aliu/bug fix
- PR: #17151
- #0: Move distributed headers into the public API directory
- PR: #17161
- [Llama3.2-11b-vision] Add support for text-only inference through generator api
- PR: #17105
- Remove references to ARCH_NAME in programming example
- PR: #17182
- Enable PR Gate
- PR: #17098
- #16806: Fixed watcher assert on reshape in debug mode
- PR: #17152
- #0: Fix doc links and make them point to the new location
- PR: #17181
- Remove ARCH_NAME references in prog_examples
- PR: #17185
- Prefer MOLD over LLD over LD
- PR: #17154
- Restore build-wrapper.yaml with updated method
- PR: #17197
- [TT-Train] Fix text generation
- PR: #17195
- LightMetal - Add Flatbuffers into cmake infra/build as cpm package (#17039)
- PR: #17157
- Format broken Kernel APIs Tables
- PR: #17000
- #0: Use MeshBuffer to store MeshWorkload kernel binaries
- PR: #17113
- Increase rms_norm and layernorm coverage for Llama shapes
- PR: #17180
- #17213: update fused and matmul trace sweep tests
- PR: #17214
- Add support for reading from / writing to partial buffer regions that are page size aligned for sharded buffers
- PR: #17089
- #0: Fix clang-format for dataflow_api.h
- PR: #17234
- Kkabilar tt single card perf
- PR: #17231
- [FABRIC] ASYNC_WR_ATOMIC_INC
- PR: #17072
- #9945: Enable and fix SD device perf test
- PR: #17025
- Check context switch pointer for eth cores before resetting
- PR: #17212
- Pull llrt.hpp out of public interface
- PR: #17196
- Update perf and latest features for llm models (Jan 27)
- PR: #17188
- #0: (MINOR) Bump to generate RCs for v0.57.0
- PR: #17252
- Remove dead includes of host_api.hpp from ttnn
- PR: #17220
- Prevent UNet Shallow perf report entry from being overwritten
- PR: #17235
- Fix setup.py for Anaconda
- PR: #17111
- Do not run PR Gate on Draft PRs
- PR: #17272
- Add a timeout for docker image building
- PR: #17285
- LightMetal - New APIs LightMetalBeginCapture() and LightMetalEndCapture() and docs (#17039)
- PR: #17262
- #0: Update distributed tests build to account for arch
- PR: #17287
- #17227: Make dispatch core order match for single chip 2 CQ and multichip 2 CQ topologies
- PR: #17274
- #17215: Add explicit dealloc for mesh buffer
- PR: #17265
- #0: Add validation test for dispatched remote circular buffer config to device
- PR: #17233
- Remove get_completion_queue_reader_core() API from Device
- PR: #17263
- Add resharding to post all gather layernorm/ rms norm op
- PR: #17156
- #0: Fix ttnn shared libs build
- PR: #17127
- #0: Schedule runs for single card new models tests
- PR: #17141
- Implement JointAttention
- PR: #17079
- Revert "Add resharding to post all gather layernorm/ rms norm op (#17156)
- PR: #17304
- Update memory config when using view op with height sharded tensors
- PR: #17266
- #16812: Reordering cbs in reduce_init_delta
- PR: #16981
- #17083: Add support for watcher printing phys coords
- PR: #17244
- #16945: Add auto retries to post commit on branches
- PR: #16946
- Remove CommandQueue redirecting usages straight to HWCQ
- PR: #17219
- 1D support for tilize/reshape ops
- PR: #17238
- #16138: W-broadcasting for sharded tensors
- PR: #17101
- #0: Add PR Gate to data pipeline
- PR: #17325
- #15174: Re-enable mistral7b demo test after fw upgrade
- PR: #17305
- LightMetal - Add LoadTrace() API and move TraceDescriptor out of detail namespace (#17039)
- PR: #17313
- #15974: Create device tensors table in report database
- PR: #17293
- Privatize dprint_server.hpp
- PR: #17298
- Uplift Allocator to be its own class + migrate calls to Allocator APIs
- PR: #17268
- Bump CMake in the Docker image
- PR: #17273
v0.56.0-rc1
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, rather than the documentation on the main branch. There may be differences between the latest main and the previous release.
The changelog follows, showing the changes since the last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/13022911064
📦 Uncategorized
- #0: Add a way to specify custom dispatch topology
- PR: #17102
- Initialize work_executor_ and set WorkExecutorMode::SYNCHRONOUS in MeshDevice constructor
- PR: #17120
- [UMD] Switching to new coord API
- PR: #17003
- #0: Increase create heads test coverage for Llama shapes
- PR: #16980
- #16503: Optimize semaphore and CB writes
- PR: #16944
- Quiet down the CMake output from dependencies
- PR: #17008
- #16847: update to address the unaligned noc_async_copy from DRAM to L1
- PR: #17125
- remove ND failing big shape in transpose failures that is already tracked and disabled in test_transpose_2d
- PR: #17145
- #14898: pass in pad value to transpose in reduce
- PR: #17142
- rm -rf build.yaml
- PR: #17150
- #16982: Fixing program cache issues with reshape
- PR: #17140
- Delete convd_host_weights and update all tests using conv2d
- PR: #16264
- Update CODEOWNERS for the public API
- PR: #17149
- Aliu/bug fix
- PR: #17151
- #0: Move distributed headers into the public API directory
- PR: #17161
- [Llama3.2-11b-vision] Add support for text-only inference through generator api
- PR: #17105
- Remove references to ARCH_NAME in programming example
- PR: #17182
- Enable PR Gate
- PR: #17098
- #16806: Fixed watcher assert on reshape in debug mode
- PR: #17152
- #0: Fix doc links and make them point to the new location
- PR: #17181
- Remove ARCH_NAME references in prog_examples
- PR: #17185
- Prefer MOLD over LLD over LD
- PR: #17154
- Restore build-wrapper.yaml with updated method
- PR: #17197
- [TT-Train] Fix text generation
- PR: #17195
- LightMetal - Add Flatbuffers into cmake infra/build as cpm package (#17039)
- PR: #17157
- Format broken Kernel APIs Tables
- PR: #17000
- #0: Use MeshBuffer to store MeshWorkload kernel binaries
- PR: #17113
- Increase rms_norm and layernorm coverage for Llama shapes
- PR: #17180
- #17213: update fused and matmul trace sweep tests
- PR: #17214
- Add support for reading from / writing to partial buffer regions that are page size aligned for sharded buffers
- PR: #17089
- #0: Fix clang-format for dataflow_api.h
- PR: #17234
- Kkabilar tt single card perf
- PR: #17231
- [FABRIC] ASYNC_WR_ATOMIC_INC
- PR: #17072
- #9945: Enable and fix SD device perf test
- PR: #17025
- Check context switch pointer for eth cores before resetting
- PR: #17212
- Pull llrt.hpp out of public interface
- PR: #17196
- Update perf and latest features for llm models (Jan 27)
- PR: #17188
- #0: (MINOR) Bump to generate RCs for v0.57.0
- PR: #17252
- Remove dead includes of host_api.hpp from ttnn
- PR: #17220
- Prevent UNet Shallow perf report entry from being overwritten
- PR: #17235
- Fix setup.py for Anaconda
- PR: #17111
- Do not run PR Gate on Draft PRs
- PR: #17272
v0.55.0
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, rather than the documentation on the main branch. There may be differences between the latest main and the previous release.
The changelog follows, showing the changes since the last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/13018933285
📦 Uncategorized
- Create an API for running and measuring the runtime of a ttnn op chain for use during forge compilation
- PR: #16921
- [TT-Train] Add bias=false in LinearLayer
- PR: #16707
- TT-Fabric Bringup Initial Check-in
- PR: #16343
- #0: Sanitize writes to mailbox on ethernet cores.
- PR: #16574
- Add Llama11B-N300 and Llama70B-TG (TP=32) to LLM table in README.md
- PR: #16724
- [skip ci] Update llms.md
- PR: #16737
- Update test_slice.py
- PR: #16734
- #16625: Refactor tracking of sub-device managers from Device to a new class
- PR: #16683
- Update code-analysis.yaml
- PR: #16738
- [skip ci] Update llms.md
- PR: #16745
- remove references to LFS
- PR: #16722
- Fixes for conversion to row major for 0D and 0-volume tensors
- PR: #16736
- #0: Disable BH tools test at workflow level
- PR: #16749
- Removing some usages of LegacyShape, improve Tensor::to_string
- PR: #16711
- [skip ci] Fix lint on a doc
- PR: #16751
- #0: API Unification for Device and MeshDevice
- PR: #16570
- Port ttnn::random and uniform from LegacyShape to SimpleShape
- PR: #16744
- #16379: make softmax call moreh_softmax if rank above 4
- PR: #16735
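A hedged usage sketch of the change above: with this fix, ttnn.softmax on a tensor of rank greater than 4 is routed to moreh_softmax internally. The from_torch/layout arguments follow common ttnn usage and may vary between versions.

```python
# Hedged sketch: rank-5 softmax is expected to dispatch to moreh_softmax.
import torch
import ttnn

device = ttnn.open_device(device_id=0)
x = torch.randn(2, 2, 2, 32, 32)  # rank 5: previously unsupported by the default path
t = ttnn.from_torch(x, dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)
out = ttnn.softmax(t, dim=-1)     # routed to moreh_softmax for rank > 4
result = ttnn.to_torch(out)
ttnn.close_device(device)
```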
- #7126: remove skip for test_sd_matmul test
- PR: #16729
- #0: Make device an optional parameter in the tensor distribution API
- PR: #16746
- Added build-wheels to fast-dispatch-build-and-unit-tests-wrapper.yaml
- PR: #16638
- Adding CCL Async test cases to TG nightly and bug fix
- PR: #16700
- #11119: Move op_profiler.hpp under the ttnn folder
- PR: #11167
- #15979: Switch to google benchmark for pgm dispatch tests
- PR: #16547
- [tt-train] Add weight tying option for NanoGPT demo
- PR: #16768
- #0: Fix build of test_pgm_dispatch
- PR: #16773
- [tt-train] Update serialization of tensor for DDP
- PR: #16778
- #0: Fix failing TG regression tests
- PR: #16776
- [skip ci] Update llms.md
- PR: #16775
- Add tiled interleaved permute for when width dimension doesn't move (row-major tiled invariant)
- PR: #16671
- Add Fabric Router Config to Hal
- PR: #16761
- [skip ci] Update llms.md
- PR: #16791
- Reflect ARCH_NAME Changes in CI Workflows
- PR: #16706
- [skip ci] Update llms.md
- PR: #16792
- #0: Migrate pytensor to use from_vector Tensor creation APIs
- PR: #16767
- Afuller/metalium api reorg
- PR: #16578
- Ngrujic/sweep tests 3
- PR: #16316
- #0: Enable nlp create heads tests on BH
- PR: #16777
- Fix to_layout shard bug
- PR: #16754
- Fix broken link to host API
- PR: #16799
- Add noc flag to test stress noc mcast
- PR: #16772
- Set codeowners for transformer ttnn ops
- PR: #16803
- #15450: Remove default value for ocb argument in LLK compute API
- PR: #16376
- Linking tensor.reshape to ttnn.reshape
- PR: #16377
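A brief sketch of the linkage above, assuming the usual ttnn host flow: the tensor's reshape method and the free function ttnn.reshape now resolve to the same operation.

```python
# Hedged sketch of the equivalence introduced by #16377; exact argument
# forms may vary by version.
import torch
import ttnn

device = ttnn.open_device(device_id=0)
t = ttnn.from_torch(torch.randn(32, 64), layout=ttnn.TILE_LAYOUT, device=device)
a = ttnn.reshape(t, (64, 32))  # free-function form
b = t.reshape((64, 32))        # method form, now linked to the same op
ttnn.close_device(device)
```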
- #16646: Fix dangling reference in sharded tensor args
- PR: #16782
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Transpose and Reduce
- PR: #16427
- Add new python api to get architecture name
- PR: #16747
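A hedged sketch of the API above; the entry point is assumed to be ttnn.get_arch_name(), which may not be the exact symbol added by #16747.

```python
# Assumed entry point for querying the architecture name
# (e.g. "wormhole_b0" or "blackhole"); consult the PR for the canonical name.
import ttnn

arch = ttnn.get_arch_name()  # assumption; the symbol may differ
print(f"Running on architecture: {arch}")
```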
- Remove base.hpp
- PR: #16796
- [tt-train] Change weights initialization for GPT-2
- PR: #16815
- [skip ci] Update llms.md
- PR: #16828
- fuse residual add with layernorm
- PR: #16794
- [TT-Train] Add multidevice support to dropout
- PR: #16823
- #16171: Preload kernels before receiving go message
- PR: #16680
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Test Kernels
- PR: #16613
- #16366: Changed kernel config to HiFi4 for 32F matmul
- PR: #16743
- Add nightly APC run in debug mode
- PR: #16831
- [skip ci] Update llms.md
- PR: #16835
- [skip ci] Update llms.md
- PR: #16839
- Remove some ARCH_NAME ENV usage at runtime
- PR: #16825
- Move out tensor storage into a separate .hpp/.cpp
- PR: #16832
- #16460: Add more helpful error message when tt-topology needs to be run
- PR: #16783
- Make creation functions use SimpleShape, expose SimpleShape to Python
- PR: #16826
- #16242: Initial implementation of MeshBuffer
- PR: #16327
- Enable use-override check
- PR: #16842
- Privatize Taskflow
- PR: #16838
- Fix test_new_all_gather.py regressions caused by API unification between Device/MeshDevice
- PR: #16836
- Fix CB allocation warnings from ttnn.reshard
- PR: #16795
- Optimize upsample for bilinear mode
- PR: #16487
- Remove Shape usage from MultiDeviceStorage
- PR: #16841
- Remove redundant bank offset from destination address in ttnn.reshard
- PR: #16800
- Add option to raise error on failed local/global tensor comparison
- PR: #16585
- Padded Shards for Concat Support
- PR: #16765
- #0: Add support for tracing some sub-devices while others are still running programs
- PR: #16810
- #16769: bring up all reduce async as a composite op and added llama shape ccl test sweep
- PR: #16784
- #0: Lower Size to metalium as Shape2D
- PR: #16814
- #15976: Ensure reports insert all devices into the devices table
- PR: #16834
- Modify UNet Shallow to return output in CHW channel ordering
- PR: #16742
- #16758: Optimize usage and implementation of encode/decode tensor data
- PR: #16759
- Device to Device profiler sync
- PR: #16543
- Templating and Queue Size Adjustments for Packet Queue
- PR: #16732
- Refactor Superset model benchmarking tools to use Pydantic classes and save one json
- PR: #16790
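To illustrate the refactor above, here is a minimal Pydantic pattern: benchmark records validated by a schema and serialized to a single JSON file. Class and field names are hypothetical, not the tool's actual schema.

```python
# Hypothetical schema illustrating the "Pydantic classes + one json" pattern.
from pydantic import BaseModel

class BenchmarkRecord(BaseModel):
    name: str
    batch_size: int
    tokens_per_second: float

class BenchmarkReport(BaseModel):
    run_id: str
    records: list[BenchmarkRecord]

# Collect all records for a run and save them as one JSON document.
report = BenchmarkReport(
    run_id="example-run",
    records=[BenchmarkRecord(name="demo", batch_size=32, tokens_per_second=123.4)],
)
with open("benchmark_report.json", "w") as f:
    f.write(report.model_dump_json(indent=2))
```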
- #16078: Fix back-to-back calls of ttnn.close_device()
- PR: #16840
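A sketch of the lifecycle pattern the fix above addresses: opening and closing a device repeatedly within one process should now succeed. Device APIs follow common ttnn usage and may vary by version.

```python
# Hedged sketch: back-to-back open/close cycles in a single process.
import ttnn

for _ in range(2):
    device = ttnn.open_device(device_id=0)
    # ... run work on the device ...
    ttnn.close_device(device)
```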
- #16434: DPRINT to read buffer once
- PR: #16586
- Bring Taskflow from CPM
- PR: #16843
- This file seems to be kernel-only
- PR: #16853
- Minor SDPA optimizations
- PR: #16566
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Eltwise Unary
- PR: #16527
- Fix scaling issue with RT arguments in tilize/untilize with padding
- PR: #16690
- Make stress noc mcast test respect physical coordinates + allow option to skip mcaster
- PR: #16833
- Fix some shapes for Prefetcher + Matmul, Use Multi-device Global CB
- PR: #16764
- Do not build UMD tests
- PR: #16877
- Move risc_attribs back to hw/inc
- PR: #16867
- Re-enable UNet Shallow trace+2CQ test case
- PR: #16875
- Upgrade error message in control plane
- PR: #16863
- #15824 Workaround LLK issue in max_pool
- PR: #16849
- [skip ci] Fixed TG configuration description in documentation
- PR: #16884
- #0: Update pgm_dispatch_golden.json
- PR: #16818
- #0: fix stackoverflow in eth tun
- PR: #16889
- #0: Refactor enqueue_write_buffer
- PR: #16880
- #0: Add skip for mnist tests because I can't take this anymore
- PR: #16891
- #0: Remove SetLazyCommandQueueMode from Metal API
- PR: #16886
- #16868: Update profiler post proc asserts tripping due to kernel preload
- PR: #16872
- #16350: Update reciprocal docs
- PR: #16371
- [skip ci] : Update INSTALLING.md
- PR: #16893
- Remove sharded_to_interleaved workaround in UNet Shallow
- PR: #16770
- Add CI job for running models in comparison mode
- PR: #16808
- pybind expose MeshDevice::reshape
- PR: #16798
- #0: Update sweeps README
- PR: #16902
- Workaround issue #16895, fix PCC checking for wormhole in Resnet50 demo
- PR: #16896
- #0: Refactor enqueue_read_buffer
- PR: #16908
- move device checking outside of invalidate code func
- PR: #16903
- Disable Unstable Transpose 2D Test
- PR: #16781
- New Operation: Fill_Tile_Pad ; Op to fill tile padding with a specific value
- PR: #16785
- #0: Separate HWCommandQueue in it's own header
- PR: #16885
- Update Mamba device performance targets
- PR: #16887
- Changing how we set up the simulator.
- PR: #16375
- Add missing include for types used
- PR: #16934
- Adding active erisc FW for BH + support for compiling this + updating BH eth_l1_address_map
- PR: #16916
- disable test_transpose_2D due to python-side segfault
- PR: #16933
- #16913: Add Model Updates to the Release assets
- PR: #16914
- Add Datagram Sockets to Fabric
- PR: #16830
- [Llama3] Send decode output logits to dram to reduce trace l1 usage and fix 8b-n150 memory crash
- PR: #16924
- Sharding support for binary_ng
- PR: #16789
- Fix mcast end core for stress noc mcast test
- PR: #16947
- #13901: MaxPool Wide Reductions with Non-8-Tile Multiples
- PR: #16544
- Make creation functions use SimpleShape, expose SimpleShape & TensorSpec to Python
- PR: #16865
- Feature/vecadd sharding
- PR: #16654
- Resolve the issue in ubenchmark pipeline
- PR: #16949
- #0: update test_vc_uni_tunnel bw requirement
- PR: #16953
- [tt-train] Fix broken build due to taskflow change
- PR: #16952
- #16415: fix moreh_adam
- PR: #16420
- #16469 Add sharding to vecadd example
- PR: #16959
- Revert "#16469 Add sharding to vecadd example"
- PR: #16961
- Revert "Feature/vecadd sharding"
- PR: #16962
- #13195: Squeezebert using Conv1d Width Sharded
- PR: #16881
- Cleanup of various issues
- PR: #16873
- Add sweeps with pre-allocated output for topk and argmax
- PR: #16898
- #16510: Eltwise sweep test for add and mul + silu - LLama
- PR: #16516
- Fixing variable name to build umd tests
- PR: #16967
- #15246: Add sweeps for acos...
v0.55.0-rc20
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, rather than the documentation on the main branch. There may be differences between the latest main and the previous release.
The changelog follows, showing the changes since the last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/13001851215
📦 Uncategorized
- #11512: Add sweeps for eltwise sharded ops 3
- PR: #16307
- Add sweeps for unary, unary_sharded and binary_sharded versions of ops: fmod, remainder, maximum, minimum.
- PR: #15911
- Don't leak tt_cluster.hpp through kernel_types.hpp
- PR: #16691
- #6983: Re-enable skipped TT-NN unit test
- PR: #16642
- #15450: Remove default values from circular buffer parameters in LLK compute APIs
- PR: #16389
- update build flag on programming examples docs
- PR: #16635
- Fix for P100 board type
- PR: #16718
- Sever TT-Train's dependency on TT-Metalium's tests
- PR: #16685
- [TT-Train] Update generate of LLM
- PR: #16723
- [TT-Train] Add bias=false in LinearLayer
- PR: #16707
- TT-Fabric Bringup Initial Check-in
- PR: #16343
- #0: Sanitize writes to mailbox on ethernet cores.
- PR: #16574
- Add Llama11B-N300 and Llama70B-TG (TP=32) to LLM table in README.md
- PR: #16724
- [skip ci] Update llms.md
- PR: #16737
- Update test_slice.py
- PR: #16734
- #16625: Refactor tracking of sub-device managers from Device to a new class
- PR: #16683
- Update code-analysis.yaml
- PR: #16738
- [skip ci] Update llms.md
- PR: #16745
- remove references to LFS
- PR: #16722
- Fixes for conversion to row major for 0D and 0-volume tensors
- PR: #16736
- #0: Disable BH tools test at workflow level
- PR: #16749
- Removing some usages of LegacyShape, improve Tensor::to_string
- PR: #16711
- [skip ci] Fix lint on a doc
- PR: #16751
- #0: API Unification for Device and MeshDevice
- PR: #16570
- Port ttnn::random and uniform from LegacyShape to SimpleShape
- PR: #16744
- #16379: make softmax call moreh_softmax if rank above 4
- PR: #16735
- #7126: remove skip for test_sd_matmul test
- PR: #16729
- #0: Make device an optional parameter in the tensor distribution API
- PR: #16746
- Added build-wheels to fast-dispatch-build-and-unit-tests-wrapper.yaml
- PR: #16638
- Adding CCL Async test cases to TG nightly and bug fix
- PR: #16700
- #11119: Move op_profiler.hpp under the ttnn folder
- PR: #11167
- #15979: Switch to google benchmark for pgm dispatch tests
- PR: #16547
- [tt-train] Add weight tying option for NanoGPT demo
- PR: #16768
- #0: Fix build of test_pgm_dispatch
- PR: #16773
- [tt-train] Update serialization of tensor for DDP
- PR: #16778
- #0: Fix failing TG regression tests
- PR: #16776
- [skip ci] Update llms.md
- PR: #16775
- Add tiled interleaved permute for when width dimension doesn't move (row-major tiled invariant)
- PR: #16671
- Add Fabric Router Config to Hal
- PR: #16761
- [skip ci] Update llms.md
- PR: #16791
- Reflect ARCH_NAME Changes in CI Workflows
- PR: #16706
- [skip ci] Update llms.md
- PR: #16792
- #0: Migrate pytensor to use from_vector Tensor creation APIs
- PR: #16767
- Afuller/metalium api reorg
- PR: #16578
- Ngrujic/sweep tests 3
- PR: #16316
- #0: Enable nlp create heads tests on BH
- PR: #16777
- Fix to_layout shard bug
- PR: #16754
- Fix broken link to host API
- PR: #16799
- Add noc flag to test stress noc mcast
- PR: #16772
- Set codeowners for transformer ttnn ops
- PR: #16803
- #15450: Remove default value for ocb argument in LLK compute API
- PR: #16376
- Linking tensor.reshape to ttnn.reshape
- PR: #16377
- #16646: Fix dangling reference in sharded tensor args
- PR: #16782
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Transpose and Reduce
- PR: #16427
- Add new python api to get architecture name
- PR: #16747
- Remove base.hpp
- PR: #16796
- [tt-train] Change weights initialization for GPT-2
- PR: #16815
- [skip ci] Update llms.md
- PR: #16828
- fuse residual add with layernorm
- PR: #16794
- [TT-Train] Add multidevice support to dropout
- PR: #16823
- #16171: Preload kernels before receiving go message
- PR: #16680
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Test Kernels
- PR: #16613
- #16366: Changed kernel config to HiFi4 for 32F matmul
- PR: #16743
- Add nightly APC run in debug mode
- PR: #16831
- [skip ci] Update llms.md
- PR: #16835
- [skip ci] Update llms.md
- PR: #16839
- Remove some ARCH_NAME ENV usage at runtime
- PR: #16825
- Move out tensor storage into a separate .hpp/.cpp
- PR: #16832
- #16460: Add more helpful error message when tt-topology needs to be run
- PR: #16783
- Make creation functions use SimpleShape, expose SimpleShape to Python
- PR: #16826
- #16242: Initial implementation of MeshBuffer
- PR: #16327
- Enable use-override check
- PR: #16842
- Privatize Taskflow
- PR: #16838
- Fix test_new_all_gather.py regressions caused by API unification between Device/MeshDevice
- PR: #16836
- Fix CB allocation warnings from ttnn.reshard
- PR: #16795
- Optimize upsample for bilinear mode
- PR: #16487
- Remove Shape usage from MultiDeviceStorage
- PR: #16841
- Remove redundant bank offset from destination address in ttnn.reshard
- PR: #16800
- Add option to raise error on failed local/global tensor comparison
- PR: #16585
- Padded Shards for Concat Support
- PR: #16765
- #0: Add support for tracing some sub-devices while others are still running programs
- PR: #16810
- #16769: bring up all reduce async as a composite op and added llama shape ccl test sweep
- PR: #16784
- #0: Lower Size to metalium as Shape2D
- PR: #16814
- #15976: Ensure reports insert all devices into the devices table
- PR: #16834
- Modify UNet Shallow to return output in CHW channel ordering
- PR: #16742
- #16758: Optimize usage and implementation of encode/decode tensor data
- PR: #16759
- Device to Device profiler sync
- PR: #16543
- Templating and Queue Size Adjustments for Packet Queue
- PR: #16732
- Refactor Superset model benchmarking tools to use Pydantic classes and save one json
- PR: #16790
- #16078: Fix back-to-back calls of ttnn.close_device()
- PR: #16840
- #16434: DPRINT to read buffer once
- PR: #16586
- Bring Taskflow from CPM
- PR: #16843
- This file seems to be kernel-only
- PR: #16853
- Minor SDPA optimizations
- PR: #16566
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Eltwise Unary
- PR: #16527
- Fix scaling issue with RT arguments in tilize/untilize with padding
- PR: #16690
- Make stress noc mcast test respect physical coordinates + allow option to skip mcaster
- PR: #16833
- Fix some shapes for Prefetcher + Matmul, Use Multi-device Global CB
- PR: #16764
- Do not build UMD tests
- PR: #16877
- Move risc_attribs back to hw/inc
- PR: #16867
- Re-enable UNet Shallow trace+2CQ test case
- PR: #16875
- Upgrade error message in control plane
- PR: #16863
- #15824 Workaround LLK issue in max_pool
- PR: #16849
- [skip ci] Fixed TG configuration description in documentation
- PR: #16884
- #0: Update pgm_dispatch_golden.json
- PR: #16818
- #0: fix stackoverflow in eth tun
- PR: #16889
- #0: Refactor enqueue_write_buffer
- PR: #16880
- #0: Add skip for mnist tests because I can't take this anymore
- PR: #16891
- #0: Remove SetLazyCommandQueueMode from Metal API
- PR: #16886
- #16868: Update profiler post proc asserts tripping due to kernel preload
- PR: #16872
- #16350: Update reciprocal docs
- PR: #16371
- [skip ci] : Update INSTALLING.md
- PR: #16893
- Remove sharded_to_interleaved workaround in UNet Shallow
- PR: #16770
- Add CI job for running models in comparison mode
- PR: #16808
- pybind expose MeshDevice::reshape
- PR: #16798
- #0: Update sweeps README
- PR: #16902
- Workaround issue #16895, fix PCC checking for wormhole in Resnet50 demo
- PR: #16896
v0.55.0-rc19
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, rather than the documentation on the main branch. There may be differences between the latest main and the previous release.
The changelog follows, showing the changes since the last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12980411497
📦 Uncategorized
- Remove ARCH_NAME from host library code
- PR: #16616
- #16367: Added support to enable dram and l1 memory collection without saving to disk
- PR: #16368
- Update .clang-format-ignore
- PR: #16681
- Tweak BH csrrs init code
- PR: #16682
- #0: Clean up confusing refs to Greyskull from ttnn.copy error messages.
- PR: #16647
- Update perf and latest features for llm models (Jan 13)
- PR: #16677
- Update README.md
- PR: #16702
- #16657: Fix to_layout conversion into row major for 1D tensors
- PR: #16684
- Tilize with val padding results in L1 cache OOM
- PR: #16633
- #0: Fixes from commit ae61802
- PR: #16686
- #0: Skip build-docker-image during post-commit code-analysis since the docker image is already built in a previous job
- PR: #16703
- Generate test executables per architecture
- PR: #16594
- #16587: Update UMD submodule commit for P150 compatibility
- PR: #16709
- Replace some instances of Tensor::get_shape with get_logical_shape
- PR: #16655
- Update METALIUM_GUIDE.md
- PR: #16602
- #16621: Add barriers at end of cq_dispatch_slave.cpp on IERISC
- PR: #16666
- Finish porting OPs to compute_output_specs
- PR: #16695
- ScopedGraphCapture
- PR: #15774
- #15756 Pull in BH LLK fix for maxpool hang
- PR: #16663
- #15246: Add sweep tests for logical_and, logical_or, logical_xor
- PR: #16132
- #0: (MINOR) Bump to v0.55.0
- PR: #16714
- #11512: Add sweeps for eltwise sharded ops 3
- PR: #16307
- Add sweeps for unary, unary_sharded and binary_sharded versions of ops: fmod, remainder, maximum, minimum.
- PR: #15911
- Don't leak tt_cluster.hpp through kernel_types.hpp
- PR: #16691
- #6983: Re-enable skipped TT-NN unit test
- PR: #16642
- #15450: Remove default values from circular buffer parameters in LLK compute APIs
- PR: #16389
- update build flag on programming examples docs
- PR: #16635
- Fix for P100 board type
- PR: #16718
- Sever TT-Train's dependency on TT-Metalium's tests
- PR: #16685
- [TT-Train] Update generate of LLM
- PR: #16723
- [TT-Train] Add bias=false in LinearLayer
- PR: #16707
- TT-Fabric Bringup Initial Check-in
- PR: #16343
- #0: Sanitize writes to mailbox on ethernet cores.
- PR: #16574
- Add Llama11B-N300 and Llama70B-TG (TP=32) to LLM table in README.md
- PR: #16724
- [skip ci] Update llms.md
- PR: #16737
- Update test_slice.py
- PR: #16734
- #16625: Refactor tracking of sub-device managers from Device to a new class
- PR: #16683
- Update code-analysis.yaml
- PR: #16738
- [skip ci] Update llms.md
- PR: #16745
- remove references to LFS
- PR: #16722
- Fixes for conversion to row major for 0D and 0-volume tensors
- PR: #16736
- #0: Disable BH tools test at workflow level
- PR: #16749
- Removing some usages of LegacyShape, improve Tensor::to_string
- PR: #16711
- [skip ci] Fix lint on a doc
- PR: #16751
- #0: API Unification for Device and MeshDevice
- PR: #16570
- Port ttnn::random and uniform from LegacyShape to SimpleShape
- PR: #16744
- #16379: make softmax call moreh_softmax if rank above 4
- PR: #16735
- #7126: remove skip for test_sd_matmul test
- PR: #16729
- #0: Make device an optional parameter in the tensor distribution API
- PR: #16746
- Added build-wheels to fast-dispatch-build-and-unit-tests-wrapper.yaml
- PR: #16638
- Adding CCL Async test cases to TG nightly and bug fix
- PR: #16700
- #11119: Move op_profiler.hpp under the ttnn folder
- PR: #11167
- #15979: Switch to google benchmark for pgm dispatch tests
- PR: #16547
- [tt-train] Add weight tying option for NanoGPT demo
- PR: #16768
- #0: Fix build of test_pgm_dispatch
- PR: #16773
- [tt-train] Update serialization of tensor for DDP
- PR: #16778
- #0: Fix failing TG regression tests
- PR: #16776
- [skip ci] Update llms.md
- PR: #16775
- Add tiled interleaved permute for when width dimension doesn't move (row-major tiled invariant)
- PR: #16671
- Add Fabric Router Config to Hal
- PR: #16761
- [skip ci] Update llms.md
- PR: #16791
- Reflect ARCH_NAME Changes in CI Workflows
- PR: #16706
- [skip ci] Update llms.md
- PR: #16792
- #0: Migrate pytensor to use from_vector Tensor creation APIs
- PR: #16767
- Afuller/metalium api reorg
- PR: #16578
- Ngrujic/sweep tests 3
- PR: #16316
- #0: Enable nlp create heads tests on BH
- PR: #16777
- Fix to_layout shard bug
- PR: #16754
- Fix broken link to host API
- PR: #16799
- Add noc flag to test stress noc mcast
- PR: #16772
- Set codeowners for transformer ttnn ops
- PR: #16803
- #15450: Remove default value for ocb argument in LLK compute API
- PR: #16376
- Linking tensor.reshape to ttnn.reshape
- PR: #16377
- #16646: Fix dangling reference in sharded tensor args
- PR: #16782
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Transpose and Reduce
- PR: #16427
- Add new python api to get architecture name
- PR: #16747
- Remove base.hpp
- PR: #16796
- [tt-train] Change weights initialization for GPT-2
- PR: #16815
- [skip ci] Update llms.md
- PR: #16828
- fuse residual add with layernorm
- PR: #16794
- [TT-Train] Add multidevice support to dropout
- PR: #16823
- #16171: Preload kernels before receiving go message
- PR: #16680
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Test Kernels
- PR: #16613
- #16366: Changed kernel config to HiFi4 for 32F matmul
- PR: #16743
- Add nightly APC run in debug mode
- PR: #16831
- [skip ci] Update llms.md
- PR: #16835
- [skip ci] Update llms.md
- PR: #16839
- Remove some ARCH_NAME ENV usage at runtime
- PR: #16825
- Move out tensor storage into a separate .hpp/.cpp
- PR: #16832
- #16460: Add more helpful error message when tt-topology needs to be run
- PR: #16783
- Make creation functions use SimpleShape, expose SimpleShape to Python
- PR: #16826
- #16242: Initial implementation of MeshBuffer
- PR: #16327
- Enable use-override check
- PR: #16842
- Privatize Taskflow
- PR: #16838
- Fix test_new_all_gather.py regressions caused by API unification between Device/MeshDevice
- PR: #16836
- Fix CB allocation warnings from ttnn.reshard
- PR: #16795
- Optimize upsample for bilinear mode
- PR: #16487
- Remove Shape usage from MultiDeviceStorage
- PR: #16841
- Remove redundant bank offset from destination address in ttnn.reshard
- PR: #16800
- Add option to raise error on failed local/global tensor comparison
- PR: #16585
- Padded Shards for Concat Support
- PR: #16765
- #0: Add support for tracing some sub-devices while others are still running programs
- PR: #16810
- #16769: bring up all reduce async as a composite op and added llama shape ccl test sweep
- PR: #16784
- #0: Lower Size to metalium as Shape2D
- PR: #16814
- #15976: Ensure reports insert all devices into the devices table
- PR: #16834
- Modify UNet Shallow to return output in CHW channel ordering
- PR: #16742
- #16758: Optimize usage and implementation of encode/decode tensor data
- PR: #16759
- Device to Device profiler sync
- PR: #16543
- Templating and Queue Size Adjustments for Packet Queue
- PR: #16732
- Refactor Superset model benchmarking tools to use Pydantic classes and save one json
- PR: #16790
- #16078: Fix back-to-back calls of ttnn.close_device()
- PR: #16840
- #16434: DPRINT to read buffer once
- PR: #16586
- Bring Taskflow from CPM
- PR: #16843
- This file seems to be kernel-only
- PR: #16853
- Minor SDPA optimizations
- PR: #16566
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Eltwise Unary
- PR: #16527
- Fix scaling issue with RT arguments in tilize/untilize with padding
- PR: #16690
- Make stress noc mcast test respect physical coordinates + allow option to skip mcaster
- PR: #16833
- Fix some shapes for Prefetcher + Matmul, Use Multi-device Global CB
- PR: #16764
- Do not build UMD tests
- PR: #16877
- Move risc_attribs back to hw/inc
- PR: #16867
- Re-enable UNet Shallow trace+2CQ test case
- PR: #16875
- Upgrade error message in control plane
- PR: #16863
- #15824 Workaround LLK issue in max_pool
- PR: #16849
- [skip ci] Fixed TG configuration description in documentation
- PR: #16884
- #0: Update pgm_dispatch_golden.json
- PR: #16818
- #0: fix stackoverflow in eth tun
- PR: #16889
- #0: Refactor enqueue_write_buffer
- PR: #16880
- #0: Add skip for mnist tests because I can't take this anymore
- PR: #16891
- #0: Remove SetLazyCommandQueueMode from Metal API
- PR: #16886
- #16868: Update profiler post proc asserts tripping due to kernel preload
- PR: #16872
- #16350: Update reciprocal docs
- PR: #16371
- [skip ci] : Update INSTALLING.md
- PR: #16893
- Remove sharded_to_interleaved workaround in UNet Shallow
- PR: #16770
- Add CI job for running models in comparison mode
- PR: #16808
- pybind expose MeshDevice::reshape
- PR: #16798
- #0: Update sweeps README
- PR: #16902
- Workaround issue #16895, fix PCC checking for wormhole in Resnet50 demo
- PR: #16896
v0.55.0-rc18
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, rather than the documentation on the main branch. There may be differences between the latest main and the previous release.
The changelog follows, showing the changes since the last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12960537424
📦 Uncategorized
- Remove ARCH_NAME from host library code
- PR: #16616
- #12253: Implement Batch norm operation for inference mode
- PR: #16432
- #16443: Add a programming example of vecadd_multi_core and gtest
- PR: #16446
- Enable to/from torch tests for 0D/1D tensors
- PR: #16653
- Port all data movements ops to compute_output_specs
- PR: #16652
- #15246: Add sweep tests for addcdiv, addcmul, rdiv, rsub, ceil
- PR: #15998
- Fix build break
- PR: #16656
- Logical sharding for input tensor and halo output
- PR: #16517
- #16495: reduce grid for falcon7b mlp matmul
- PR: #16569
- Stress NOC mcast test
- PR: #16639
- [skip ci] Update subdevice doc
- PR: #16669
- Read from and write to partial buffer regions for interleaved buffers where offset and size of specified buffer region are divisible by buffer page size
- PR: #16102
- Fix resnet large on GS
- PR: #16665
- Fix Pre-allgather Layernorm bad PCC when use 1D reduction
- PR: #16622
- #16353: skip no volume tensors
- PR: #16619
- Create README.md
- PR: #16675
- Update README.md
- PR: #16676
- #16367: Added support to enable dram and l1 memory collection without saving to disk
- PR: #16368
- Update .clang-format-ignore
- PR: #16681
- Tweak BH csrrs init code
- PR: #16682
- #0: Clean up confusing refs to Greyskull from ttnn.copy error messages.
- PR: #16647
- Update perf and latest features for llm models (Jan 13)
- PR: #16677
- Update README.md
- PR: #16702
- #16657: Fix to_layout conversion into row major for 1D tensors
- PR: #16684
- Tilize with val padding results in L1 cache OOM
- PR: #16633
- #0: Fixes from commit ae61802
- PR: #16686
- #0: Skip build-docker-image during post-commit code-analysis since the docker image is already built in a previous job
- PR: #16703
- Generate test executables per architecture
- PR: #16594
- #16587: Update UMD submodule commit for P150 compatibility
- PR: #16709
- Replace some instances of Tensor::get_shape with get_logical_shape
- PR: #16655
- Update METALIUM_GUIDE.md
- PR: #16602
- #16621: Add barriers at end of cq_dispatch_slave.cpp on IERISC
- PR: #16666
- Finish porting OPs to compute_output_specs
- PR: #16695
- ScopedGraphCapture
- PR: #15774
- #15756 Pull in BH LLK fix for maxpool hang
- PR: #16663
- #15246: Add sweep tests for logical_and, logical_or, logical_xor
- PR: #16132
- #0: (MINOR) Bump to v0.55.0
- PR: #16714
- #11512: Add sweeps for eltwise sharded ops 3
- PR: #16307
- Add sweeps for unary, unary_sharded and binary_sharded versions of ops: fmod, remainder, maximum, minimum.
- PR: #15911
- Don't leak tt_cluster.hpp through kernel_types.hpp
- PR: #16691
- #6983: Re-enable skipped TT-NN unit test
- PR: #16642
- #15450: Remove default values from circular buffer parameters in LLK compute APIs
- PR: #16389
- update build flag on programming examples docs
- PR: #16635
- Fix for P100 board type
- PR: #16718
- Sever TT-Train's dependency on TT-Metalium's tests
- PR: #16685
- [TT-Train] Update generate of LLM
- PR: #16723
- [TT-Train] Add bias=false in LinearLayer
- PR: #16707
- TT-Fabric Bringup Initial Check-in
- PR: #16343
- #0: Sanitize writes to mailbox on ethernet cores.
- PR: #16574
- Add Llama11B-N300 and Llama70B-TG (TP=32) to LLM table in README.md
- PR: #16724
- [skip ci] Update llms.md
- PR: #16737
- Update test_slice.py
- PR: #16734
- #16625: Refactor tracking of sub-device managers from Device to a new class
- PR: #16683
- Update code-analysis.yaml
- PR: #16738
- [skip ci] Update llms.md
- PR: #16745
- remove references to LFS
- PR: #16722
- Fixes for conversion to row major for 0D and 0-volume tensors
- PR: #16736
- #0: Disable BH tools test at workflow level
- PR: #16749
- Removing some usages of LegacyShape, improve Tensor::to_string
- PR: #16711
- [skip ci] Fix lint on a doc
- PR: #16751
- #0: API Unification for Device and MeshDevice
- PR: #16570
- Port ttnn::random and uniform from LegacyShape to SimpleShape
- PR: #16744
- #16379: make softmax call moreh_softmax if rank above 4
- PR: #16735
- #7126: remove skip for test_sd_matmul test
- PR: #16729
- #0: Make device an optional parameter in the tensor distribution API
- PR: #16746
- Added build-wheels to fast-dispatch-build-and-unit-tests-wrapper.yaml
- PR: #16638
- Adding CCL Async test cases to TG nightly and bug fix
- PR: #16700
- #11119: Move op_profiler.hpp under the ttnn folder
- PR: #11167
- #15979: Switch to google benchmark for pgm dispatch tests
- PR: #16547
- [tt-train] Add weight tying option for NanoGPT demo
- PR: #16768
- #0: Fix build of test_pgm_dispatch
- PR: #16773
- [tt-train] Update serialization of tensor for DDP
- PR: #16778
- #0: Fix failing TG regression tests
- PR: #16776
- [skip ci] Update llms.md
- PR: #16775
- Add tiled interleaved permute for when width dimension doesn't move (row-major tiled invariant)
- PR: #16671
- Add Fabric Router Config to Hal
- PR: #16761
- [skip ci] Update llms.md
- PR: #16791
- Reflect ARCH_NAME Changes in CI Workflows
- PR: #16706
- [skip ci] Update llms.md
- PR: #16792
- #0: Migrate pytensor to use from_vector Tensor creation APIs
- PR: #16767
- Afuller/metalium api reorg
- PR: #16578
- Ngrujic/sweep tests 3
- PR: #16316
- #0: Enable nlp create heads tests on BH
- PR: #16777
- Fix to_layout shard bug
- PR: #16754
- Fix broken link to host API
- PR: #16799
- Add noc flag to test stress noc mcast
- PR: #16772
- Set codeowners for transformer ttnn ops
- PR: #16803
- #15450: Remove default value for ocb argument in LLK compute API
- PR: #16376
- Linking tensor.reshape to ttnn.reshape
- PR: #16377
- #16646: Fix dangling reference in sharded tensor args
- PR: #16782
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Transpose and Reduce
- PR: #16427
- Add new python api to get architecture name
- PR: #16747
- Remove base.hpp
- PR: #16796
- [tt-train] Change weights initialization for GPT-2
- PR: #16815
- [skip ci] Update llms.md
- PR: #16828
- fuse residual add with layernorm
- PR: #16794
- [TT-Train] Add multidevice support to dropout
- PR: #16823
- #16171: Preload kernels before receiving go message
- PR: #16680
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Test Kernels
- PR: #16613
- #16366: Changed kernel config to HiFi4 for 32F matmul
- PR: #16743
- Add nightly APC run in debug mode
- PR: #16831
- [skip ci] Update llms.md
- PR: #16835
- [skip ci] Update llms.md
- PR: #16839
- Remove some ARCH_NAME ENV usage at runtime
- PR: #16825
- Move out tensor storage into a separate .hpp/.cpp
- PR: #16832
- #16460: Add more helpful error message when tt-topology needs to be run
- PR: #16783
- Make creation functions use SimpleShape, expose SimpleShape to Python
- PR: #16826
- #16242: Initial implementation of MeshBuffer
- PR: #16327
- Enable use-override check
- PR: #16842
- Privatize Taskflow
- PR: #16838
- Fix test_new_all_gather.py regressions caused by API unification between Device/MeshDevice
- PR: #16836
- Fix CB allocation warnings from ttnn.reshard
- PR: #16795
- Optimize upsample for bilinear mode
- PR: #16487
- Remove Shape usage from MultiDeviceStorage
- PR: #16841
- Remove redundant bank offset from destination address in ttnn.reshard
- PR: #16800
- Add option to raise error on failed local/global tensor comparison
- PR: #16585
- Padded Shards for Concat Support
- PR: #16765
- #0: Add support for tracing some sub-devices while others are still running programs
- PR: #16810
- #16769: bring up all reduce async as a composite op and added llama shape ccl test sweep
- PR: #16784
- #0: Lower Size to metalium as Shape2D
- PR: #16814
- #15976: Ensure reports insert all devices into the devices table
- PR: #16834
- Modify UNet Shallow to return output in CHW channel ordering
- PR: #16742
- #16758: Optimize usage and implementation of encode/decode tensor data
- PR: #16759
- Device to Device profiler sync
- PR: #16543
- Templating and Queue Size Adjustments for Packet Queue
- PR: #16732
- Refactor Superset model benchmarking tools to use Pydantic classes and save one json
- PR: #16790
- #16078: Fix back-to-back calls of ttnn.close_device()
- PR: #16840
- #16434: DPRINT to read buffer once
- PR: #16586
- Bring Taskflow from CPM
- PR: #16843
- This file seems to be kernel-only
- PR: #16853
- Minor SDPA optimizations
- PR: #16566
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Eltwise Unary
- PR: #16527
- Fix scaling issue with RT arguments in tilize/untilize with padding
- PR: #16690
- Make stress noc mcast test respect physical coordinates + allow option to skip mcaster
- PR: #16833
- Fix some shapes for Prefetcher + Matmul, Use Multi-device Global CB
- PR: #16764
- Do not build UMD tests
- PR: #16877
- Move risc_attribs back to hw/inc
- PR: #16867
- Re-enable UNet Shallow trace+2CQ test case
- PR: #16875
- Upgrade error message in control plane
- PR: #16863
- #15824 Workaround LLK issue in max_pool
- PR: #16849
- [skip ci] Fixed TG configuration description in documentation
- PR: #16884
- #0: Update pgm_dispatch_golden.json
- PR: #16818
- #0: fix stackoverflow in eth tun
- PR: #16889
- #0: Refactor enqueue_write_buffer...
v0.55.0-rc15
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, rather than the documentation on the main branch. There may be differences between the latest main and the previous release.
The changelog follows, showing the changes since the last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12953022780
📦 Uncategorized
- Enable multi-core and fixing bfloat8 for untilize with unpadding
- PR: #16555
- Remove ARCH_NAME from host library code
- PR: #16616
- Support subcoregrids in concat_heads
- PR: #16223
- Build wheels in ttnn unit tests workflow because the tests need it and we forgot to put it in
- PR: #16605
- #16590: profiler trace detection fix
- PR: #16591
- #16503: Optimize CoreRangeSets for CBs and semaphores
- PR: #16549
- Revert "#16621: Add barriers at end of cq_dispatch_slave.cpp"
- PR: #16645
- Fix nightly stable diffusion tests
- PR: #16629
- #0: Used github team for conv files
- PR: #16563
- Sweeps: fixed abs, added acos and acosh sharded and non sharded
- PR: #16381
- fix reduce scatter multi-link support bug
- PR: #16636
- support i/p tensors of all dimensions/rank for prod operation
- PR: #16301
- Create Infrastructure to exactly calculate L1 Memory Usage for Conv2D #15088
- PR: #15455
- #12253: Implement Batch norm operation for inference mode
- PR: #16432
- Port all experimental ops to compute_output_specs
- PR: #16595
- #16443: Add a programming example of vecadd_multi_core and gtest
- PR: #16446
- Enable to/from torch tests for 0D/1D tensors
- PR: #16653
- Port all data movements ops to compute_output_specs
- PR: #16652
- #15246: Add sweep tests for addcdiv, addcmul, rdiv, rsub, ceil
- PR: #15998
- Fix build break
- PR: #16656
- Logical sharding for input tensor and halo output
- PR: #16517
- #16495: reduce grid for falcon7b mlp matmul
- PR: #16569
- Stress NOC mcast test
- PR: #16639
- [skip ci] Update subdevice doc
- PR: #16669
- Read from and write to partial buffer regions for interleaved buffers where offset and size of specified buffer region are divisible by buffer page size
- PR: #16102
- Fix resnet large on GS
- PR: #16665
- Fix Pre-allgather Layernorm bad PCC when use 1D reduction
- PR: #16622
- #16353: skip no volume tensors
- PR: #16619
- Create README.md
- PR: #16675
- Update README.md
- PR: #16676
- #16367: Added support to enable dram and l1 memory collection without saving to disk
- PR: #16368
- Update .clang-format-ignore
- PR: #16681
- Tweak BH csrrs init code
- PR: #16682
- #0: Clean up confusing refs to Greyskull from ttnn.copy error messages.
- PR: #16647
- Update perf and latest features for llm models (Jan 13)
- PR: #16677
- Update README.md
- PR: #16702
- #16657: Fix to_layout conversion into row major for 1D tensors
- PR: #16684
- Tilize with val padding results in L1 cache OOM
- PR: #16633
- #0: Fixes from commit ae61802
- PR: #16686
- #0: Skip build-docker-image during post-commit code-analysis since the docker image is already built in a previous job
- PR: #16703
- Generate test executables per architecture
- PR: #16594
- #16587: Update UMD submodule commit for P150 compatibility
- PR: #16709
- Replace some instances of Tensor::get_shape with get_logical_shape
- PR: #16655
- Update METALIUM_GUIDE.md
- PR: #16602
- #16621: Add barriers at end of cq_dispatch_slave.cpp on IERISC
- PR: #16666
- Finish porting OPs to compute_output_specs
- PR: #16695
- ScopedGraphCapture
- PR: #15774
- #15756 Pull in BH LLK fix for maxpool hang
- PR: #16663
- #15246: Add sweep tests for logical_and, logical_or, logical_xor
- PR: #16132
- #0: (MINOR) Bump to v0.55.0
- PR: #16714
- #11512: Add sweeps for eltwise sharded ops 3
- PR: #16307
- Add sweeps for unary, unary_sharded and binary_sharded versions of ops: fmod, remainder, maximum, minimum.
- PR: #15911
- Don't leak tt_cluster.hpp through kernel_types.hpp
- PR: #16691
- #6983: Re-enable skipped TT-NN unit test
- PR: #16642
- #15450: Remove default values from circular buffer parameters in LLK compute APIs
- PR: #16389
- update build flag on programming examples docs
- PR: #16635
- Fix for P100 board type
- PR: #16718
- Sever TT-Train's dependency on TT-Metalium's tests
- PR: #16685
- [TT-Train] Update generate of LLM
- PR: #16723
- [TT-Train] Add bias=false in LinearLayer
- PR: #16707
- TT-Fabric Bringup Initial Check-in
- PR: #16343
- #0: Sanitize writes to mailbox on ethernet cores.
- PR: #16574
- Add Llama11B-N300 and Llama70B-TG (TP=32) to LLM table in README.md
- PR: #16724
- [skip ci] Update llms.md
- PR: #16737
- Update test_slice.py
- PR: #16734
- #16625: Refactor tracking of sub-device managers from Device to a new class
- PR: #16683
- Update code-analysis.yaml
- PR: #16738
- [skip ci] Update llms.md
- PR: #16745
- remove references to LFS
- PR: #16722
- Fixes for conversion to row major for 0D and 0-volume tensors
- PR: #16736
- #0: Disable BH tools test at workflow level
- PR: #16749
- Removing some usages of LegacyShape, improve Tensor::to_string
- PR: #16711
- [skip ci] Fix lint on a doc
- PR: #16751
- #0: API Unification for Device and MeshDevice
- PR: #16570
- Port ttnn::random and uniform from LegacyShape to SimpleShape
- PR: #16744
- #16379: make softmax call moreh_softmax if rank above 4
- PR: #16735
- #7126: remove skip for test_sd_matmul test
- PR: #16729
- #0: Make device an optional parameter in the tensor distribution API
- PR: #16746
- Added build-wheels to fast-dispatch-build-and-unit-tests-wrapper.yaml
- PR: #16638
- Adding CCL Async test cases to TG nightly and bug fix
- PR: #16700
- #11119: Move op_profiler.hpp under the ttnn folder
- PR: #11167
- #15979: Switch to google benchmark for pgm dispatch tests
- PR: #16547
- [tt-train] Add weight tying option for NanoGPT demo
- PR: #16768
- #0: Fix build of test_pgm_dispatch
- PR: #16773
- [tt-train] Update serialization of tensor for DDP
- PR: #16778
- #0: Fix failing TG regression tests
- PR: #16776
- [skip ci] Update llms.md
- PR: #16775
- Add tiled interleaved permute for when width dimension doesn't move (row-major tiled invariant)
- PR: #16671
- Add Fabric Router Config to Hal
- PR: #16761
- [skip ci] Update llms.md
- PR: #16791
- Reflect ARCH_NAME Changes in CI Workflows
- PR: #16706
- [skip ci] Update llms.md
- PR: #16792
- #0: Migrate pytensor to use from_vector Tensor creation APIs
- PR: #16767
- Afuller/metalium api reorg
- PR: #16578
- Ngrujic/sweep tests 3
- PR: #16316
- #0: Enable nlp create heads tests on BH
- PR: #16777
- Fix to_layout shard bug
- PR: #16754
- Fix broken link to host API
- PR: #16799
- Add noc flag to test stress noc mcast
- PR: #16772
- Set codeowners for transformer ttnn ops
- PR: #16803
- #15450: Remove default value for ocb argument in LLK compute API
- PR: #16376
- Linking tensor.reshape to ttnn.reshape (see the sketch after this list)
- PR: #16377
- #16646: Fix dangling reference in sharded tensor args
- PR: #16782
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Transpose and Reduce
- PR: #16427
- Add new python api to get architecture name (see the sketch after this list)
- PR: #16747
- Remove base.hpp
- PR: #16796
- [tt-train] Change weights initialization for GPT-2
- PR: #16815
- [skip ci] Update llms.md
- PR: #16828
- fuse residual add with layernorm
- PR: #16794
- [TT-Train] Add multidevice support to dropout
- PR: #16823
- #16171: Preload kernels before receiving go message
- PR: #16680
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Test Kernels
- PR: #16613
- #16366: Changed kernel config to HiFi4 for 32F matmul
- PR: #16743
- Add nightly APC run in debug mode
- PR: #16831
- [skip ci] Update llms.md
- PR: #16835
- [skip ci] Update llms.md
- PR: #16839
- Remove some ARCH_NAME ENV usage at runtime
- PR: #16825
- Move out tensor storage into a separate .hpp/.cpp
- PR: #16832
- #16460: Add more helpful error message when tt-topology needs to be run
- PR: #16783
- Make creation functions use SimpleShape, expose SimpleShape to Python
- PR: #16826
- #16242: Initial implementation of MeshBuffer
- PR: #16327
- Enable use-override check
- PR: #16842
- Privatize Taskflow
- PR: #16838
- Fix test_new_all_gather.py regressions caused by API unification between Device/MeshDevice
- PR: #16836
- Fix CB allocation warnings from ttnn.reshard
- PR: #16795
- Optimize upsample for bilinear mode
- PR: #16487
- Remove Shape usage from MultiDeviceStorage
- PR: #16841
- Remove redundant bank offset from destination address in ttnn.reshard
- PR: #16800
- Add option to raise error on failed local/global tensor comparison
- PR: #16585
- Padded Shards for Concat Support
- PR: #16765
- #0: Add support for tracing some sub-devices while others are still running programs
- PR: #16810
- #16769: bring up all reduce async as a composite op and added llama shape ccl test sweep
- PR: #16784
- #0: Lower Size to metalium as Shape2D
- PR: #16814
- #15976: Ensure reports insert all devices into the devices table
- PR: #16834
- Modify UNet Shallow to return output in CHW channel ordering
- PR: #16742
- #16758: Optimize usage and implementation of encode/decode tensor data
- PR: #16759
- Device to Device profiler sync
- PR: #16543
- Templating and Queue Size Adjustments for Packet Queue
- PR: #16732
- Refactor Superset model benchmarking tools to use Pydantic classes and save one json
- PR: #16790
- #16078: Fix back-to-back calls of ttnn.close_device()
- PR: #16840
- #16434: DPRINT to read buffer once
- PR: #16586
- Bring Taskflow from CPM
- PR: #16843
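For the partial interleaved-buffer read/write entry above (PR #16102), the stated constraint is that both the region offset and the region size must be whole multiples of the buffer page size. A minimal sketch of that divisibility check; the 2 KiB page size is illustrative, not a requirement of the API:

```python
PAGE_SIZE = 2048  # illustrative; the real page size comes from the buffer config

def region_is_page_aligned(offset: int, size: int, page_size: int = PAGE_SIZE) -> bool:
    """A partial region of an interleaved buffer is addressable only when
    both its offset and its size are divisible by the page size."""
    return offset % page_size == 0 and size % page_size == 0

assert region_is_page_aligned(4096, 8192)        # pages 2..5: OK
assert not region_is_page_aligned(1000, 2048)    # offset falls mid-page
```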
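For the #16657 to_layout entry above: the fix concerns 1D tensors converting back to row-major layout. A hedged sketch of the round trip it repairs, assuming host-side conversion with default dtypes:

```python
import torch
import ttnn

host = torch.rand(32)                                      # 1D tensor
tiled = ttnn.from_torch(host, layout=ttnn.TILE_LAYOUT)     # tilized on host
row_major = ttnn.to_layout(tiled, ttnn.ROW_MAJOR_LAYOUT)   # previously broke for rank-1 inputs
assert torch.allclose(ttnn.to_torch(row_major), host)
```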
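For the #16379 softmax entry above: ttnn.softmax now falls back to moreh_softmax when the input rank exceeds 4 rather than failing. A sketch with a rank-5 input, assuming standard device plumbing:

```python
import torch
import ttnn

device = ttnn.open_device(device_id=0)
x = ttnn.from_torch(
    torch.rand(2, 2, 2, 32, 32),      # rank 5: above the previous limit of 4
    dtype=ttnn.bfloat16,
    layout=ttnn.TILE_LAYOUT,
    device=device,
)
y = ttnn.softmax(x, dim=-1)           # internally routed to moreh_softmax for rank > 4
ttnn.close_device(device)
```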
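For the "Linking tensor.reshape to ttnn.reshape" entry above: the Tensor method and the free function now share one implementation. A sketch of the equivalence (host tensor; shapes illustrative, and the exact accepted argument forms may vary):

```python
import torch
import ttnn

t = ttnn.from_torch(torch.rand(1, 1, 32, 32))
a = ttnn.reshape(t, (1, 1, 1024, 1))   # free-function form
b = t.reshape((1, 1, 1024, 1))         # method form, now linked to ttnn.reshape
assert a.shape == b.shape
```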
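For the "new python api to get architecture name" entry above (PR #16747): assuming the helper is exposed as ttnn.get_arch_name(), which is our reading of the PR rather than a documented signature:

```python
import ttnn

# Assumed entry point from PR #16747; expected to return a string such as
# "wormhole_b0" for the detected device architecture.
print(ttnn.get_arch_name())
```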
v0.55.0-rc14
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, not on the main branch. There may be differences between the latest main and the previous release.
The changelog will now follow, showing the changes from last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12941480379
📦 Uncategorized
- Remove halo from shard spec
- PR: #15900
- Enable multi-core and fixing bfloat8 for untilize with unpadding
- PR: #16555
- Composite binary sweeps: gcd and lcm
- PR: #16423
- Remove ARCH_NAME from host library code
- PR: #16616
- [tt-train] Add nanogpt ddp mode
- PR: #16614
- #16312: Fix full op to query physical shape for buffer volume
- PR: #16562
- #16366: Changed default kernel_config_val for 32bit matmul
- PR: #16567
- #16621: Add barriers at end of cq_dispatch_slave.cpp
- PR: #16624
- Build wheels in models unit tests workflow
- PR: #16615
- Mo/10234 eth dispatch profiling
- PR: #15609
- Support subcoregrids in concat_heads
- PR: #16223
- Build wheels in ttnn unit tests workflow because the tests need it and we forgot to put it in
- PR: #16605
- #16590: profiler trace detection fix
- PR: #16591
- #16503: Optimize CoreRangeSets for CBs and semaphores
- PR: #16549
- Revert "#16621: Add barriers at end of cq_dispatch_slave.cpp"
- PR: #16645
- Fix nightly stable diffusion tests
- PR: #16629
- #0: Used github team for conv files
- PR: #16563
- Sweeps: fixed abs, added acos and acosh sharded and non sharded
- PR: #16381
- fix reduce scatter multi-link support bug
- PR: #16636
- support input tensors of all dimensions/ranks for the prod operation
- PR: #16301
- Create Infrastructure to exactly calculate L1 Memory Usage for Conv2D #15088
- PR: #15455
- #12253: Implement Batch norm operation for inference mode (see the sketch after this list)
- PR: #16432
- Port all experimental ops to compute_output_specs
- PR: #16595
- #16443: Add a programming example of vecadd_multi_core and gtest
- PR: #16446
- Enable to/from torch tests for 0D/1D tensors (see the sketch after this list)
- PR: #16653
- Port all data movements ops to compute_output_specs
- PR: #16652
- #15246: Add sweep tests for addcdiv, addcmul, rdiv, rsub, ceil
- PR: #15998
- Fix build break
- PR: #16656
- Logical sharding for input tensor and halo output
- PR: #16517
- #16495: reduce grid for falcon7b mlp matmul
- PR: #16569
- Stress NOC mcast test
- PR: #16639
- [skip ci] Update subdevice doc
- PR: #16669
- Read from and write to partial buffer regions for interleaved buffers where the offset and size of the specified buffer region are divisible by the buffer page size
- PR: #16102
- Fix resnet large on GS
- PR: #16665
- Fix Pre-allgather Layernorm bad PCC when using 1D reduction
- PR: #16622
- #16353: skip no volume tensors
- PR: #16619
- Create README.md
- PR: #16675
- Update README.md
- PR: #16676
- #16367: Added support to enable dram and l1 memory collection without saving to disk
- PR: #16368
- Update .clang-format-ignore
- PR: #16681
- Tweak BH csrrs init code
- PR: #16682
- #0: Clean up confusing refs to Greyskull from ttnn.copy error messages.
- PR: #16647
- Update perf and latest features for llm models (Jan 13)
- PR: #16677
- Update README.md
- PR: #16702
- #16657: Fix to_layout conversion into row major for 1D tensors
- PR: #16684
- Tilize with val padding results in L1 cache OOM
- PR: #16633
- #0: Fixes from commit ae61802
- PR: #16686
- #0: Skip build-docker-image during post-commit code-analysis since the docker image is already built in a previous job
- PR: #16703
- Generate test executables per architecture
- PR: #16594
- #16587: Update UMD submodule commit for P150 compatibility
- PR: #16709
- Replace some instances of Tensor::get_shape with get_logical_shape
- PR: #16655
- Update METALIUM_GUIDE.md
- PR: #16602
- #16621: Add barriers at end of cq_dispatch_slave.cpp on IERISC
- PR: #16666
- Finish porting OPs to compute_output_specs
- PR: #16695
- ScopedGraphCapture
- PR: #15774
- #15756 Pull in BH LLK fix for maxpool hang
- PR: #16663
- #15246: Add sweep tests for logical_and, logical_or, logical_xor
- PR: #16132
- #0: (MINOR) Bump to v0.55.0
- PR: #16714
- #11512: Add sweeps for eltwise sharded ops 3
- PR: #16307
- Add sweeps for unary, unary_sharded and binary_sharded versions of ops: fmod, remainder, maximum, minimum.
- PR: #15911
- Don't leak tt_cluster.hpp through kernel_types.hpp
- PR: #16691
- #6983: Re-enable skipped TT-NN unit test
- PR: #16642
- #15450: Remove default values from circular buffer parameters in LLK compute APIs
- PR: #16389
- update build flag on programming examples docs
- PR: #16635
- Fix for P100 board type
- PR: #16718
- Sever TT-Train's dependency on TT-Metalium's tests
- PR: #16685
- [TT-Train] Update generate of LLM
- PR: #16723
- [TT-Train] Add bias=false in LinearLayer
- PR: #16707
- TT-Fabric Bringup Initial Check-in
- PR: #16343
- #0: Sanitize writes to mailbox on ethernet cores.
- PR: #16574
- Add Llama11B-N300 and Llama70B-TG (TP=32) to LLM table in README.md
- PR: #16724
- [skip ci] Update llms.md
- PR: #16737
- Update test_slice.py
- PR: #16734
- #16625: Refactor tracking of sub-device managers from Device to a new class
- PR: #16683
- Update code-analysis.yaml
- PR: #16738
- [skip ci] Update llms.md
- PR: #16745
- remove references to LFS
- PR: #16722
- Fixes for conversion to row major for 0D and 0-volume tensors
- PR: #16736
- #0: Disable BH tools test at workflow level
- PR: #16749
- Removing some usages of LegacyShape, improve Tensor::to_string
- PR: #16711
- [skip ci] Fix lint on a doc
- PR: #16751
- #0: API Unification for Device and MeshDevice
- PR: #16570
- Port ttnn::random and uniform from LegacyShape to SimpleShape
- PR: #16744
- #16379: make softmax call moreh_softmax if rank above 4
- PR: #16735
- #7126: remove skip for test_sd_matmul test
- PR: #16729
- #0: Make device an optional parameter in the tensor distribution API
- PR: #16746
- Added build-wheels to fast-dispatch-build-and-unit-tests-wrapper.yaml
- PR: #16638
- Adding CCL Async test cases to TG nightly and bug fix
- PR: #16700
- #11119: Move op_profiler.hpp under the ttnn folder
- PR: #11167
- #15979: Switch to google benchmark for pgm dispatch tests
- PR: #16547
- [tt-train] Add weight tying option for NanoGPT demo
- PR: #16768
- #0: Fix build of test_pgm_dispatch
- PR: #16773
- [tt-train] Update serialization of tensor for DDP
- PR: #16778
- #0: Fix failing TG regression tests
- PR: #16776
- [skip ci] Update llms.md
- PR: #16775
- Add tiled interleaved permute for when width dimension doesn't move (row-major tiled invariant)
- PR: #16671
- Add Fabric Router Config to Hal
- PR: #16761
- [skip ci] Update llms.md
- PR: #16791
- Reflect ARCH_NAME Changes in CI Workflows
- PR: #16706
- [skip ci] Update llms.md
- PR: #16792
- #0: Migrate pytensor to use from_vector Tensor creation APIs
- PR: #16767
- Afuller/metalium api reorg
- PR: #16578
- Ngrujic/sweep tests 3
- PR: #16316
- #0: Enable nlp create heads tests on BH
- PR: #16777
- Fix to_layout shard bug
- PR: #16754
- Fix broken link to host API
- PR: #16799
- Add noc flag to test stress noc mcast
- PR: #16772
- Set codeowners for transformer ttnn ops
- PR: #16803
- #15450: Remove default value for ocb argument in LLK compute API
- PR: #16376
- Linking tensor.reshape to ttnn.reshape
- PR: #16377
- #16646: Fix dangling reference in sharded tensor args
- PR: #16782
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Transpose and Reduce
- PR: #16427
- Add new python api to get architecture name
- PR: #16747
- Remove base.hpp
- PR: #16796
- [tt-train] Change weights initialization for GPT-2
- PR: #16815
- [skip ci] Update llms.md
- PR: #16828
- fuse residual add with layernorm
- PR: #16794
- [TT-Train] Add multidevice support to dropout
- PR: #16823
- #16171: Preload kernels before receiving go message
- PR: #16680
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Test Kernels
- PR: #16613
- #16366: Changed kernel config to HiFi4 for 32F matmul
- PR: #16743
- Add nightly APC run in debug mode
- PR: #16831
- [skip ci] Update llms.md
- PR: #16835
- [skip ci] Update llms.md
- PR: #16839
- Remove some ARCH_NAME ENV usage at runtime
- PR: #16825
- Move out tensor storage into a separate .hpp/.cpp
- PR: #16832
- #16460: Add more helpful error message when tt-topology needs to be run
- PR: #16783
- Make creation functions use SimpleShape, expose SimpleShape to Python
- PR: #16826
- #16242: Initial implementation of MeshBuffer
- PR: #16327
- Enable use-override check
- PR: #16842
- Privatize Taskflow
- PR: #16838
- Fix test_new_all_gather.py regressions caused by API unification between Device/MeshDevice
- PR: #16836
- Fix CB allocation warnings from ttnn.reshard
- PR: #16795
- Optimize upsample for bilinear mode
- PR: #16487
- Remove Shape usage from MultiDeviceStorage
- PR: #16841
- Remove redundant bank offset from destination address in ttnn.reshard
- PR: #16800
- Add option to raise error on failed local/global tensor comparison
- PR: #16585
- Padded Shards for Concat Support
- PR: #16765
- #0: Add support for tracing some sub-devices while others are still running programs
- PR: #16810
- #16769: bring up all reduce async as a composite op and added llama shape ccl test sweep
- PR: #16784
- #0: Lower Size to metalium as Shape2D
- PR: #16814
- #15976: Ensure reports insert all devices into the devices table
- PR: #16834
- Modify UNet Shallow to return output in CHW channel ordering
- PR: #16742
- #16758: Optimize usage and implementation of encode/decode tensor data
- PR: #16759
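For the #12253 batch-norm entry above: inference mode consumes precomputed running statistics instead of batch statistics. A sketch; the keyword names below (running_mean, running_var, eps) follow the usual batch-norm convention and are an assumption about the ttnn.batch_norm signature, not a confirmed API:

```python
import torch
import ttnn

device = ttnn.open_device(device_id=0)

def dev(t: torch.Tensor) -> "ttnn.Tensor":
    # Helper for this sketch: move a host tensor to the device in tile layout.
    return ttnn.from_torch(t, dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)

x = dev(torch.rand(1, 4, 32, 32))     # NCHW activations
y = ttnn.batch_norm(                  # inference path: uses running stats, not batch stats
    x,
    running_mean=dev(torch.zeros(1, 4, 1, 1)),
    running_var=dev(torch.ones(1, 4, 1, 1)),
    eps=1e-5,
)
ttnn.close_device(device)
```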
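For the "Enable to/from torch tests for 0D/1D tensors" entry above, a round-trip sketch of the now-covered low-rank cases (host-side only, default layout assumed):

```python
import torch
import ttnn

for host in (torch.tensor(3.0), torch.rand(7)):   # 0D scalar, 1D vector
    back = ttnn.to_torch(ttnn.from_torch(host))
    assert back.shape == host.shape               # rank preserved through the round trip
    assert torch.allclose(back, host)
```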
v0.55.0-rc13
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, not on the main branch. There may be differences between the latest main and the previous release.
The changelog will now follow, showing the changes from last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12925964343
📦 Uncategorized
- fix multi-iter in reduce scatter and adopt runtime arg overrider infra
- PR: #16531
- [tt-train] Add linear regression ddp example
- PR: #16245
- Remove eth_l1_address_params.h from device.cpp
- PR: #16538
- Sharded sweeps: exp, exp2, expm1, erfc, erfinv, round, log
- PR: #16323
- Fix ttnn.concat golden function when groups > 1
- PR: #16556
- #16171: Assert that NCRISC NOC is idle at kernel end.
- PR: #16471
- Remove eth_l1_address_params.h from tt_cluster.cpp and watcher
- PR: #16568
- Remove dev_mem_map.h usage from watcher_device_reader.cpp
- PR: #16572
- #14616: Remove ARCH_* ifdefs from tt_cluster.cpp
- PR: #13354
- Add support for DRAM Prefetcher op
- PR: #16244
- Resolve reduce-scatter-async sharded tensor correctness bug & hang
- PR: #16548
- disable flaky t3k test
- PR: #16583
- Remove "noc_parameters.h" from device.cpp
- PR: #16582
- Remove restriction of input_nsticks_per_core % w == 0
- PR: #15205
- Add tt-forge sweep for conv2d.
- PR: #16178
- Remove noc header file inclusion from watcher_device_reader.cpp
- PR: #16589
- Fix ttnn.from_torch for 0D/1D tensors with tile layout
- PR: #16484
- Short list failing conv2d for forge sweeps
- PR: #16597
- Remove halo from shard spec
- PR: #15900
- Address issues of var & std
- PR: #16545
- #16492: Remove sub_device_ids apis from various read/write functions throughout the stack
- PR: #16565
- #6344: Update RoBERTa QA demo
- PR: #8896
- Remove noc_parameters.h inclusion from ttnn
- PR: #16593
- Resubmit #16339: parameterize dispatch_constants
- PR: #16478
- #11512: Refactor bitwise sweeps, add bitwise sharded sweeps, modify t…
- PR: #15704
- Update CODEOWNERS
- PR: #16604
- Enable multi-core and fixing bfloat8 for untilize with unpadding
- PR: #16555
- Set up targeting idle eth cores on BH - won't enable because of hang debug
- PR: #14817
- Reorganize Print Pages Infrastructure
- PR: #16463
- lower fabric erisc datamover eth context switching frequency when workload is running
- PR: #16610
- Composite binary sweeps: gcd and lcm
- PR: #16423
- Remove ARCH_NAME from host library code
- PR: #16616
- [tt-train] Add nanogpt ddp mode
- PR: #16614
- #16312: Fix full op to query physical shape for buffer volume
- PR: #16562
- #16366: Changed default kernel_config_val for 32bit matmul
- PR: #16567
- #16621: Add barriers at end of cq_dispatch_slave.cpp
- PR: #16624
- Build wheels in models unit tests workflow
- PR: #16615
- Mo/10234 eth dispatch profiling
- PR: #15609
- Support subcoregrids in concat_heads
- PR: #16223
- Build wheels in ttnn unit tests workflow because the tests need it and we forgot to put it in
- PR: #16605
- #16590: profiler trace detection fix
- PR: #16591
- #16503: Optimize CoreRangeSets for CBs and semaphores
- PR: #16549
- Revert "#16621: Add barriers at end of cq_dispatch_slave.cpp"
- PR: #16645
- Fix nightly stable diffusion tests
- PR: #16629
- #0: Used github team for conv files
- PR: #16563
- Sweeps: fixed abs, added acos and acosh sharded and non sharded
- PR: #16381
- fix reduce scatter multi-link support bug
- PR: #16636
- support input tensors of all dimensions/ranks for the prod operation
- PR: #16301
- Create Infrastructure to exactly calculate L1 Memory Usage for Conv2D #15088
- PR: #15455
- #12253: Implement Batch norm operation for inference mode
- PR: #16432
- Port all experimental ops to compute_output_specs
- PR: #16595
- #16443: Add a programming example of vecadd_multi_core and gtest
- PR: #16446
- Enable to/from torch tests for 0D/1D tensors
- PR: #16653
- Port all data movements ops to compute_output_specs
- PR: #16652
- #15246: Add sweep tests for addcdiv, addcmul, rdiv, rsub, ceil
- PR: #15998
- Fix build break
- PR: #16656
- Logical sharding for input tensor and halo output
- PR: #16517
- #16495: reduce grid for falcon7b mlp matmul
- PR: #16569
- Stress NOC mcast test
- PR: #16639
- [skip ci] Update subdevice doc
- PR: #16669
- Read from and write to partial buffer regions for interleaved buffers where the offset and size of the specified buffer region are divisible by the buffer page size
- PR: #16102
- Fix resnet large on GS
- PR: #16665
- Fix Pre-allgather Layernorm bad PCC when using 1D reduction
- PR: #16622
- #16353: skip no volume tensors
- PR: #16619
- Create README.md
- PR: #16675
- Update README.md
- PR: #16676
- #16367: Added support to enable dram and l1 memory collection without saving to disk
- PR: #16368
- Update .clang-format-ignore
- PR: #16681
- Tweak BH csrrs init code
- PR: #16682
- #0: Clean up confusing refs to Greyskull from ttnn.copy error messages.
- PR: #16647
- Update perf and latest features for llm models (Jan 13)
- PR: #16677
- Update README.md
- PR: #16702
- #16657: Fix to_layout conversion into row major for 1D tensors
- PR: #16684
- Tilize with val padding results in L1 cache OOM
- PR: #16633
- #0: Fixes from commit ae61802
- PR: #16686
- #0: Skip build-docker-image during post-commit code-analysis since the docker image is already built in a previous job
- PR: #16703
- Generate test executables per architecture
- PR: #16594
- #16587: Update UMD submodule commit for P150 compatibility
- PR: #16709
- Replace some instances of Tensor::get_shape with get_logical_shape
- PR: #16655
- Update METALIUM_GUIDE.md
- PR: #16602
- #16621: Add barriers at end of cq_dispatch_slave.cpp on IERISC
- PR: #16666
- Finish porting OPs to compute_output_specs
- PR: #16695
- ScopedGraphCapture
- PR: #15774
- #15756 Pull in BH LLK fix for maxpool hang
- PR: #16663
- #15246: Add sweep tests for logical_and, logical_or, logical_xor
- PR: #16132
- #0: (MINOR) Bump to v0.55.0
- PR: #16714
- #11512: Add sweeps for eltwise sharded ops 3
- PR: #16307
- Add sweeps for unary, unary_sharded and binary_sharded versions of ops: fmod, remainder, maximum, minimum.
- PR: #15911
- Don't leak tt_cluster.hpp through kernel_types.hpp
- PR: #16691
- #6983: Re-enable skipped TT-NN unit test
- PR: #16642
- #15450: Remove default values from circular buffer parameters in LLK compute APIs
- PR: #16389
- update build flag on programming examples docs
- PR: #16635
- Fix for P100 board type
- PR: #16718
- Sever TT-Train's dependency on TT-Metalium's tests
- PR: #16685
- [TT-Train] Update generate of LLM
- PR: #16723
- [TT-Train] Add bias=false in LinearLayer
- PR: #16707
- TT-Fabric Bringup Initial Check-in
- PR: #16343
- #0: Sanitize writes to mailbox on ethernet cores.
- PR: #16574
- Add Llama11B-N300 and Llama70B-TG (TP=32) to LLM table in README.md
- PR: #16724
- [skip ci] Update llms.md
- PR: #16737
- Update test_slice.py
- PR: #16734
- #16625: Refactor tracking of sub-device managers from Device to a new class
- PR: #16683
- Update code-analysis.yaml
- PR: #16738
- [skip ci] Update llms.md
- PR: #16745
- remove references to LFS
- PR: #16722
- Fixes for conversion to row major for 0D and 0-volume tensors
- PR: #16736
- #0: Disable BH tools test at workflow level
- PR: #16749
- Removing some usages of LegacyShape, improve Tensor::to_string
- PR: #16711
- [skip ci] Fix lint on a doc
- PR: #16751
- #0: API Unification for Device and MeshDevice
- PR: #16570
- Port ttnn::random and uniform from LegacyShape to SimpleShape
- PR: #16744
- #16379: make softmax call moreh_softmax if rank above 4
- PR: #16735
- #7126: remove skip for test_sd_matmul test
- PR: #16729
- #0: Make device an optional parameter in the tensor distribution API
- PR: #16746
- Added build-wheels to fast-dispatch-build-and-unit-tests-wrapper.yaml
- PR: #16638
- Adding CCL Async test cases to TG nightly and bug fix
- PR: #16700
- #11119: Move op_profiler.hpp under the ttnn folder
- PR: #11167
- #15979: Switch to google benchmark for pgm dispatch tests
- PR: #16547
- [tt-train] Add weight tying option for NanoGPT demo
- PR: #16768
- #0: Fix build of test_pgm_dispatch
- PR: #16773
- [tt-train] Update serialization of tensor for DDP
- PR: #16778
- #0: Fix failing TG regression tests
- PR: #16776
- [skip ci] Update llms.md
- PR: #16775
- Add tiled interleaved permute for when width dimension doesn't move (row-major tiled invariant)
- PR: #16671
- Add Fabric Router Config to Hal
- PR: #16761
- [skip ci] Update llms.md
- PR: #16791
- Reflect ARCH_NAME Changes in CI Workflows
- PR: #16706
- [skip ci] Update llms.md
- PR: #16792
- #0: Migrate pytensor to use from_vector Tensor creation APIs
- PR: #16767
- Afuller/metalium api reorg
- PR: #16578
- Ngrujic/sweep tests 3
- PR: #16316
- #0: Enable nlp create heads tests on BH
- PR: #16777
- Fix to_layout shard bug
- PR: #16754
- Fix broken link to host API
- PR: #16799
- Add noc flag to test stress noc mcast
- PR: #16772
- Set codeowners for transformer ttnn ops
- PR: #16803
- #15450: Remove default value for ocb argument in LLK compute API
- PR: #16376
- Linking tensor.reshape to ttnn.reshape
- PR: #16377
- #16646: Fix dangling reference in sharded tensor args
- PR: #16782
- #15450: Remove default values from circular buffer parameters in LLK compute APIs: Transpose and Reduce
- PR: #16427
- Add new python api to get architecture name
- PR: #16747
- Remove base.hpp
- PR: #16796
- [tt-train] Change weights initialization for GPT-2
- PR: #16815
- [skip ci] Update llms.md
- PR: #16828