From c8bea613ced0dda7a97caeacdc0c74f4fc56076c Mon Sep 17 00:00:00 2001
From: Almeet Bhullar
Date: Thu, 12 Sep 2024 21:13:18 +0000
Subject: [PATCH] #0: [skip_ci] Updating BH bring-up programming guide

---
 README.md | 1 +
 .../BlackholeBringUpProgrammingGuide.md | 106 ++++++++++++++++--
 2 files changed, 98 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index bb93ebae2f4..d3d1c1e5255 100644
--- a/README.md
+++ b/README.md
@@ -95,3 +95,4 @@ Get started with [simple kernels](https://docs.tenstorrent.com/tt-metalium/lates
 - [Flash Attention on Wormhole](./tech_reports/FlashAttention/FlashAttention.md) (updated Sept 6th)
 - [CNNs on TT Architectures](./tech_reports/CNNs/ttcnn.md) (updated Sept 6th)
 - [Ethernet and Multichip Basics](./tech_reports/CCL/CclDeveloperGuide.md) (Updated Sept 12th)
+- [Blackhole Bring-Up Programming Guide](./tech_reports/Blackhole/BlackholeBringUpProgrammingGuide.md) (Updated Sept 12th)
diff --git a/tech_reports/Blackhole/BlackholeBringUpProgrammingGuide.md b/tech_reports/Blackhole/BlackholeBringUpProgrammingGuide.md
index a296deb973d..1f0bbfb7de4 100644
--- a/tech_reports/Blackhole/BlackholeBringUpProgrammingGuide.md
+++ b/tech_reports/Blackhole/BlackholeBringUpProgrammingGuide.md
@@ -4,17 +4,105 @@
 
 Information relevant to programming Blackhole while it is being brought up.
 
-## Memory Alignment
+## Wormhole N150 vs. Blackhole
 
-- 32 bytes for L1
-- 64 bytes for DRAM
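+| Component | Property | Wormhole N150 | Blackhole |
+| --- | --- | --- | --- |
+| Tensix | Total | 8x10 | 14x10 |
+| Tensix | Available for Compute | 8x8 | 13x10 |
+| Tensix | L1 | 1464 KB | 1464 KB, data cache added |
+| Ethernet | Total | 16 | 14 |
+| Ethernet | Programmability | 1x RISC-V, 256 KB L1 | 2x RISC-V, 512 KB L1 |
+| DRAM | Total | 12 banks | 8 banks |
+| DRAM | Bank Size | 1 GB | ~4 GB |
+| DRAM | Programmability | N/A | 1x RISC-V, 128 KB L1 |
+| NoC | Alignment: DRAM | Read: 32B, Write: 16B | Read: 64B, Write: 16B |
+| NoC | Alignment: PCIe | Read: 32B, Write: 16B | Read: 64B, Write: 16B |
+| NoC | Alignment: L1 | Read: 16B, Write: 16B | Read: 16B, Write: 16B |
+| NoC | Multicast | Rectangular | Rectangular, Strided, L-shaped |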
-## Command Buffer
+### L1 Data Cache
 
-BH cmd\_buffer is known to have issues, need to turn on cmd\_buffer\_fifo.
-Instead of using noc\_cmd\_buf\_ready and cmd\_buffer for sending out mcast requests (as well as other read/write requests),
-use cmd\_buffer\_fifo and CMD\_BUF\_AVAIL
+Blackhole added a data cache in L1. When one core writes an address and another core reads it, the reader only needs to invalidate its cache if it has previously read that address.
 
-## tt-smi
+The cache can be invalidated by calling `invalidate_l1_cache()`.
 
-Depending on the firmware, tt-smi reset may not work and the board will need to be rebooted.
+The cache can be disabled through an env var:
+```
+export TT_METAL_DISABLE_L1_DATA_CACHE_RISCVS=
+```
+
+### Ethernet Cores
+
+Runtime has not yet enabled access to the second RISC-V on the ethernet cores.
+
+Fast dispatch can be run out of ethernet cores.
+
+### DRAM
+
+Runtime has not yet enabled access to program the RISC-V on DRAM.
+
+### NoC
+
+Non-rectangular multicast shapes have not been tested yet.
+
+BH enabled 16-deep FIFOs for each of the four command buffers. These are enabled by default in `noc_init` because BH cmd\_buffer has known issues. NoC APIs are not impacted by this change.
+
+## Debug
+
+Debug tools are functional on BH, and it is recommended to use Watcher when triaging Op failures to catch potential alignment issues. Disabling the L1 cache can help identify missed cache invalidations.
+
+## Resetting
+
+Depending on the firmware, reset via `tt-smi -r 0` may not work and the board will need to be rebooted.
+
+## CI
+
+Bringing up full post commit is a WIP on BH; currently we only run the cpp tests. CI is triggered on pushes to main, but we have seen some instability on the machines, with non-deterministic (ND) failures.
+
+## Issue Tracking
+
+Please file issues or any instances of ND behaviour to the Blackhole [board](https://github.com/orgs/tenstorrent/projects/50/views/1)
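
Because the guide recommends Watcher for catching alignment issues, it may help to see the Blackhole NoC alignment figures from the comparison table written out as a host-side check. The following is a minimal sketch, not part of tt-metal: `NocTarget`, `blackhole_alignment`, `is_read_aligned`, `is_write_aligned`, and `expect_read_aligned` are hypothetical names, and it assumes the table's figures are address alignments in bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration only: none of these names come from tt-metal.
// The byte values are copied from the Blackhole column of the comparison table.
enum class NocTarget { DRAM, PCIe, L1 };

struct NocAlignment {
    std::uint32_t read_bytes;   // required alignment for NoC reads
    std::uint32_t write_bytes;  // required alignment for NoC writes
};

// Blackhole alignment figures: DRAM and PCIe reads need 64B, everything else 16B.
inline NocAlignment blackhole_alignment(NocTarget target) {
    switch (target) {
        case NocTarget::DRAM: return {64, 16};
        case NocTarget::PCIe: return {64, 16};
        case NocTarget::L1:   return {16, 16};
    }
    return {64, 16};  // not reached for valid enum values; strictest case as a fallback
}

// Power-of-two alignment test; every value in the table is a power of two.
inline bool is_aligned(std::uint64_t addr, std::uint32_t alignment) {
    return (addr & (static_cast<std::uint64_t>(alignment) - 1)) == 0;
}

inline bool is_read_aligned(NocTarget target, std::uint64_t addr) {
    return is_aligned(addr, blackhole_alignment(target).read_bytes);
}

inline bool is_write_aligned(NocTarget target, std::uint64_t addr) {
    return is_aligned(addr, blackhole_alignment(target).write_bytes);
}

// Debug-build guard a caller could place before issuing a NoC read.
inline void expect_read_aligned(NocTarget target, std::uint64_t addr) {
    assert(is_read_aligned(target, addr) && "NoC read address is misaligned");
}
```

A DRAM address such as `0x1020` is 32-byte aligned, which satisfies the Wormhole N150 read requirement but fails the 64-byte Blackhole one. This is the kind of mismatch to look for when porting an Op from Wormhole.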