Are the memory timing parameters for HBM2 in gem5 representative of real-world device values? #1578
As mentioned in the title, I have some doubts about the accuracy of the HBM2 memory timing parameters. I can not find any detailed information about HBM gen2 specifications.

```python
# A single HBM2 x64 interface (tested with HBMCtrl in gem5)
# to be used as a single pseudo channel. The timings are based
# on HBM gen2 specifications. 4H stack, 8Gb per die and total capacity
# of 4GiB.
class HBM_2000_4H_1x64(DRAMInterface):
    # 64-bit interface for a single pseudo channel
    device_bus_width = 64
    # HBM2 supports BL4
    burst_length = 4
    # size of channel in bytes, 4H stack of 8Gb dies is 4GiB per stack;
    # with 16 pseudo channels, 256MiB per pseudo channel
    device_size = "256MiB"
    device_rowbuffer_size = "1KiB"
    # 1x128 configuration
    devices_per_rank = 1
    ranks_per_channel = 1
    banks_per_rank = 16
    bank_groups_per_rank = 4
    # 1000 MHz for 2Gbps DDR data rate
    tCK = "1ns"
    tRP = "14ns"
    tCCD_L = "3ns"
    tRCD = "12ns"
    tRCD_WR = "6ns"
    tCL = "18ns"
    tCWL = "7ns"
    tRAS = "28ns"
    # BL4 in pseudo channel mode
    # DDR @ 1000 MHz means 4 * 1ns / 2 = 2ns
    tBURST = "2ns"
    # value for 2Gb device from JEDEC spec
    tRFC = "220ns"
    # value for 2Gb device from JEDEC spec
    tREFI = "3.9us"
    tWR = "14ns"
    tRTP = "5ns"
    tWTR = "4ns"
    tWTR_L = "9ns"
    tRTW = "18ns"
    # tAAD from RBus
    tAAD = "1ns"
    # single rank device, set to 0
    tCS = "0ns"
    tRRD = "4ns"
    tRRD_L = "6ns"
    # for a single pseudo channel
    tXAW = "16ns"
    activation_limit = 4
    # 4tCK
    tXP = "8ns"
    # start with tRFC + tXP -> 160ns + 8ns = 168ns
    tXS = "216ns"
    page_policy = "close_adaptive"
    read_buffer_size = 64
    write_buffer_size = 64
    two_cycle_activate = True
```
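
As a quick sanity check on the comments in this listing, the sketch below recomputes a few derived quantities from the listed values: capacity per stack, `tBURST`, and peak bandwidth per pseudo channel. It also estimates unloaded read latencies with a textbook first-order model (`tCL + tBURST`, plus `tRCD` and `tRP` as needed). That latency model is an assumption for illustration, not how gem5's controller computes latency, and the variable names are made up for this example.

```python
# Values copied from the HBM_2000_4H_1x64 listing above.
device_bus_width_bits = 64
burst_length = 4
tCK_ns = 1.0
tRP_ns, tRCD_ns, tCL_ns = 14.0, 12.0, 18.0
device_size_mib = 256

# Assumption: 16 pseudo channels per stack, as the device_size comment states.
pseudo_channels = 16

# Capacity: 16 x 256 MiB should add up to the 4 GiB stack.
stack_capacity_gib = pseudo_channels * device_size_mib / 1024

# DDR moves two beats per clock, so a BL4 burst takes 4 * tCK / 2.
tBURST_ns = burst_length * tCK_ns / 2

# Bytes moved per burst and the resulting peak bandwidth.
bytes_per_burst = device_bus_width_bits // 8 * burst_length
peak_gb_per_s = bytes_per_burst / tBURST_ns  # bytes per ns equals GB/s
stack_peak_gb_per_s = peak_gb_per_s * pseudo_channels

# First-order unloaded read latencies (textbook model, an assumption here).
row_hit_ns = tCL_ns + tBURST_ns             # row already open
row_closed_ns = tRCD_ns + row_hit_ns        # activate first
row_conflict_ns = tRP_ns + row_closed_ns    # precharge, then activate

print(f"stack capacity: {stack_capacity_gib} GiB")
print(f"tBURST: {tBURST_ns} ns")
print(f"peak bandwidth: {peak_gb_per_s} GB/s per pseudo channel, "
      f"{stack_peak_gb_per_s} GB/s per stack")
print(f"read latency (hit/closed/conflict): "
      f"{row_hit_ns}/{row_closed_ns}/{row_conflict_ns} ns")
```

Under these assumptions the listed values are at least self-consistent: the capacity and `tBURST` comments check out, and the device-level read latency stays in the tens of nanoseconds before any controller queueing is added.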
Replies: 1 comment 1 reply
The `HBM2Stack` (source) has been tested and is quite accurate compared to real devices. Most of the latency of HBM comes from (1) queueing in the memory controller, not the timing parameters of the DRAM, and (2) the fact that it's clocked much lower than DDR4/5. HBM runs at 1-2 GHz (taking the double data rate into account), while DDR4/5 now reaches 5 GHz and beyond.

Edit: I want to point out that it's important to correctly configure the memory system. For instance, the `HBM_2000_4H_1x64` code you posted is for a single (pseudo)channel. You must use 16-32 of these to make a stack, and you need to choose the correct interleaving to get accurate performance. The standard library's `HBM2Stack` should be config…
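
To illustrate why the interleaving choice matters, here is a minimal sketch of simple modulo interleaving in plain Python. It is not gem5's address-range API: `channel_for_address`, the 64-byte and 4096-byte granularities, and the channel count of 16 are all hypothetical values chosen for illustration.

```python
def channel_for_address(addr: int, num_channels: int, granularity: int) -> int:
    """Map a physical address to a channel index under modulo interleaving.

    Consecutive `granularity`-byte blocks rotate across the channels.
    """
    return (addr // granularity) % num_channels


NUM_CHANNELS = 16   # lower end of the 16-32 range mentioned above
FINE = 64           # hypothetical: interleave per cache line
COARSE = 4096       # hypothetical: interleave per 4 KiB block

# A sequential stream of 64 cache-line-sized accesses (4 KiB in total).
stream = [i * 64 for i in range(64)]

fine_channels = {channel_for_address(a, NUM_CHANNELS, FINE) for a in stream}
coarse_channels = {channel_for_address(a, NUM_CHANNELS, COARSE) for a in stream}

print(f"channels touched with {FINE} B interleaving: {len(fine_channels)}")
print(f"channels touched with {COARSE} B interleaving: {len(coarse_channels)}")
```

In this toy model the same sequential stream spreads over all 16 channels with fine-grained interleaving but lands on a single channel with the coarse setting, so the achievable bandwidth differs widely even though the DRAM timing parameters are identical.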