-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy path0notes-fpu.txt
331 lines (270 loc) · 9.8 KB
/
0notes-fpu.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
SUMMARY 2106
- gave FPU its own repo
- jimmied up a bunch of beeyootiful CI tests
------------------------------------------------------------------------
SUMMARY 1906
Summary of what happened:
- initially used designware, but that's not free is it
- looked for freeware. looked here, looked there. freeware sucked
- finally wrote my own
------------------------------------------------------------------------
STATUS 1905
- seems stable for now, works on 47 fft regressions
- could use a severe cleaning up
- also needs severe testing, i.e. maybe find IEEE/IBM vectors
TODO
- put it in a separate repo...StanfordVLSI I guess
HOW TO TEST THE FPU
cd ../../fftgen/test
make fptest_64_1_1port
OR
cd ../../ftgen/test
make test_64_1_1port
../bin/fptest3.awk test_64_1_1port.log
# Sample output:
# % fptest3.awk test_128_1_1port.log$i |& less
# ...
# fptest o2i.FPU.SUB (0.414214 - 1.082392) = -0.668179 true
# fptest o2i.FPU..SUB (-0.414214 - -1.082392) = 0.668179 true
# fptest o1i.FPU..ADD (-0.414214 + -1.082392) = -1.496606 true
# fptest o2r.FPU..SUB (1.000000 - 0.000000) = 1.000000 true
# fptest o1r.FPU..ADD (1.000000 + 0.000000) = 1.000000 true
# fptest t2.FPU..ADD (-0.923880 + -0.158513) = -1.082392 true
# fptest t1.FPU.ADD (0.382683 + 0.382683) = 0.765367 true
# top_fft.BFLY0 t5 ------------------------
# ...
Okay 128 works again!
- severe cleanings up
- checkings in
Okay 128 works!
next:
- clean up
- update README status
- and check in
then:
- propagate ignore_a and b to srsub, right?
- recheck 128
- check in
then:
- regressions! haha
new filter:
fptest test_128_1_1port.log$i |& egrep ^fptest | sort \
| uniq | sed 's/^fptest top_fft.BFLY0.sub_//' | less
fptest top_fft.BFLY0.sub_o2r.SUB.SUBADDER (1.000000 - 1.000000) = 2.000000 FALSE (wanted 0.000000, off by 2.000000)
PROBLEM: 1 - 1 = 2
SUBADDER a= 1.000000 (3f800000) b= 1.000000 (3f800000) z= 2.000000 (40000000)
SUBADDER ae=7f be=7f
SUBADDER ae+bias=0 be+bias=0
SUBADDER ediff= 0
SUBADDER am=800000 bm=800000
SUBADDER am_adj=000000800000
SUBADDER bm_adj=000000800000
SUBADDER am+bm =000001000000
SUBADDER ze =7f
SUBADDER ze_final=80
SUBADDER zm =1000000
SUBADDER zm_final=000000
SUBADDER
NOW
- checked in and clean
NEXT
- update sradd w/new display/debug format
- update fptest3 to handle add, mul
- debug 128/1/1port
- start round6; look at all sr results in log; expect to see busted sradd, srmul at least
LATER
- continue debugging fptest
cd build; fptest2.awk test_128_1_1port.log$i |& less
- try always @ (z)
- implement "ntries" in fptest maybe
- remove FIXME TEST CODE in srsub, try again
- potential FIXME? fpu.vp makes three separate unique modules :( ?
CURRENT TEST SEQ
i=0
cd build/
alias fptest=/nobackup/steveri/github/fftgen/bin/fptest3.awk
../bin/golden_test.csh 128 1 1port|& tee tmp.log.srsub$i | tail
cp /tmp/test_128_1_1port.log test_128_1_1port.log$i
fptest test_64_1_1port.log$i | less (look for "false")
fptest test_128_1_1port.log$i |& grep fptest | sort | uniq | less
fptest test_128_1_1port.log$i |& egrep ^fptest | sort | uniq > tmp4; h40 tmp4
NEXT
- clean up srsub/sradd, use below test as your guide
- fix multiplier, use test below as your guide
- is mul still broken? see test(s) below
MULTIPLIER NOTES
# multiplier flowchart:
# https://www.quora.com/What-is-the-verilog-code-for-floating-point-multiplier
# sp format:
# https://www.quora.com/How-can-I-design-a-floating-point-multiplier-in-Verilog
# https://en.wikipedia.org/wiki/Single-precision_floating-point_format
==============================================================================
DONE
- update fptest and srsub to remove "srsub" from display out
DONE 1905
- srsub is GOOOOOD
-- b/c fixed fptest; ALL TRUE:
% alias fptest=/nobackup/steveri/github/fftgen/bin/fptest3.awk
% ../bin/golden_test.csh 128 1 1port|& tee tmp.log.srsub$i | tail
% cp /tmp/test_128_1_1port.log test_128_1_1port.log$i
% fptest test_128_1_1port.log$i |& grep fptest | sort | uniq | less
DONE/FIXED 1905
- DEBUG 2 - 1 = x
srsub top_fft.BFLY0.spsub_o2r.SUB a= 2.000000 (40000000) b= 1.000000 (3f800000) z= 0.000000 (xxxxxxxx)
srsub top_fft.BFLY0.spsub_o2r.SUB ae=80 be=7f
srsub top_fft.BFLY0.spsub_o2r.SUB ae+bias=1 be+bias=0
srsub top_fft.BFLY0.spsub_o2r.SUB ediff= x
srsub top_fft.BFLY0.spsub_o2r.SUB am=800000 bm=800000
srsub top_fft.BFLY0.spsub_o2r.SUB am_adj=xxxxxxxxxxxx
srsub top_fft.BFLY0.spsub_o2r.SUB bm_adj=xxxxxxxxxxxx
srsub top_fft.BFLY0.spsub_o2r.SUB a>b = x
srsub top_fft.BFLY0.spsub_o2r.SUB |am-bm| == xxxxxxxxxxxx
srsub top_fft.BFLY0.spsub_o2r.SUB ze =xx
srsub top_fft.BFLY0.spsub_o2r.SUB ze_final=xx
srsub top_fft.BFLY0.spsub_o2r.SUB zm =xxxxxx
srsub top_fft.BFLY0.spsub_o2r.SUB zm_final=xxxxxx
DONE 1905
- verified the following for srsub 4-1, 4-3, 1-4, 3-4
DONE
- undo test sequences in fpu.v (srsub.v?)
- fptest.awk test_128_1_1port.log$i | egrep '(true|false)$' | sort -rn | uniq | less
DONE
# okay, identified srsub error:
# srsub a= 4.000000 (40800000) b= 1.000000 (3f800000) z= 6.000000 (40c00000)
# ...by adding this test in fpu.vp:
# srsub
# SUB (.a($shortrealtobits(4.0)), .b($shortrealtobits(1.0)), .z(z));
DONE
- fix srsub
- resture fpu.vp
- continue testing, see below
DONE
- fix this broken thing in srsub.v
cd build
fptest.awk test_64_1_1port.log2 | sort -rn | uniq | less | head
srsub a= 1.000000 (3f800000) b=2097152.000000 (4a000000) z= 0.000000 (ca00000X) 1.000000 false
DONE
- clean up fptest.awk maybe
- assume add/sub is correct, clean up and check in
DONE
- adder works maybe, see below.
- clean up and check in
- see where we are with fft results, probably still broken
DONE
- 1+1=2 AND 1+.707=1.707 i think VICK TORY!
OLD
# cd build/
# ../bin/golden_test.csh 64 1 1port |& tee tmp.log.srsub$i | tail
# less -r /tmp/test_64_1_1port.log
# cp /tmp/test_64_1_1port.log test_64_1_1port.log$i
VERY OLD
# cd build/
# astest.sh test_64_1_1port.log$i | egrep 'add|sub'
OLD
# 11332 srsub a= 1.000000 (3f800000) b= 1.000000 (3f800000) z= 0.000000 (00000000)
#
# sradd a= 1.000000 (3f800000) b= 1.000000 (3f800000) z= 1.000000 (3f800000)
# sradd ae=7f be=7f
# sradd ae+bias=0 be+bias=0
# sradd ediff= 0
# sradd am=800000 bm=800000
# sradd am_adj=000000800000
# sradd bm_adj=000000800000
# sradd am+bm =000001000000
# sradd ze =7f
# sradd ze_final=7f
# sradd zm =1000000
# sradd zm_final=000000
#
#
# sradd a= 1.000000 (3f800000) b= 0.707107 (3f3504f3) z= 1.707107 (3fda8279)
# sradd ae=7f be=7e
# sradd ae+bias=0 be+bias=255
# sradd ediff= 1
# sradd am=800000 bm=b504f3
# sradd am_adj=000001000000
# sradd bm_adj=000000b504f3
# sradd am+bm =000001b504f3
# sradd ze =7e
# sradd ze_final=7f
# sradd zm =da8279
# sradd zm_final=5a8279
OLD
# build/round4/
# c; ../bin/golden_test.csh 8 1 1port |& tee tmp.log7
# diff tmp.log[67]
DONE
# sub kinda sorta seems to work;
# but now need add i.e. what happens when sign(a) != sign(b)
OLD
# make -f ../bin/../Makefile clean gen TOP=fft
# GENESIS_PARAMS="top_fft.n_fft_points=64 top_fft.units_per_cycle=1
# top_fft.SRAM_TYPE=TRUE_1PORT top_fft.swizzle_algorithm=round7" >&
# /tmp/test_ 64_1_1port.log
OLD
# vcs -sverilog +cli +memcbk +lint=all,noVCDE +libext+.v -notice -full64
# +v2k -debug_pp -timescale=1ps/1ps +noportcoerce +vcs+lic+wait
# -licqueue -ld gcc -top top_fft \ -y
# /nobackup/steveri/github/fftgen/build
# +incdir+/nobackup/steveri/github/fftgen/build -y
# /hd/cad/synopsys/dc_shell/latest/dw/sim_ver
# +incdir+/hd/cad/synopsys/dc_shell/latest/dw/sim_ver -y
# /nobackup/steveri/github/fftgen/rtl/lib
# +incdir+/nobackup/steveri/github/fftgen/rtl/lib -f
# /nobackup/steveri/github/fftgen/build/genesis_vlog.vf
DONE (right?)
- WHEN WE COME BACK
-- write'cher own dam' multplier
DONE
- clean up below
- save work so far
OLD
- cruzebarada it is!!!
- git clone https://github.com/CruzeBurada/Floating-Point-ALU-in-Verilog tmp.cruzeburada
- looking goody...
- no in the end it was baddy
OLD
# vcs -sverilog +cli +memcbk +lint=all,noVCDE +libext+.v -notice -full64 \
# +v2k -debug_pp -timescale=1ps/1ps +noportcoerce +vcs+lic+wait \
# -licqueue -ld gcc -top top_fft -y \
# /nobackup/steveri/github/fftgen/build \
# +incdir+/nobackup/steveri/github/fftgen/build -y \
# /hd/cad/synopsys/dc_shell/latest/dw/sim_ver/ \
# +incdir+/hd/cad/synopsys/dc_shell/latest/dw/sim_ver/ -y \
# /hd/cad/synopsys/dc_shell/latest/packages/gtech/src_ver/ \
# +incdir+/hd/cad/synopsys/dc_shell/latest/packages/gtech/src_ver/ -y \
# /nobackup/steveri/github/fftgen/rtl \
# +incdir+/nobackup/steveri/github/fftgen/rtl -f \
# /nobackup/steveri/github/fftgen/build/genesis_vlog.vf \
# |& less
------------------------------------------------------------------------------
NOTES from other (retired) 0notes file in $top/doc/
#
# DESIGNWARE
# /hd/cad/synopsys/dc_shell/latest/dw/doc/datasheets/dw_fp_mult.pdf
# /hd/cad/synopsys/dc_shell/latest/dw/sim_ver/DW_fp_mult.v
#
# MY CODE
# * op_width 64 (i.e. TEST5) means two 32-bit single-precision fp
# numbers (real, imag) packed into one 32-bit quantity!
#
#
# * all the fp ops are in rtl/butterfly.v e.g.
#
# module `mname` (
# ...
# input logic [`$op_width-1`:0] in1, in2,
# input logic [`$op_width/2-1`:0] twc, tws, // twiddle (cos, sin)
# output logic [`$op_width-1`:0] out1, out2
# );
# //; if ($test_mode eq "TEST0") { $op_width = 32; }
# //; if ($test_mode eq "TEST1") { $op_width = 64; }
# ...
# //; if ($test_mode eq "TEST5") { $op_width = 64; }
# ...
# DW_fp_addsub #(23, 8, 1) // #(sig, exp, compliance (0 or 1)) //cad/synopsys/dc_shell/latest/dw/sim_ver
# spadd_t2 (.a(t2a), .b(t2b), .z(t2), .op(`$add_op`), .rnd(3'b000), .status(status_t2));
#
# * maybe want stubs for add, sub, mul e.g.
#
# * TODO need a check in butterfly.v, should choke if op_width != 64