- Implementation list
  - Naive convolution (CPU)
    - include/conv_cpu.cuh
    - parallelized via OpenMP
  - Naive convolution (GPU)
    - include/conv_gpu_naive.cuh
  - GEMM (im2col)
    - include/conv_gpu_matmul.cuh
  - (TODO) FFT
  - (TODO) Strassen's method
  - (TODO) Winograd's method
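
As a rough illustration of how the naive and GEMM (im2col) entries relate, here is a minimal CPU-only C++ sketch. It is not taken from the headers listed above: the function names (conv2d_naive, im2col, gemm, conv2d_im2col), the single-channel row-major layout, stride 1, and the absence of padding are all assumptions made for this example. The convolution is written in the cross-correlation form (no kernel flip).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: direct "valid" 2D convolution, single channel, stride 1.
// in is H x W, k is KH x KW, both row-major (layout is an assumption).
std::vector<float> conv2d_naive(const std::vector<float>& in, int H, int W,
                                const std::vector<float>& k, int KH, int KW) {
    assert(KH <= H && KW <= W);
    const int OH = H - KH + 1, OW = W - KW + 1;
    std::vector<float> out(OH * OW, 0.0f);
    // Each output pixel is independent, so the two outer loops can be shared
    // among threads. The pragma is ignored when OpenMP is not enabled.
    #pragma omp parallel for collapse(2)
    for (int oy = 0; oy < OH; ++oy) {
        for (int ox = 0; ox < OW; ++ox) {
            float acc = 0.0f;
            for (int ky = 0; ky < KH; ++ky)
                for (int kx = 0; kx < KW; ++kx)
                    acc += in[(oy + ky) * W + (ox + kx)] * k[ky * KW + kx];
            out[oy * OW + ox] = acc;
        }
    }
    return out;
}

// Hypothetical helper: copy every KH x KW patch into one column.
// The result is a (KH*KW) x (OH*OW) row-major matrix.
std::vector<float> im2col(const std::vector<float>& in, int H, int W,
                          int KH, int KW) {
    assert(KH <= H && KW <= W);
    const int OH = H - KH + 1, OW = W - KW + 1;
    std::vector<float> cols(KH * KW * OH * OW);
    for (int ky = 0; ky < KH; ++ky)
        for (int kx = 0; kx < KW; ++kx)
            for (int oy = 0; oy < OH; ++oy)
                for (int ox = 0; ox < OW; ++ox)
                    cols[((ky * KW + kx) * OH + oy) * OW + ox] =
                        in[(oy + ky) * W + (ox + kx)];
    return cols;
}

// Plain row-major GEMM: C (M x N) = A (M x K) * B (K x N).
std::vector<float> gemm(const std::vector<float>& A,
                        const std::vector<float>& B, int M, int K, int N) {
    std::vector<float> C(M * N, 0.0f);
    for (int i = 0; i < M; ++i)
        for (int p = 0; p < K; ++p)
            for (int j = 0; j < N; ++j)
                C[i * N + j] += A[i * K + p] * B[p * N + j];
    return C;
}

// Convolution as one matrix product: the flattened kernel (1 x KH*KW)
// times the im2col matrix (KH*KW x OH*OW).
std::vector<float> conv2d_im2col(const std::vector<float>& in, int H, int W,
                                 const std::vector<float>& k, int KH, int KW) {
    const int OH = H - KH + 1, OW = W - KW + 1;
    return gemm(k, im2col(in, H, W, KH, KW), 1, KH * KW, OH * OW);
}
```

For a 4x4 input and a 3x3 kernel, both conv2d_naive and conv2d_im2col return four values that should agree up to rounding. With GCC or Clang, compiling with -fopenmp enables the pragma in conv2d_naive.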
- Build
  - make DEBUG=OFF
    - Skips the routine that checks the computation results
  - make DEBUG=ON
    - Runs the routine that checks the computation results
- Execute
  - make run
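
The DEBUG switch in the build step could be wired up roughly as in the C++ sketch below. How the Makefile actually passes the flag is not shown here, so this assumes DEBUG=ON results in something like -DDEBUG on the compiler command line. The names all_close, VERIFY_RESULTS, and maybe_verify, as well as the default tolerance, are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical check: true when both vectors have the same length and
// every pair of elements differs by at most tol.
bool all_close(const std::vector<float>& ref, const std::vector<float>& got,
               float tol = 1e-4f) {
    assert(tol >= 0.0f);
    if (ref.size() != got.size()) return false;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        // Written as !(x <= tol) so that a NaN difference counts as a mismatch.
        if (!(std::fabs(ref[i] - got[i]) <= tol)) return false;
    }
    return true;
}

// Assumption: building with DEBUG=ON defines the DEBUG macro.
#ifdef DEBUG
constexpr bool VERIFY_RESULTS = true;
#else
constexpr bool VERIFY_RESULTS = false;
#endif

// Returns true when the check is skipped (DEBUG off) or when it passes.
bool maybe_verify(const std::vector<float>& ref, const std::vector<float>& got) {
    if (!VERIFY_RESULTS) return true;
    return all_close(ref, got);
}
```

In this sketch a CPU result would serve as ref and a GPU result copied back to the host as got. A tolerance is used rather than exact equality because the implementations may sum in different orders.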