hgemmtest

An attempt to write an HGEMM (half-precision general matrix multiply) in OpenCL that runs on tensor cores. The kernel involves inline assembly.

On Windows, place hgemm.cl in the same folder as the generated hgemmtest.exe.
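The placement requirement suggests the kernel source is read from disk at run time. A minimal sketch of how a host program might load such a file, assuming it is opened by a relative path; the helper name `read_kernel_source` is hypothetical and not taken from this repository:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of this repository): reads a whole file
   into a NUL-terminated heap buffer. Returns NULL on failure; the caller
   frees the result. If length_out is non-NULL it receives the byte count. */
char *read_kernel_source(const char *path, size_t *length_out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL; /* e.g. hgemm.cl is not where the program looks for it */

    char *buf = NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);
        if (size >= 0 && fseek(f, 0, SEEK_SET) == 0) {
            buf = malloc((size_t)size + 1);
            if (buf != NULL) {
                size_t n = fread(buf, 1, (size_t)size, f);
                buf[n] = '\0';
                if (length_out != NULL)
                    *length_out = n;
            }
        }
    }
    fclose(f);
    return buf;
}
```

The returned string could then be handed to `clCreateProgramWithSource`. A NULL result is the case the note above guards against: the file is missing from the expected folder.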

Tensor cores are available on NVIDIA GPUs with the Volta or Turing architecture, including (from Wikipedia):

Turing:

GeForce RTX 2080 Ti
GeForce RTX 2080
GeForce RTX 2070
Quadro RTX 8000
Quadro RTX 6000
Quadro RTX 5000
Tesla T4

Volta:

Tesla V100
Titan V
Titan V CEO Edition
Quadro GV100
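
Volta and Turing devices report CUDA compute capability 7.x (7.0 for Volta, 7.5 for Turing). A minimal sketch of an architecture check a host program could run before taking the tensor core path, assuming the major and minor values have already been obtained (for example through NVIDIA's `cl_nv_device_attribute_query` OpenCL extension). The function name is hypothetical, and the check only identifies the architecture; it does not verify that a particular board has tensor cores:

```c
#include <stdbool.h>

/* Hypothetical helper (not part of this repository).
   Compute capability 7.x covers Volta (7.0, 7.2) and Turing (7.5). */
bool is_volta_or_turing(unsigned int cc_major, unsigned int cc_minor)
{
    (void)cc_minor; /* the major version alone separates these architectures */
    return cc_major == 7;
}
```

For example, a Tesla V100 (7.0) and a GeForce RTX 2080 (7.5) pass the check, while a Pascal device (6.x) does not.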