This benchmark evaluates the performance of Roller. Install TVM and the CUDA toolkit (version > 12.0) before running it.
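The prerequisites above can be verified before launching the benchmark. The following is a minimal sketch, not part of the repository: the helper names `version_gt`, `cuda_release`, and `check_prereqs` are hypothetical, and it assumes `nvcc` is on `PATH`, that TVM is installed for the `python3` interpreter, and that `sort -V` (GNU coreutils) is available.

```bash
#!/usr/bin/env bash
# Pre-flight check sketch. All function names here are illustrative assumptions.

# Return 0 if version $1 is strictly greater than version $2.
# Relies on `sort -V` for version-aware ordering.
version_gt() {
  [ "$1" != "$2" ] && \
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Print the CUDA release number (e.g. "12.4") parsed from `nvcc --version`.
cuda_release() {
  nvcc --version | sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Check that nvcc exists, CUDA is newer than 12.0, and tvm can be imported.
check_prereqs() {
  command -v nvcc >/dev/null 2>&1 || { echo "nvcc not found" >&2; return 1; }
  version_gt "$(cuda_release)" "12.0" || {
    echo "CUDA toolkit must be newer than 12.0" >&2; return 1; }
  python3 -c "import tvm" 2>/dev/null || {
    echo "tvm is not importable from python3" >&2; return 1; }
}
```

After sourcing these definitions, running `check_prereqs` returns a non-zero status and prints the reason if any requirement is missing. If TVM is installed in a different interpreter or environment, adjust the `python3` call accordingly.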
Collecting data for RTX4090:
cd artifacts/roller
bash bench_roller.sh RTX4090
Collecting data for RTX3090:
cd artifacts/roller
bash bench_roller.sh RTX3090