Commit

try old wmma
LostRuins committed Feb 16, 2025
1 parent 15ae98c commit 505aede
Showing 1 changed file with 6 additions and 5 deletions.
11 changes: 6 additions & 5 deletions ggml/src/ggml-cuda/fattn.cu
@@ -263,10 +263,11 @@ void ggml_cuda_flash_attn_ext(ggml_backend_cuda_context & ctx, ggml_tensor * dst
     }
 
     // The MMA implementation needs Turing or newer, use the old WMMA code for Volta:
-    if (cc == GGML_CUDA_CC_VOLTA) {
-        ggml_cuda_flash_attn_ext_wmma_f16(ctx, dst);
-        return;
-    }
+    ggml_cuda_flash_attn_ext_wmma_f16(ctx, dst);
+    // if (cc == GGML_CUDA_CC_VOLTA) {
+    //     ggml_cuda_flash_attn_ext_wmma_f16(ctx, dst);
+    //     return;
+    // }
 
-    ggml_cuda_flash_attn_ext_mma_f16(ctx, dst);
+    // ggml_cuda_flash_attn_ext_mma_f16(ctx, dst);
 }
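
The behavioural effect of this hunk can be summarised with a small host-only sketch. It is not code from the repository: the `Kernel` enum, the two `pick_kernel_*` helpers, and the numeric compute-capability values are hypothetical stand-ins chosen for illustration, and the sketch models only the tail of the function visible in the hunk, not any earlier dispatch branches.

```cpp
#include <cassert>

// Hypothetical stand-ins for the compute-capability constants.
// The numeric values are assumptions made for this illustration.
constexpr int CC_VOLTA  = 700;
constexpr int CC_TURING = 750;

// Hypothetical labels for the two FP16 flash-attention code paths in the hunk.
enum class Kernel { WMMA_F16, MMA_F16 };

// Tail of the dispatch before the commit: WMMA only on Volta, MMA otherwise.
constexpr Kernel pick_kernel_before(int cc) {
    return cc == CC_VOLTA ? Kernel::WMMA_F16 : Kernel::MMA_F16;
}

// Tail of the dispatch after the commit: WMMA unconditionally,
// so the compute capability no longer influences the choice here.
constexpr Kernel pick_kernel_after(int /*cc*/) {
    return Kernel::WMMA_F16;
}

// Compile-time checks of the two policies.
static_assert(pick_kernel_before(CC_VOLTA)  == Kernel::WMMA_F16, "Volta used WMMA");
static_assert(pick_kernel_before(CC_TURING) == Kernel::MMA_F16,  "Turing used MMA");
static_assert(pick_kernel_after(CC_TURING)  == Kernel::WMMA_F16, "now always WMMA");
```

In other words, any device that reaches this point now takes the WMMA path, while the retained code comment still describes the earlier Volta-only fallback.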
