- Statistics Department of JNU
- Guangzhou, China
- UTC +08:00
- https://github.com/DefTruth
- https://www.zhihu.com/people/qyjdef
Pinned
- lite.ai.toolkit (Public): 🛠 A lite C++ toolkit of 100+ Awesome AI models, support ORT, MNN, NCNN, TNN and TensorRT. 🎉🎉
- vllm-project/vllm (Public): A high-throughput and memory-efficient inference and serving engine for LLMs
- Awesome-LLM-Inference (Public): 📖 A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
- CUDA-Learn-Notes (Public): 📚 150+ Tensor/CUDA Cores Kernels, ⚡️flash-attention-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS 🎉🎉).
- Awesome-Diffusion-Inference (Public): 📖 A curated list of Awesome Diffusion Inference Papers with codes, such as Sampling, Caching, Multi-GPUs, etc. 🎉🎉
- hgemm-tensorcores-mma (Public): ⚡️Write HGEMM from scratch using Tensor Cores with WMMA, MMA PTX and CuTe API (Write for Fun 👀~)
Cuda 36