ParCIS Lab, BUPT
Popular repositories
- FlashSparse Public
FlashSparse significantly reduces the computation redundancy for unstructured sparsity (for SpMM and SDDMM) on Tensor Cores through a Swap-and-Transpose mapping strategy. FlashSparse is accepted by PPoPP 2025.
- DNN-cpp-proxies Public
C++/MPI proxies for distributed training of deep neural networks.
Repositories
- FlashSparse Public
FlashSparse significantly reduces the computation redundancy for unstructured sparsity (for SpMM and SDDMM) on Tensor Cores through a Swap-and-Transpose mapping strategy. FlashSparse is accepted by PPoPP 2025.
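The algebraic identity behind an operand swap can be illustrated in plain NumPy (a hypothetical sketch, not FlashSparse's actual Tensor Core kernels): computing S·D as (Dᵀ·Sᵀ)ᵀ moves the sparse operand to the other MMA position, which on Tensor Cores has a smaller tile granularity, so fewer padded zeros are multiplied per nonzero vector.

```python
import numpy as np

# Hypothetical sketch of the swap-and-transpose identity: S @ D == (D.T @ S.T).T.
# The matrix sizes and sparsity pattern here are illustrative only.
rng = np.random.default_rng(0)
S = rng.random((16, 32)) * (rng.random((16, 32)) < 0.1)  # unstructured sparse operand
D = rng.random((32, 8))                                  # dense operand

direct = S @ D            # straightforward operand mapping
swapped = (D.T @ S.T).T   # swap-and-transpose mapping, same result

assert np.allclose(direct, swapped)
```

The numerical result is identical; the payoff in the real kernels comes from how the swapped layout maps sparse rows onto the hardware MMA tile shapes.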
- Ok-Topk Public
Ok-Topk is a scheme for distributed training with sparse gradients. It integrates a novel sparse allreduce algorithm (less than 6k communication volume, which is asymptotically optimal) with the decentralized parallel Stochastic Gradient Descent (SGD) optimizer, and its convergence is proven theoretically and demonstrated empirically.
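The building block such schemes operate on is Top-k gradient sparsification: each worker keeps only the k largest-magnitude gradient entries and exchanges (index, value) pairs instead of the dense vector. A minimal NumPy sketch (hypothetical helper names, not Ok-Topk's actual allreduce algorithm):

```python
import numpy as np

def topk_sparsify(grad: np.ndarray, k: int):
    """Return indices and values of the k largest-magnitude entries."""
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    return idx, grad[idx]

def densify(idx: np.ndarray, vals: np.ndarray, n: int) -> np.ndarray:
    """Scatter (index, value) pairs back into a dense vector of length n."""
    out = np.zeros(n)
    out[idx] = vals
    return out

# Illustrative gradient: only the 3 largest-magnitude entries survive.
grad = np.array([0.1, -3.0, 0.05, 2.0, -0.2, 1.5])
idx, vals = topk_sparsify(grad, k=3)
restored = densify(idx, vals, grad.size)
```

In a distributed setting, each worker would communicate only its `(idx, vals)` pairs; the challenge Ok-Topk addresses is merging these pairs across workers with asymptotically optimal communication volume.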