PR #20086: [NVIDIA GPU] Fix mem p2p init in collective permute thunk
Imported from GitHub PR #20086

Move pointer initialization to the thunk init stage instead of runtime to get rid of the runtime blocking wait.
Add a device sync point using an NCCL allreduce before doing the memcpy to make sure all GPUs arrive at the same stage. Otherwise, data corruption is possible when the receiving rank hasn't arrived at the memcpy.

Copybara import of the project:

--
ba4ad04 by TJ Xu <tjx@nvidia.com>:

Moved pointer init to thunk init stage and added a sync point before doing memcpy to ensure data consistency across ranks

--
050bc59 by TJ Xu <tjx@nvidia.com>:

Added e2e test for memcpy p2p in a loop

--
1f75328 by TJ Xu <tjx@nvidia.com>:

Added return status for cleanup functions

Merging this change closes #20086

FUTURE_COPYBARA_INTEGRATE_REVIEW=#20086 from Tixxx:tixxx/memcpy_p2p_fix 1f75328
PiperOrigin-RevId: 707074350
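
The ordering problem described in the commit message (a sender must not copy into a peer's buffer before that peer has reached the memcpy stage) can be illustrated with a small host-side analogy. The sketch below is not the XLA thunk code and does not call NCCL: threads stand in for ranks, a hand-written `Barrier` class stands in for the allreduce sync point, and `RunRingPermute` is a hypothetical name chosen for this illustration. The second barrier per iteration is an assumption of the sketch; it models "the copy has completed" and is not taken from the commit.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Reusable barrier; a host-side stand-in for the device sync point.
// (Hypothetical helper, not part of XLA or NCCL.)
class Barrier {
 public:
  explicit Barrier(int count)
      : threshold_(count), waiting_(0), generation_(0) {}

  // Blocks until `count` threads have called Wait() for this generation.
  void Wait() {
    std::unique_lock<std::mutex> lock(mu_);
    const long gen = generation_;
    if (++waiting_ == threshold_) {
      waiting_ = 0;
      ++generation_;
      cv_.notify_all();
    } else {
      cv_.wait(lock, [&] { return generation_ != gen; });
    }
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  const int threshold_;
  int waiting_;
  long generation_;
};

// Simulates a ring permute run in a loop: rank r writes into the receive
// slot of rank (r + 1) % num_ranks. Returns how many times a rank observed
// an unexpected value in its own slot (0 when the ordering is correct).
int RunRingPermute(int num_ranks, int num_iters) {
  std::vector<int> recv(num_ranks, -1);   // one receive slot per rank
  std::vector<int> errors(num_ranks, 0);  // per-rank mismatch counters
  Barrier barrier(num_ranks);
  std::vector<std::thread> ranks;

  for (int rank = 0; rank < num_ranks; ++rank) {
    ranks.emplace_back([&, rank] {
      const int peer = (rank + 1) % num_ranks;
      const int src = (rank + num_ranks - 1) % num_ranks;
      for (int iter = 0; iter < num_iters; ++iter) {
        // Consume what the previous iteration delivered to this rank.
        if (iter > 0 && recv[rank] != (iter - 1) * 100 + src) {
          ++errors[rank];
        }
        // Sync point: every rank has finished consuming and has arrived
        // at the copy stage, so overwriting a peer's slot is now safe.
        barrier.Wait();
        recv[peer] = iter * 100 + rank;  // stands in for the p2p memcpy
        // Assumption of this sketch: wait until all copies have landed
        // before any rank reads its slot in the next iteration.
        barrier.Wait();
      }
    });
  }
  for (auto& t : ranks) t.join();

  int total = 0;
  for (int e : errors) total += e;
  return total;
}
```

Calling `RunRingPermute(4, 3)` runs four simulated ranks for three loop iterations. Without the first `barrier.Wait()`, a fast sender could overwrite a slow receiver's slot before that receiver had read the previous value, which is the host-side counterpart of the corruption the commit guards against.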
1 parent 33f1796, commit 9366198
Showing 4 changed files with 179 additions and 19 deletions.