This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
CUB 1.6.4
Summary
CUB 1.6.4 improves radix sorting performance for SM5x (Maxwell) and SM6x (Pascal) GPUs.
Enhancements
- Radix sort tuning policies updated for SM5x (Maxwell) and SM6x (Pascal): 3.5B and 3.4B 32-bit keys/s on TitanX and GTX 1080, respectively.
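For context, the tuned code path is the one reached through `cub::DeviceRadixSort`. The following is a minimal sketch of sorting 32-bit keys with the usual two-call CUB pattern (query the temporary storage size, then sort). The helper name `sort_keys_on_device` is hypothetical and not part of CUB, error checking is omitted for brevity, and a C++11-capable nvcc is assumed.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper (not part of CUB): sorts 32-bit unsigned keys on the GPU
// and returns the sorted keys on the host. CUDA error codes are ignored here.
std::vector<unsigned int> sort_keys_on_device(const std::vector<unsigned int>& h_keys)
{
    const int num_items = static_cast<int>(h_keys.size());
    std::vector<unsigned int> h_sorted(h_keys.size());
    if (num_items == 0) return h_sorted;

    const size_t bytes = h_keys.size() * sizeof(unsigned int);
    unsigned int* d_keys_in  = nullptr;
    unsigned int* d_keys_out = nullptr;
    cudaMalloc(&d_keys_in, bytes);
    cudaMalloc(&d_keys_out, bytes);
    cudaMemcpy(d_keys_in, h_keys.data(), bytes, cudaMemcpyHostToDevice);

    // First call: d_temp_storage is null, so CUB only writes the required
    // scratch size into temp_storage_bytes and does no sorting.
    void*  d_temp_storage     = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceRadixSort::SortKeys(d_temp_storage, temp_storage_bytes,
                                   d_keys_in, d_keys_out, num_items);
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Second call: performs the sort, writing ascending keys to d_keys_out.
    cub::DeviceRadixSort::SortKeys(d_temp_storage, temp_storage_bytes,
                                   d_keys_in, d_keys_out, num_items);

    // The blocking copy also waits for the sort on the default stream.
    cudaMemcpy(h_sorted.data(), d_keys_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_temp_storage);
    cudaFree(d_keys_out);
    cudaFree(d_keys_in);
    return h_sorted;
}
```

Throughput figures like those quoted above depend on the GPU, the key distribution, and the problem size, so a small call like this one illustrates the API only and is not a benchmark.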
Bug Fixes
- Restore fence work-around for scan (reduce-by-key, etc.) hangs in CUDA 8.5.
- #65: cub::DeviceSegmentedRadixSort should allow inputs to have pointer-to-const type.
- Mollify Clang device-side warnings.
- Remove outdated MSVC project files.
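To illustrate the #65 item, here is a minimal sketch of calling `cub::DeviceSegmentedRadixSort::SortKeys` with the input keys passed through a pointer-to-const. The helper name `segmented_sort_const_input` is hypothetical and not part of CUB. The sketch assumes a CUB version that includes this fix, a C++11-capable nvcc, and an offsets array with one more entry than there are segments. Error checking is omitted.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper (not part of CUB): sorts each segment of h_keys
// independently. Segment i covers [h_offsets[i], h_offsets[i + 1]), so
// h_offsets is assumed to hold num_segments + 1 entries.
std::vector<int> segmented_sort_const_input(const std::vector<int>& h_keys,
                                            const std::vector<int>& h_offsets)
{
    const int num_items    = static_cast<int>(h_keys.size());
    const int num_segments = static_cast<int>(h_offsets.size()) - 1;
    std::vector<int> h_sorted(h_keys.size());
    if (num_items == 0 || num_segments <= 0) return h_sorted;

    int* d_keys_in  = nullptr;
    int* d_keys_out = nullptr;
    int* d_offsets  = nullptr;
    cudaMalloc(&d_keys_in,  num_items * sizeof(int));
    cudaMalloc(&d_keys_out, num_items * sizeof(int));
    cudaMalloc(&d_offsets,  h_offsets.size() * sizeof(int));
    cudaMemcpy(d_keys_in, h_keys.data(), num_items * sizeof(int),
               cudaMemcpyHostToDevice);
    cudaMemcpy(d_offsets, h_offsets.data(), h_offsets.size() * sizeof(int),
               cudaMemcpyHostToDevice);

    // The input is read-only from the sort's point of view, so it is passed
    // through a pointer-to-const; this is the usage the #65 fix concerns.
    const int* d_keys_in_const = d_keys_in;

    // First call sizes the temporary storage, second call sorts.
    void*  d_temp_storage     = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceSegmentedRadixSort::SortKeys(d_temp_storage, temp_storage_bytes,
                                            d_keys_in_const, d_keys_out,
                                            num_items, num_segments,
                                            d_offsets, d_offsets + 1);
    cudaMalloc(&d_temp_storage, temp_storage_bytes);
    cub::DeviceSegmentedRadixSort::SortKeys(d_temp_storage, temp_storage_bytes,
                                            d_keys_in_const, d_keys_out,
                                            num_items, num_segments,
                                            d_offsets, d_offsets + 1);

    cudaMemcpy(h_sorted.data(), d_keys_out, num_items * sizeof(int),
               cudaMemcpyDeviceToHost);

    cudaFree(d_temp_storage);
    cudaFree(d_offsets);
    cudaFree(d_keys_out);
    cudaFree(d_keys_in);
    return h_sorted;
}
```

With keys `{8, 6, 7, 5, 3, 0, 9}` and offsets `{0, 3, 7}`, the two segments are sorted independently, giving `{6, 7, 8, 0, 3, 5, 9}`.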