This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

CUB 1.7.0

@brycelelbach brycelelbach released this 19 May 08:55

Summary

CUB 1.7.0 brings support for CUDA 9.0 and SM7x (Volta) GPUs. It is compatible with independent thread scheduling. It was incorporated into Thrust 1.9.2.

Breaking Changes

  • Remove cub::WarpAll and cub::WarpAny. These functions emulated __all and __any on SM1x devices, which lacked those operations. SM1x devices are now deprecated in CUDA, and the interfaces of both functions lack the lane mask that warp collectives need on SM7x and newer GPUs, which have independent thread scheduling.
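
For code that called the removed functions, one possible migration is to the CUDA 9 warp-vote intrinsics __any_sync and __all_sync, which take the lane mask explicitly. The following is a minimal sketch, not CUB code: the kernel name vote_kernel, the full-warp mask, and the test data are assumptions chosen for illustration, and it assumes every lane of each warp is active when the vote runs.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each warp votes on whether any / all of its lanes
// hold a positive input value.
__global__ void vote_kernel(const int* in, int* any_out, int* all_out)
{
    // Assumption: all 32 lanes of the warp reach this point together.
    const unsigned full_mask = 0xffffffffu;

    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int predicate = in[tid] > 0;

    // The mask names the lanes that take part in the vote.
    int any_positive = __any_sync(full_mask, predicate);
    int all_positive = __all_sync(full_mask, predicate);

    // Lane 0 of each warp records the result, normalized to 0 or 1.
    if ((threadIdx.x & 31) == 0) {
        int warp = tid / 32;
        any_out[warp] = any_positive != 0;
        all_out[warp] = all_positive != 0;
    }
}

int main()
{
    const int n = 64;  // two warps in one block
    int *in, *any_out, *all_out;
    cudaMallocManaged(&in, n * sizeof(int));
    cudaMallocManaged(&any_out, 2 * sizeof(int));
    cudaMallocManaged(&all_out, 2 * sizeof(int));

    // Warp 0: every lane positive. Warp 1: only one lane positive.
    for (int i = 0; i < n; ++i) in[i] = (i < 32 || i == 40) ? 1 : 0;

    vote_kernel<<<1, n>>>(in, any_out, all_out);
    cudaDeviceSynchronize();

    for (int w = 0; w < 2; ++w)
        printf("warp %d: any=%d all=%d\n", w, any_out[w], all_out[w]);

    cudaFree(in);
    cudaFree(any_out);
    cudaFree(all_out);
    return 0;
}
```

With this input, warp 0 should report any=1 all=1 and warp 1 should report any=1 all=0. If only a subset of lanes can reach the vote, the mask must name exactly those lanes rather than the full warp.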

Other Enhancements

  • Remove all assumptions of implicit warp synchronization, for compatibility with the independent thread scheduling of SM7x (Volta) GPUs.
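
To illustrate what avoiding implicit warp synchronization looks like in user code, here is a minimal sketch of a warp-level sum built on __shfl_down_sync, which synchronizes the lanes named in its mask before exchanging data. It is not CUB's implementation: the names warp_sum and sum_kernel are hypothetical, and the sketch assumes a single full warp with all 32 lanes active.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: sums `value` across the 32 lanes of a warp.
// After the loop, lane 0 holds the total.
__device__ int warp_sum(int value)
{
    // Assumption: all 32 lanes call this function together.
    const unsigned full_mask = 0xffffffffu;

    // Each step pulls the value from the lane `offset` positions higher.
    // The _sync variant synchronizes the lanes in full_mask first, so the
    // code does not depend on lanes executing in lockstep.
    for (int offset = 16; offset > 0; offset /= 2)
        value += __shfl_down_sync(full_mask, value, offset);
    return value;
}

__global__ void sum_kernel(int* out)
{
    // Lane i contributes i + 1, so the warp total is 1 + 2 + ... + 32.
    int total = warp_sum(static_cast<int>(threadIdx.x) + 1);
    if (threadIdx.x == 0) *out = total;
}

int main()
{
    int* out;
    cudaMallocManaged(&out, sizeof(int));

    sum_kernel<<<1, 32>>>(out);  // exactly one warp
    cudaDeviceSynchronize();

    printf("warp sum = %d\n", *out);  // 1 + 2 + ... + 32 = 528

    cudaFree(out);
    return 0;
}
```

Code that instead exchanges data between lanes through shared memory would need an explicit __syncwarp() between the write and the read to get the same guarantee.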

Bug Fixes

  • #86: Incorrect results with reduce-by-key.