-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: replace old warp size api #705
base: develop
Are you sure you want to change the base?
Conversation
Still in draft as we're doing an internal review in parallel. @stanleytsang-amd @RobsonRLemos feel to already run this through the tests to speed things along as I don't think much will change. I'll probably open up a pair of PRs to hipCUB and rocThrust to fix the compiler warnings from the deprecations tomorrow. |
8046d96
to
76f60f1
Compare
76f60f1
to
7765574
Compare
While applying the migration to the new API on hipCUB, I've found an unrollable loop which is fixed in 7765574. @stanleytsang-amd @RobsonRLemos Apologies for the delay, there were a few oversights that took a bit longer to resolve. |
…/warp algo types on host
We've found out that rocThrust does not need any other fixes as it contains no warp/block-level algorithms. In hindsight this makes sense 😅 |
{ | ||
storage_.start[offset + flat_tid] = tile_size; | ||
storage_.end[offset + flat_tid] = tile_size; | ||
const unsigned int offset_tid = offset + flat_tid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This basically undoes #672, which was required due to a compiler issue in 6.4. Are we confident the compiler issue is no longer present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good call... @umfranzw might have some more insight on the compiler bug.
I'll revert the change on Monday, but I'll have to also remove the unroll pragma as it otherwise throws warnings.
Required steps to get rid of
__AMDGCN_WAVEFRONT_SIZE
. This PR should be included in ROCm 6.5.::rocprim::arch::wavefront::min_size()
:::rocprim::wavefront::size()
felt strange as it really isn't a top level API (i.e. algorithm)::rocprim::arch::wavefront_size()
was also a possibility, but we might add more wavefront related functions. So scoping it made more sense.::rocprim::arch
was more appropriate so we can add more architecture related constants such as number of LDS banks, etc.min_size()
andmax_size()
are the same on device for now.