-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix duplicate kernel naming in reduce-then-scan kernels #2040
Fix duplicate kernel naming in reduce-then-scan kernels #2040
Conversation
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Wow this is very sneaky, and also affects ALL submitters which accept forwarding references (not exclusive to policies). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I think we need to do an audit of all submitters to look for other problems where we might be compiling more kernels than we intend by accepting forwarding references.
As far as I remember we never fixed these problems by this way. More standard practice - to pass execution policy into submitters. What special things we have in modified submitters for use this approach? |
"To fix this, I have switched the function call operators of the submitters to accept a sycl::queue as this is all that is needed at this level."
|
The problem I mentioned here exists in every single submitter call. I just filed #2041 which uses reduce as an example. I do not think our testing tries to detect this issue. We just got lucky with the |
It is correct. I have not reviewed the other submitters. This patch is to fix a test issue to unblock internal CI. We need to consider how to best fix this issue in the long-term. One alternative would be to just accept |
Thanks, @mmichel11, after offline discussion I understood what happen... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this fix is correct, given that how we deal with the same type of error in other places.
The thing is with the proposed change we create an inconsistency precedent of how we handle kernel name in different places of the code. I don't see a good reason for that.
So, for now I would recommend to add ExecutionPolicy as a part of the name for kernel_generator, that we use for those kernel names.
Do we really want to compile multiple kernels based on what policy reference flavor we pass in to the submitter? |
We have a lot of usages of But I agree that problem really has place but we should have some common solution for all cases, not only for this one. Moreover, probably the usages of |
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
@danhoeflinger I talked with @rarutyun here, and we came to the agreement that we should keep a consistent approach in the immediate-term (add As a follow-up, we need a library wide discussion on what approach should be used to address the problem in the long-term. |
This reverts commit 560480c.
@danhoeflinger, Yes. I agree with the problem. Let's fix it library-wide. Applying fixes here and there without having a oneDPL strategy could potentially hide the problem. So, +1 to all what Matt said. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this fixes the problem (I assume you've checked that, and it does) fill free to merge as soon as CI passes.
Thanks.
OK, no objections from me as long as we have a plan to fix this on the whole. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…iler optimization detection (#2046) Reverts #2040 and instead uses the `__OPTIMIZE__` macro defined by clang-based compilers to detect -O0 compilation and compile reduce-then-scan paths with a sub-group size of 16 to work around hardware bugs on older integrated graphics architectures. This avoids the performance impact of the kernel bundle approach. --------- Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Description
#2031 reverted the change in #1997 which always passed execution policies through reduce-then-scan as const l-value references. After this, we began to see duplicate kernel name compilation errors in
scan_2.pass
.This observed bug is caused by the fact that the execution policy is passed to the submitter function objects as a forwarding reference in the reduce-then-scan submitters. This means that even though the kernel names have been properly generated to be "unique", different cv / ref qualifiers of the execution policy will lead to separate function template instantiations and will be submitting separate kernels as a result. This subtle issue results in duplicate kernel names even if the kernels themselves are logically the same as the same name is used throughout.
To fix this, I have switched the function call operators of the submitters to accept a
sycl::queue
as this is all that is needed at this level.Further Details
Here is a minimal reproducer of the underlying issue to go alongside the explanation: