Fix duplicate kernel naming in reduce-then-scan kernels #2040

mmichel11 · 2025-02-03T17:51:23Z

Description

#2031 reverted the change in #1997 which always passed execution policies through reduce-then-scan as const l-value references. After this, we began to see duplicate kernel name compilation errors in scan_2.pass.

The bug arises because the reduce-then-scan submitters take the execution policy as a forwarding reference in their function call operators. Even though the kernel names are generated to be "unique", each cv / ref qualification of the execution policy produces a separate function template instantiation, and each instantiation submits its own kernel. Because the same kernel name is used in every instantiation, the names collide even though the kernels are logically identical.

To fix this, I have switched the function call operators of the submitters to accept a sycl::queue as this is all that is needed at this level.

Further Details

Here is a minimal reproducer of the underlying issue to go alongside the explanation:

#include <sycl/sycl.hpp>
#include <utility>

template <typename KernelName, typename Dummy>
void foo(Dummy&&, sycl::queue q)
{
    q.submit([](sycl::handler& cgh){
        cgh.single_task<KernelName>([] {
            sycl::ext::oneapi::experimental::printf("hello world\n");
        });
    }).wait();
}

struct dummy
{
};

class kernel;

int main()
{
    sycl::queue q;
    dummy d;
    
    // We will see a compiler error due to duplicate kernel names.
    foo<kernel>(d, q);
    foo<kernel>(std::move(d), q);
}
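
The reproducer above needs a SYCL compiler to show the error. As a host-only sketch of the same mechanism, the snippet below replaces the kernel submission with a tag type (`instantiation`, a name made up for this illustration) so that the template arguments chosen by deduction become visible. It uses only standard C++, and all checks are `static_assert`s, so the translation unit compiling is the result.

```cpp
#include <type_traits>
#include <utility>

struct dummy
{
};

class kernel; // same kernel name type as in the reproducer

// Hypothetical tag type: one distinct type per instantiation of foo.
template <typename KernelName, typename Dummy>
struct instantiation
{
};

// Same signature shape as the reproducer's foo, without the SYCL parts.
// The return type exposes the template arguments picked by deduction.
template <typename KernelName, typename Dummy>
instantiation<KernelName, Dummy> foo(Dummy&&)
{
    return {};
}

using from_lvalue = decltype(foo<kernel>(std::declval<dummy&>()));
using from_rvalue = decltype(foo<kernel>(std::declval<dummy>()));

// Forwarding-reference deduction: an lvalue gives Dummy = dummy&,
// an rvalue gives Dummy = dummy.
static_assert(std::is_same<from_lvalue, instantiation<kernel, dummy&>>::value,
              "lvalue argument deduces dummy&");
static_assert(std::is_same<from_rvalue, instantiation<kernel, dummy>>::value,
              "rvalue argument deduces dummy");

// Two different instantiations that both carry the single name `kernel`.
static_assert(!std::is_same<from_lvalue, from_rvalue>::value,
              "the two calls reach different instantiations");
```

In the SYCL version, each of these instantiations contains its own lambda passed to `single_task<kernel>`, which is why the compiler reports the kernel name as a duplicate.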

Signed-off-by: Matthew Michel <matthew.michel@intel.com>

danhoeflinger · 2025-02-03T18:03:22Z

Wow this is very sneaky, and also affects ALL submitters which accept forwarding references (not exclusive to policies).

danhoeflinger

LGTM, I think we need to do an audit of all submitters to look for other problems where we might be compiling more kernels than we intend by accepting forwarding references.

SergeyKopienko · 2025-02-03T18:39:35Z

As far as I remember, we have never fixed these problems this way. The more standard practice is to pass the execution policy into submitters. What is special about the modified submitters that calls for this approach?

SergeyKopienko · 2025-02-03T18:44:14Z

"To fix this, I have switched the function call operators of the submitters to accept a sycl::queue as this is all that is needed at this level."

I think at this moment your approach is acceptable only in this one submitter. I think in other cases we need to have the execution policy too.

mmichel11 · 2025-02-03T18:45:54Z

As far as I remember, we have never fixed these problems this way. The more standard practice is to pass the execution policy into submitters. What is special about the modified submitters that calls for this approach?

The problem I mentioned here exists in every single submitter call. I just filed #2041 which uses reduce as an example. I do not think our testing tries to detect this issue. We just got lucky with the scan_2.pass failure to realize what is happening internally.

mmichel11 · 2025-02-03T18:54:56Z

"To fix this, I have switched the function call operators of the submitters to accept a sycl::queue as this is all that is needed at this level."

I think at this moment your approach is acceptable only in this one submitter. I think in other cases we need to have the execution policy too.

That is correct. I have not reviewed the other submitters. This patch fixes a test issue to unblock internal CI. We need to consider how best to fix this issue in the long term. One alternative would be to just accept _ExecutionPolicy by value.
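
As a rough illustration of the by-value alternative, the host-only sketch below uses a made-up `policy` struct in place of a real execution policy and a made-up `instantiation` tag in place of a kernel submission. It relies only on standard template argument deduction: for a by-value parameter, references and top-level cv-qualifiers are dropped, so every call reaches the same instantiation.

```cpp
#include <type_traits>
#include <utility>

struct policy // hypothetical stand-in for an execution policy type
{
};

class kernel; // hypothetical kernel name

// Hypothetical tag type: one distinct type per instantiation.
template <typename KernelName, typename Policy>
struct instantiation
{
};

// The policy is taken by value, so Policy is always deduced as the plain
// object type, whatever the value category or constness of the argument.
template <typename KernelName, typename Policy>
instantiation<KernelName, Policy> submit_by_value(Policy)
{
    return {};
}

using by_lvalue = decltype(submit_by_value<kernel>(std::declval<policy&>()));
using by_rvalue = decltype(submit_by_value<kernel>(std::declval<policy>()));
using by_const_lvalue = decltype(submit_by_value<kernel>(std::declval<const policy&>()));

// All three calls share one instantiation, hence one kernel per name.
static_assert(std::is_same<by_lvalue, instantiation<kernel, policy>>::value,
              "lvalue argument deduces policy");
static_assert(std::is_same<by_lvalue, by_rvalue>::value,
              "rvalue argument reaches the same instantiation");
static_assert(std::is_same<by_lvalue, by_const_lvalue>::value,
              "const lvalue argument reaches the same instantiation");
```

The trade-off in this sketch is that the policy object is copied or moved on each call.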

SergeyKopienko · 2025-02-03T18:55:46Z

Thanks, @mmichel11. After our offline discussion I understand what is happening...

rarutyun

I don't think this fix is correct, given how we deal with the same type of error in other places.

With the proposed change we set a precedent for handling kernel names inconsistently in different places of the code. I don't see a good reason for that.

So, for now I would recommend adding ExecutionPolicy as part of the name for kernel_generator, which we use for those kernel names.

danhoeflinger · 2025-02-03T19:27:45Z

I don't think this fix is correct, given how we deal with the same type of error in other places.

With the proposed change we set a precedent for handling kernel names inconsistently in different places of the code. I don't see a good reason for that.

So, for now I would recommend adding ExecutionPolicy as part of the name for kernel_generator, which we use for those kernel names.

Do we really want to compile multiple kernels based on what policy reference flavor we pass in to the submitter?
It seems like this should be one unified call to the submitter, and one kernel compilation. It can be a big overhead otherwise.

SergeyKopienko · 2025-02-03T19:29:19Z

I don't think this fix is correct, given how we deal with the same type of error in other places.
With the proposed change we set a precedent for handling kernel names inconsistently in different places of the code. I don't see a good reason for that.
So, for now I would recommend adding ExecutionPolicy as part of the name for kernel_generator, which we use for those kernel names.

Do we really want to compile multiple kernels based on what policy reference type we pass in to the submitter? It seems like this should be one unified call to the submitter, and one kernel compilation. It can be a big overhead otherwise.

We have a lot of usages of __kernel_name_generator with _ExecutionPolicy in the template parameters.
So it looks like the common solution for now.

But I agree that the problem really exists, and we should have a common solution for all cases, not only for this one.

Moreover, I think the usages of __kernel_name_generator without _ExecutionPolicy in the template parameters are probably more error-prone. They work only for the use cases we actually exercise.

Signed-off-by: Matthew Michel <matthew.michel@intel.com>

mmichel11 · 2025-02-03T19:54:25Z

@danhoeflinger I talked with @rarutyun here, and we came to the agreement that we should keep a consistent approach in the immediate-term (add _ExecutionPolicy to the kernel name similar to what is done in radix sort). It still has this limitation as you noted, but it is the same for every other oneDPL algorithm.

As a follow-up, we need a library-wide discussion on what approach should be used to address the problem in the long term.
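
A host-only sketch of this short-term approach, with hypothetical names throughout (`kernel_name` below is an illustrative stand-in, not oneDPL's `__kernel_name_generator`, and `policy` is a made-up policy type). It assumes the policy type is used in the name exactly as deduced: each cv / ref variant then gets its own name, so the names no longer clash, but each variant is still a separate kernel.

```cpp
#include <type_traits>
#include <utility>

struct policy // hypothetical stand-in for an execution policy type
{
};

class scan_kernel; // hypothetical base kernel name

// Hypothetical name wrapper: the extra template arguments become part of
// the kernel name type.
template <typename BaseName, typename... Extra>
struct kernel_name
{
};

// Submitter-like function that takes the policy as a forwarding reference
// and returns the name type this instantiation would use for its kernel.
template <typename BaseName, typename Policy>
kernel_name<BaseName, Policy> submit(Policy&&)
{
    return {};
}

using name_from_lvalue = decltype(submit<scan_kernel>(std::declval<policy&>()));
using name_from_rvalue = decltype(submit<scan_kernel>(std::declval<policy>()));

// The deduced policy type is baked into each name ...
static_assert(std::is_same<name_from_lvalue, kernel_name<scan_kernel, policy&>>::value,
              "lvalue call is named with policy&");
static_assert(std::is_same<name_from_rvalue, kernel_name<scan_kernel, policy>>::value,
              "rvalue call is named with policy");

// ... so the names are unique, at the cost of one kernel per variant.
static_assert(!std::is_same<name_from_lvalue, name_from_rvalue>::value,
              "different qualifiers give different kernel names");
```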

This reverts commit 560480c.

rarutyun · 2025-02-03T20:04:00Z

@danhoeflinger I talked with @rarutyun here, and we came to the agreement that we should keep a consistent approach in the immediate-term (add _ExecutionPolicy to the kernel name similar to what is done in radix sort). It still has this limitation as you noted, but it is the same for every other oneDPL algorithm.

As a follow-up, we need a library-wide discussion on what approach should be used to address the problem in the long term.

@danhoeflinger, yes, I agree with the problem. Let's fix it library-wide. Applying fixes here and there without having a oneDPL strategy could potentially hide the problem. So, +1 to everything Matt said.

rarutyun

If this fixes the problem (I assume you've checked that, and it does), feel free to merge as soon as CI passes.

Thanks.

danhoeflinger · 2025-02-03T20:07:39Z

OK, no objections from me as long as we have a plan to fix this on the whole.

SergeyKopienko

LGTM

…iler optimization detection (#2046) Reverts #2040 and instead uses the `__OPTIMIZE__` macro defined by clang-based compilers to detect -O0 compilation and compile reduce-then-scan paths with a sub-group size of 16 to work around hardware bugs on older integrated graphics architectures. This avoids the performance impact of the kernel bundle approach. --------- Signed-off-by: Matthew Michel <matthew.michel@intel.com>

Add fix for duplicate kernel names due to policy cv-ref

560480c

Signed-off-by: Matthew Michel <matthew.michel@intel.com>

mmichel11 requested review from akukanov, dmitriy-sobolev, SergeyKopienko, danhoeflinger and adamfidel February 3, 2025 17:51

danhoeflinger previously approved these changes Feb 3, 2025

rarutyun requested changes Feb 3, 2025

Revert last commit and add _ExecutionPolicy to the kernel name

f22d006

Signed-off-by: Matthew Michel <matthew.michel@intel.com>

mmichel11 dismissed danhoeflinger’s stale review via f22d006 February 3, 2025 19:51

Revert "Add fix for duplicate kernel names due to policy cv-ref"

f089a90

This reverts commit 560480c.

rarutyun approved these changes Feb 3, 2025

mmichel11 merged commit a045eac into main Feb 3, 2025
22 checks passed

mmichel11 deleted the dev/mmichel11/fix_duplicate_kernel_names_scan_2 branch February 3, 2025 20:28

SergeyKopienko reviewed Feb 4, 2025
