Workaround for CUDA 12.6 tuple_size issue #3773. #3785

asmorkalov · 2024-09-03T10:52:28Z

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV.
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is an accuracy test, a performance test and test data in the opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

asmorkalov · 2024-09-03T10:58:03Z

cc @cudawarped @chacha21. I want to drop gridTransformTuple and some of the hacks built on top of tuple, because they are hard to maintain across different CUDA versions. What do you think about this solution and further cleanup?

cudawarped · 2024-09-03T13:13:36Z

@asmorkalov Unfortunately, it removes the fusion, as the source has to be read from global memory twice, but I can't think of a less involved fix. The alternative, adding two ops to gridTransformUnary as you did with gridTransformBinary, is probably overkill: we should probably be moving away from cudev and towards Thrust for these operations if the CUDA namespace has a future.
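
The cost of losing the fusion can be illustrated with a minimal host-side C++ sketch. It is an analogy only, not the cudev code touched by this PR: the function names `cartToPolarFused` and `cartToPolarSplit` are hypothetical, the returned load count merely stands in for global memory reads, and the angle is left as the raw `std::atan2` result rather than the range OpenCV reports.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Fused variant (hypothetical): a single pass loads each x[i] and y[i] once
// and produces both outputs, similar in spirit to a tuple-returning transform.
// Assumes x and y have the same length. Returns the number of input loads.
std::size_t cartToPolarFused(const std::vector<float>& x, const std::vector<float>& y,
                             std::vector<float>& mag, std::vector<float>& angle)
{
    std::size_t loads = 0;
    mag.resize(x.size());
    angle.resize(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const float xi = x[i];
        const float yi = y[i];
        loads += 2;                      // one load per input element
        mag[i] = std::hypot(xi, yi);
        angle[i] = std::atan2(yi, xi);   // radians in (-pi, pi], not OpenCV's range
    }
    return loads;
}

// Split variant (hypothetical): two independent passes, one per output,
// so every input element is loaded twice.
std::size_t cartToPolarSplit(const std::vector<float>& x, const std::vector<float>& y,
                             std::vector<float>& mag, std::vector<float>& angle)
{
    std::size_t loads = 0;
    mag.resize(x.size());
    angle.resize(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {   // pass 1: magnitude only
        mag[i] = std::hypot(x[i], y[i]);
        loads += 2;
    }
    for (std::size_t i = 0; i < x.size(); ++i) {   // pass 2: angle only
        angle[i] = std::atan2(y[i], x[i]);
        loads += 2;
    }
    return loads;
}
```

Both variants produce the same outputs; the split version simply performs twice as many input loads, which on a GPU translates into extra global memory traffic for a memory-bound operation.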

asmorkalov · 2024-09-04T06:20:09Z

@cudawarped Thanks for the opinion.

chacha21 · 2024-09-04T20:28:13Z

Another solution would be to drop the cudev/functional abstraction for polarToCart/cartToPolar and use dedicated CUDA kernels to handle the different cases.

Workaround for CUDA 12.6 tuple_size issue opencv#3773.

09eb618

asmorkalov added the category: cuda label Sep 3, 2024

asmorkalov merged commit 0377a6a into opencv:4.x Sep 9, 2024
11 checks passed

asmorkalov mentioned this pull request Sep 10, 2024

5.x merge 4.x #3793

Merged
