Workaround for CUDA 12.6 tuple_size issue #3773. #3785

Merged
merged 1 commit into from
Sep 9, 2024

Conversation

asmorkalov
Contributor

Fixes #3773

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV.
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is an accuracy test, a performance test and test data in the opencv_extra repository, if applicable.
    The patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@asmorkalov
Contributor Author

cc @cudawarped @chacha21. I want to drop gridTransformTuple and some hacky things built on top of tuple, because they are hard to maintain across different CUDA versions. What do you think about the solution and further cleanup?

@cudawarped
Contributor

@asmorkalov Unfortunately it removes the fusion, as the source has to be read from global memory twice, but I can't think of a less involved fix. The alternative of adding two ops to gridTransformUnary, as you did with gridTransformBinary, is probably overkill: we should probably be moving away from cudev and towards Thrust for these operations if the CUDA namespace has a future.
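To make the fusion trade-off above concrete, here is a minimal CPU-side sketch in plain C++. It is only an analogy and not the cudev code from this PR: the function names `magnitudePass`, `anglePass` and `cartToPolarFused` are hypothetical, and the loops stand in for what would be GPU kernels. The unfused variant walks the inputs twice (one pass per output), while the fused variant reads each input element once and writes both outputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only; none of these are OpenCV/cudev API.

// Unfused, pass 1: reads x and y, writes only the magnitude.
void magnitudePass(const std::vector<float>& x, const std::vector<float>& y,
                   std::vector<float>& mag)
{
    assert(x.size() == y.size() && mag.size() == x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        mag[i] = std::hypot(x[i], y[i]);
}

// Unfused, pass 2: reads x and y a second time, writes only the angle (radians).
void anglePass(const std::vector<float>& x, const std::vector<float>& y,
               std::vector<float>& angle)
{
    assert(x.size() == y.size() && angle.size() == x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        angle[i] = std::atan2(y[i], x[i]);
}

// Fused: a single pass loads each source element once and produces both outputs.
void cartToPolarFused(const std::vector<float>& x, const std::vector<float>& y,
                      std::vector<float>& mag, std::vector<float>& angle)
{
    assert(x.size() == y.size() && mag.size() == x.size() && angle.size() == x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const float xi = x[i];  // single load of each source value
        const float yi = y[i];
        mag[i] = std::hypot(xi, yi);
        angle[i] = std::atan2(yi, xi);
    }
}
```

Both variants produce the same results; the difference is only in how many times the source is traversed, which on a GPU corresponds to the extra global memory reads mentioned above.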

@asmorkalov
Contributor Author

@cudawarped Thanks for the opinion.

@chacha21
Contributor

chacha21 commented Sep 4, 2024

Another solution would be to drop the cudev/functional abstraction for polarToCart/cartToPolar and use dedicated CUDA kernels to handle the different cases.

@asmorkalov asmorkalov merged commit 0377a6a into opencv:4.x Sep 9, 2024
11 checks passed
@asmorkalov asmorkalov mentioned this pull request Sep 10, 2024
Development

Successfully merging this pull request may close these issues.

CUDA 12.6 build errors
3 participants