Not working on AMD GPUs for bigger board sizes #12

Open
olepoeschl opened this issue May 23, 2023 · 3 comments
Labels: bug (Something isn't working), impl (This issue is related to the impl module.)

Comments

olepoeschl (Owner) commented May 23, 2023

Probably runs out of some resource, maybe private memory or local memory.
Things to try:

  • check whether this happens on other AMD GPUs too (it does)
  • batch the device workload into multiple smaller workloads and experiment with different workload sizes (didn't fix it)
  • reduce private memory usage (didn't fix it)
  • experiment with different workgroup sizes (didn't fix it)
  • disable compiler optimizations (didn't fix it; it crashes much earlier with -O0)
  • there are caught but ignored InterruptedExceptions in GPUSolver; print them, maybe they contain useful information
  • if all else fails: try to write a significantly different kernel for AMD devices
    OR
  • convert the project to CUDA
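
For reference, the batching idea from the list above could look roughly like this on the host side. This is only a sketch: the class name `WorkloadBatching`, the method `split`, and the idea of launching one kernel per (offset, size) pair are assumptions for illustration, not the code in GPUSolver.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the project: splits one large device
// workload into smaller slices that could be enqueued one after another.
final class WorkloadBatching {

    // Returns {offset, size} pairs that cover [0, totalItems) without gaps
    // or overlap. Every batch has at most maxBatchSize items.
    static List<long[]> split(long totalItems, long maxBatchSize) {
        if (totalItems < 0 || maxBatchSize <= 0) {
            throw new IllegalArgumentException("totalItems must be >= 0 and maxBatchSize > 0");
        }
        List<long[]> batches = new ArrayList<>();
        long offset = 0;
        while (offset < totalItems) {
            long size = Math.min(maxBatchSize, totalItems - offset);
            batches.add(new long[] {offset, size});
            offset += size;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Assumed numbers, only to show the shape of the result.
        for (long[] batch : split(10, 4)) {
            // In a real solver, each pair would become one kernel launch
            // (global work offset = batch[0], global work size = batch[1]).
            System.out.println("offset=" + batch[0] + " size=" + batch[1]);
        }
    }
}
```

Trying different values for `maxBatchSize` corresponds to "experiment with different workload sizes" above. As noted, this did not fix the problem here.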
olepoeschl added the bug label May 23, 2023
olepoeschl self-assigned this May 23, 2023
olepoeschl changed the title from "Not working on RX 6650 XT for N>21" to "Not working on AMD GPUs for bigger board sizes" May 31, 2023
olepoeschl linked a pull request May 31, 2023 that will close this issue
olepoeschl reopened this Jun 1, 2023
olepoeschl removed a link to a pull request Jun 1, 2023
olepoeschl added a commit that referenced this issue Jun 21, 2023
…and global at the end after writing the result

refs #12
olepoeschl (Owner, Author) commented Jun 21, 2023

FIXED:

  • added a local memory fence in each iteration of the main loop
  • added a global memory fence at the end of each kernel so the variable's value is flushed to the result array in global memory
  • added a clWaitForEvents call for each device to ensure the device has finished before the host code continues
  • the kernel now also uses smaller data types, but we are not sure whether that contributed to the fix

See e39d6e5 and the commits preceding it.
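
To show where the two fences described above would sit, here is a minimal sketch with a made-up kernel held as a Java string. The kernel name `count_sketch`, its arguments, and the loop body are assumptions for illustration only. It also assumes the fences are `mem_fence` calls; the actual commit may use `barrier` or something else. The host-side `clWaitForEvents` call is left out because its Java signature depends on the OpenCL binding in use.

```java
// Hypothetical sketch, not the project's kernel: it only shows the placement
// of the local and global memory fences inside an OpenCL C kernel.
final class FenceKernelSketch {

    static final String KERNEL_SOURCE = String.join("\n",
        "__kernel void count_sketch(__global const int *input,",
        "                           __global uint *result,",
        "                           const int iterations) {",
        "    const size_t gid = get_global_id(0);",
        "    uint count = 0;",
        "    for (int i = 0; i < iterations; i++) {",
        "        count += (uint)((input[gid] >> (i & 31)) & 1);",
        "        // local memory fence in every iteration of the main loop",
        "        mem_fence(CLK_LOCAL_MEM_FENCE);",
        "    }",
        "    result[gid] = count;",
        "    // global memory fence after writing the result",
        "    mem_fence(CLK_GLOBAL_MEM_FENCE);",
        "}");

    public static void main(String[] args) {
        // Prints the kernel text; compiling it requires an OpenCL binding.
        System.out.println(KERNEL_SOURCE);
    }
}
```

On the host, the matching step would be to wait on the event of each device's kernel launch before reading the result buffer.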

olepoeschl added this to the 2.0.0 milestone Jun 21, 2023
olepoeschl reopened this Jun 21, 2023
olepoeschl (Owner, Author) commented Jun 21, 2023

Not solved. The changes only moved the limit from N=22 to N=24 on the RX 6650 XT.


olepoeschl modified the milestones: 2.0.0, 2.0.1 Aug 16, 2023
olepoeschl removed this from the 2.0.1 milestone Sep 30, 2023
olepoeschl added the impl label Aug 7, 2024