
Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check #24

Open
ckhung opened this issue Dec 8, 2023 · 2 comments


ckhung commented Dec 8, 2023

Hi, my GPU is a Sapphire Nitro+ Radeon RX 580 and the host OS is Linux Mint Debian Edition 5 ("Elsie"). I am using this command:
docker run -it --network=host --device=/dev/kfd --device=/dev/dri --group-add=video --ipc=host --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v ~/work:/work --name stb-dif l1naforever/stable-diffusion-rocm:latest
and it produces this error message:

Python 3.7.13 (default, Mar 29 2022, 02:18:16) 
[GCC 7.5.0]
Commit hash: 08b3f7aef15f74f4d2254b1274dd66fcc7940348
Traceback (most recent call last):
  File "launch.py", line 168, in <module>
    prepare_enviroment()
  File "launch.py", line 121, in prepare_enviroment
    run_python("import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'")
  File "launch.py", line 56, in run_python
    return run(f'"{python}" -c "{code}"', desc, errdesc)
  File "launch.py", line 32, in run
    raise RuntimeError(message)
RuntimeError: Error running command.
Command: "/opt/conda/bin/python" -c "import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'"
Error code: 134
stdout: <empty>
stderr: "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
Aborted (core dumped)

That's strange. I thought PyTorch in this container image is supposed to look for ROCm, not NVIDIA's CUDA, right? Thanks in advance for your help.
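
For context on the question above: ROCm builds of PyTorch expose the GPU through the same `torch.cuda` API, so a `torch.cuda.is_available()` check does not by itself mean the build targets NVIDIA. The `hipErrorNoBinaryForGpu` in stderr comes from the HIP runtime, which suggests a ROCm build is in use and that it may not ship kernels for this GPU's architecture. A minimal sketch for confirming which backend the installed wheel targets, assuming it exposes `torch.version.hip` and `torch.version.cuda`; the helper name `describe_torch_build` is hypothetical. It only reads build metadata and deliberately avoids calling `torch.cuda.is_available()`, since that call is what aborts here:

```python
import importlib.util


def describe_torch_build():
    """Report which GPU backend the installed PyTorch wheel was built for.

    Reads build metadata only; it does not initialise the GPU runtime.
    """
    # Return a placeholder result if PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "hip": None, "cuda": None}

    import torch

    return {
        "installed": True,
        # Assumption: a version string on ROCm builds, None (or absent) otherwise.
        "hip": getattr(torch.version, "hip", None),
        # Assumption: a version string on CUDA builds, None otherwise.
        "cuda": getattr(torch.version, "cuda", None),
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

If `hip` is set and `cuda` is `None`, the wheel is a ROCm build, and the failure is more likely about GPU architecture support than about PyTorch looking for CUDA.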

@FlorianHeigl

@ckhung I had the same issue. It seems it could be a Python version issue: I see Python 3.7 in your output.
See this comment in particular:
AUTOMATIC1111/stable-diffusion-webui#13824 (comment)

I'm trying to version-lock the pip modules in the Dockerfile and rebuild.

@dylanmilesmsu

I have the same issue
