Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion failure during undefined variable error reporting #57057

Open
maleadt opened this issue Jan 15, 2025 · 3 comments
Open

Assertion failure during undefined variable error reporting #57057

maleadt opened this issue Jan 15, 2025 · 3 comments
Labels
regression Regression in behavior compared to a previous version
Milestone

Comments

@maleadt
Copy link
Member

maleadt commented Jan 15, 2025

As observed on PkgEval during CUDSS testing: https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_date/2025-01/14/CUDSS.primary.log

This is a package that depends on CUDA, which isn't available on PkgEval, so the libcuda global variable as used by the Clang.jl-generated ccalls generates an undefined variable exception. However, that now triggers an assertion:

julia: /source/src/gc-stock.c:721: jl_gc_small_alloc_inner: Assertion `__extension__ ({ __auto_type __atomic_load_ptr = (&ptls->gc_state); __typeof__ (*__atomic_load_ptr) __atomic_load_tmp; __atomic_load (__atomic_load_ptr, &__atomic_load_tmp, (memory_order_relaxed)); __atomic_load_tmp; }) == 0' failed.

[193] signal 6 (-6): Aborted
in expression starting at /home/pkgeval/.julia/packages/CUDSS/L1Cat/test/runtests.jl:15
unknown function (ip: 0x78a467d5cebc) at /lib/x86_64-linux-gnu/libc.so.6
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x78a467cf8394) at /lib/x86_64-linux-gnu/libc.so.6
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
jl_gc_small_alloc_inner at /source/src/gc-stock.c:721
jl_gc_small_alloc_noinline at /source/src/gc-stock.c:783 [inlined]
jl_gc_alloc_ at /source/src/gc-stock.c:797
ijl_new_struct at /source/src/datatype.c:1558
ijl_undefined_var_error at /source/src/rtutils.c:152
macro expansion at /home/pkgeval/.julia/packages/CUDA/L1qZp/lib/utils/call.jl:214 [inlined]
macro expansion at /home/pkgeval/.julia/packages/CUDSS/L1Cat/src/libcudss.jl:229 [inlined]
#cudssGetProperty##0 at /home/pkgeval/.julia/packages/CUDA/L1qZp/lib/utils/call.jl:35 [inlined]

Introduced somewhere in 4250be8...71bfbb3, maybe as part of the binding changes (cc @Keno)?

@maleadt maleadt added the regression Regression in behavior compared to a previous version label Jan 15, 2025
@maleadt maleadt added this to the 1.12 milestone Jan 15, 2025
@vtjnash
Copy link
Member

vtjnash commented Jan 17, 2025

This is a CUDA bug, since @ccall(jl_gc_safe_enter()::Int8) is UB to exist inside a julia program (not a legal function to call ever)

https://github.com/JuliaGPU/CUDA.jl/blob/d07a24572814efef6691005bd33ae7fd5f978f49/lib/utils/call.jl#L214

@vtjnash vtjnash closed this as completed Jan 17, 2025
@vtjnash vtjnash closed this as not planned Won't fix, can't repro, duplicate, stale Jan 17, 2025
@vtjnash vtjnash reopened this Jan 17, 2025
@maleadt
Copy link
Member Author

maleadt commented Jan 17, 2025

@ccall(jl_gc_safe_enter()::Int8) is UB to exist inside a julia program (not a legal function to call ever)

Well, we need it or we deadlock. Could you provide an alternative then, given that we don't have #49933?

cc @vchuravy, x-ref JuliaGPU/CUDA.jl#2262

@vtjnash
Copy link
Member

vtjnash commented Jan 17, 2025

There is no way to do this that isn't UB, other than to finish implementing #49933

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
regression Regression in behavior compared to a previous version
Projects
None yet
Development

No branches or pull requests

2 participants