Segfault with Polyester.jl #2213

Closed
kiranshila opened this issue Dec 20, 2024 · 3 comments

kiranshila commented Dec 20, 2024

Seemingly a different issue from #2208

I tried to reduce this example a bit (apologies for the missing math/context), but this should reproduce the error:

using Polyester, StatsBase, Enzyme, StaticArrays

@inline function lossless_abcd(z, βl)
    sbl, cbl = sincos(βl)
    @SMatrix [cbl z*sbl*im; (1/z)*sbl*im cbl]
end

function cascade(zs, βl)
    acc = lossless_abcd(first(zs), βl)
    for z in zs
        acc *= lossless_abcd(z, βl)
    end
    acc
end

function cascade!(A, zs, L, freqs)
    δ = L / length(freqs)
    # Solve each frequency point in parallel
    @batch for i in eachindex(freqs, A)
        βl = freqs[i]
        A[i] = cascade(zs, βl)
    end
    A
end

function objective!(A, L, zs, freqs)
    cascade!(A, zs, L, freqs)
    # Generate some scalar for the MWE
    sum(abs2.(sum.(A)))
end

function d_objective!(dzs, zs, A, shadow_A, L, freqs)
    Enzyme.make_zero!(shadow_A)
    Enzyme.make_zero!(dzs)
    Enzyme.autodiff(Enzyme.Reverse, objective!, Active,
        Duplicated(A, shadow_A),
        Const(L),
        Duplicated(zs, dzs),
        Const(freqs))
    dzs
end

freqs = ones(100)
zs = ones(100)
dzs = similar(zs)
A = Array{SMatrix{2,2,ComplexF64,4}}(undef, length(freqs))
A_shadow = similar(A)

d_objective!(dzs, zs, A, A_shadow, 1.0, freqs)

The error returned is

[1229359] signal 11 (1): Segmentation fault
in expression starting at /path/to/file.jl:62
_ZN4llvm21SymbolTableListTraitsINS_14GlobalVariableEE18removeNodeFromListEPS1_ at /home/kshila/.julia/juliaup/julia-1.11.2+0.x64.linux.gnu/bin/../lib/julia/libLLVM-16jl.so (unknown line)
unknown function (ip: 0x16ac14f)
Allocations: 23260080 (Pool: 23259464; Big: 616); GC: 19

Things seem to work correctly if I remove that @batch call in cascade!.
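
For comparison, here is a minimal sketch of the serial variant described above, i.e. the same loop with the @batch macro removed. The names `cascade_serial!` and `objective_serial!` are hypothetical (they are not in the original code), the unused `L` argument is dropped for brevity, and the sketch only runs the primal computation. It does not call Enzyme, so it illustrates the shape of the workaround rather than confirming the gradient.

```julia
using StaticArrays

# Same helper as in the report: ABCD matrix of a lossless line section.
@inline function lossless_abcd(z, βl)
    sbl, cbl = sincos(βl)
    @SMatrix [cbl z*sbl*im; (1/z)*sbl*im cbl]
end

# Same helper as in the report: multiply the section matrices together.
function cascade(zs, βl)
    acc = lossless_abcd(first(zs), βl)
    for z in zs
        acc *= lossless_abcd(z, βl)
    end
    acc
end

# Hypothetical name: cascade! with a plain `for` loop instead of `@batch for`.
function cascade_serial!(A, zs, freqs)
    for i in eachindex(freqs, A)
        A[i] = cascade(zs, freqs[i])
    end
    A
end

# Hypothetical name: the scalar objective from the report, on the serial loop.
function objective_serial!(A, zs, freqs)
    cascade_serial!(A, zs, freqs)
    sum(abs2.(sum.(A)))
end

freqs = ones(100)
zs = ones(100)
A = Array{SMatrix{2,2,ComplexF64,4}}(undef, length(freqs))
val = objective_serial!(A, zs, freqs)
```

Passing `objective_serial!` to `Enzyme.autodiff` in place of `objective!` (with the `Const(L)` argument removed) would be the natural next step for checking whether the crash is specific to the @batch path.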

Versions, etc

Julia Version 1.11.2
Commit 5e9a32e7af2 (2024-12-01 20:02 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 24 × Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, skylake-avx512)
Threads: 24 default, 0 interactive, 12 GC (on 24 virtual cores)
Environment:
  JULIA_EDITOR = code

  [7da242da] Enzyme v0.13.24
  [f517fe37] Polyester v0.7.16

kiranshila (Author) commented

Or perhaps this is the same issue? Here's a trace running in Julia 1.10

ERROR: LoadError: Current scope: 
; Function Attrs: mustprogress willreturn
define internal fastcc void @preprocess_julia_cascade__596({} addrspace(10)* noundef nonnull align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="123247878569488" "enzymejl_parmtype_ref"="2" %0, {} addrspace(10)* noundef nonnull align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="123247705933040" "enzymejl_parmtype_ref"="2" %1, {} addrspace(10)* noundef nonnull align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, 
[-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="123247705933040" "enzymejl_parmtype_ref"="2" %2) unnamed_addr #93 !dbg !3590 {
top:
  %3 = call noalias nonnull dereferenceable(48) dereferenceable_or_null(48) i8* @malloc(i64 48), !enzyme_fromstack !142
  %4 = bitcast i8* %3 to { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }*, !enzyme_caststack !0
  %.sub = bitcast { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4 to i8*
  %5 = call noalias nonnull dereferenceable(64) dereferenceable_or_null(64) i8* @malloc(i64 64), !enzymejl_allocart !2322, !enzyme_type !329, !enzyme_fromstack !89
  %6 = bitcast i8* %5 to [1 x [4 x [2 x double]]]*, !enzyme_caststack !0
  %7 = call noalias nonnull dereferenceable(16) dereferenceable_or_null(16) i8* @malloc(i64 16), !enzyme_fromstack !89
  %newstruct71 = bitcast i8* %7 to { i64, [1 x i64] }*, !enzyme_caststack !0
  %8 = call noalias nonnull dereferenceable(64) dereferenceable_or_null(64) i8* @malloc(i64 64), !enzymejl_allocart !2322, !enzyme_type !329, !enzyme_fromstack !89
  %9 = bitcast i8* %8 to [1 x [4 x [2 x double]]]*, !enzyme_caststack !0
  %10 = call noalias nonnull dereferenceable(64) dereferenceable_or_null(64) i8* @malloc(i64 64), !enzymejl_allocart !2322, !enzyme_type !329, !enzyme_fromstack !89
  %11 = bitcast i8* %10 to [1 x [4 x [2 x double]]]*, !enzyme_caststack !0
  %12 = call {}*** @julia.get_pgcstack() #94
  %current_task1156 = getelementptr inbounds {}**, {}*** %12, i64 -14
  %current_task1 = bitcast {}*** %current_task1156 to {}**
  %ptls_field157 = getelementptr inbounds {}**, {}*** %12, i64 2
  %13 = bitcast {}*** %ptls_field157 to i64***
  %ptls_load158159 = load i64**, i64*** %13, align 8, !tbaa !66
  %14 = getelementptr inbounds i64*, i64** %ptls_load158159, i64 2
  %safepoint = load i64*, i64** %14, align 8, !tbaa !70
  fence syncscope("singlethread") seq_cst
  call void @julia.safepoint(i64* %safepoint) #94, !dbg !3591
  fence syncscope("singlethread") seq_cst
  %15 = call i64 @julia_nthreads_790() #95, !dbg !3592
  %.not = icmp eq i64 %15, 1, !dbg !3594
  br i1 %.not, label %L4, label %L51, !dbg !3595

L4:                                               ; preds = %top
  %16 = addrspacecast {} addrspace(10)* %2 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3596
  %arraylen_ptr = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %16, i64 0, i32 1, !dbg !3596
  %arraylen = load i64, i64 addrspace(11)* %arraylen_ptr, align 8, !dbg !3596, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %17 = addrspacecast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3603
  %arraylen_ptr2 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %17, i64 0, i32 1, !dbg !3603
  %arraylen3 = load i64, i64 addrspace(11)* %arraylen_ptr2, align 8, !dbg !3603, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %.not160 = icmp eq i64 %arraylen, %arraylen3, !dbg !3610
  br i1 %.not160, label %L25, label %L14, !dbg !3609

L14:                                              ; preds = %L4
  %box = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247752959712 to {}*) to {} addrspace(10)*)) #96, !dbg !3609
  %memcpy_refined_dst = bitcast {} addrspace(10)* %box to i64 addrspace(10)*, !dbg !3609, !enzyme_inactive !0
  store i64 %arraylen, i64 addrspace(10)* %memcpy_refined_dst, align 8, !dbg !3609, !tbaa !230, !alias.scope !81, !noalias !3612
  %box31 = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247752959712 to {}*) to {} addrspace(10)*)) #96, !dbg !3609
  %memcpy_refined_dst33 = bitcast {} addrspace(10)* %box31 to i64 addrspace(10)*, !dbg !3609, !enzyme_inactive !0
  store i64 %arraylen3, i64 addrspace(10)* %memcpy_refined_dst33, align 8, !dbg !3609, !tbaa !230, !alias.scope !81, !noalias !3612
  %18 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* noundef nonnull @ijl_invoke, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247893132112 to {}*) to {} addrspace(10)*), {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247774899088 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 123247744295344 to {}*) to {} addrspace(10)*), {} addrspace(10)* nofree nonnull %box, {} addrspace(10)* nofree nonnull %box31) #95, !dbg !3609
  unreachable, !dbg !3609

L25:                                              ; preds = %L4
  %.not161 = icmp eq i64 %arraylen, 0, !dbg !3615
  br i1 %.not161, label %L392, label %L34.preheader, !dbg !3602

L34.preheader:                                    ; preds = %L25
  %19 = addrspacecast {} addrspace(10)* %2 to double addrspace(13)* addrspace(11)*
  %20 = addrspacecast {} addrspace(10)* %0 to [1 x [4 x [2 x double]]] addrspace(13)* addrspace(11)*
  %21 = bitcast [1 x [4 x [2 x double]]]* %9 to i8*
  br label %L34, !dbg !3619

L34:                                              ; preds = %idxend16, %L34.preheader
  %iv12 = phi i64 [ %iv.next13, %idxend16 ], [ 0, %L34.preheader ]
  %iv.next13 = add nuw nsw i64 %iv12, 1, !dbg !3619
  %22 = add nsw i64 %iv.next13, -1, !dbg !3619
  %arraylen10 = load i64, i64 addrspace(11)* %arraylen_ptr, align 8, !dbg !3619, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %inbounds = icmp ult i64 %22, %arraylen10, !dbg !3619
  br i1 %inbounds, label %idxend, label %oob, !dbg !3619

L51:                                              ; preds = %top
  %23 = addrspacecast {} addrspace(10)* %2 to {} addrspace(11)*, !dbg !3621
  %24 = addrspacecast {} addrspace(10)* %2 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3621
  %arraylen_ptr34 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %24, i64 0, i32 1, !dbg !3621
  %arraylen35 = load i64, i64 addrspace(11)* %arraylen_ptr34, align 8, !dbg !3621, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %25 = addrspacecast {} addrspace(10)* %0 to {} addrspace(11)*, !dbg !3628
  %26 = addrspacecast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3628
  %arraylen_ptr37 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %26, i64 0, i32 1, !dbg !3628
  %arraylen38 = load i64, i64 addrspace(11)* %arraylen_ptr37, align 8, !dbg !3628, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %.not165 = icmp eq i64 %arraylen35, %arraylen38, !dbg !3635
  br i1 %.not165, label %L72, label %L61, !dbg !3634

L61:                                              ; preds = %L51
  %box108 = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247752959712 to {}*) to {} addrspace(10)*)) #96, !dbg !3634
  %memcpy_refined_dst110 = bitcast {} addrspace(10)* %box108 to i64 addrspace(10)*, !dbg !3634, !enzyme_inactive !0
  store i64 %arraylen35, i64 addrspace(10)* %memcpy_refined_dst110, align 8, !dbg !3634, !tbaa !230, !alias.scope !81, !noalias !3612
  %box112 = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247752959712 to {}*) to {} addrspace(10)*)) #96, !dbg !3634
  %memcpy_refined_dst114 = bitcast {} addrspace(10)* %box112 to i64 addrspace(10)*, !dbg !3634, !enzyme_inactive !0
  store i64 %arraylen38, i64 addrspace(10)* %memcpy_refined_dst114, align 8, !dbg !3634, !tbaa !230, !alias.scope !81, !noalias !3612
  %27 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* noundef nonnull @ijl_invoke, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247893132112 to {}*) to {} addrspace(10)*), {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247774899088 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 123247744295344 to {}*) to {} addrspace(10)*), {} addrspace(10)* nofree nonnull %box108, {} addrspace(10)* nofree nonnull %box112) #95, !dbg !3634
  unreachable, !dbg !3634

L72:                                              ; preds = %L51
  %.not167 = icmp eq i64 %arraylen35, 0, !dbg !3637
  br i1 %.not167, label %L392, label %L76, !dbg !3639

L76:                                              ; preds = %L72
  %28 = call i64 @llvm.smin.i64(i64 %15, i64 %arraylen35) #94, !dbg !3641
  %.not168 = icmp eq i64 %28, 0, !dbg !3643
  br i1 %.not168, label %L383.lr.ph, label %L84, !dbg !3644

L84:                                              ; preds = %L76
  %29 = trunc i64 %28 to i32, !dbg !3645
  %30 = add i32 %29, -1, !dbg !3645
  %31 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef addrspacecast ({}* inttoptr (i64 123247601534016 to {}*) to {} addrspace(11)*)) #97, !dbg !3649
  %32 = icmp sgt i32 %30, 0, !dbg !3651
  br i1 %32, label %L94, label %L383.lr.ph, !dbg !3652

L94:                                              ; preds = %L84
  %p.i = bitcast {}* %31 to i64*, !dbg !3654
  %v.i = atomicrmw xchg i64* %p.i, i64 0 acq_rel, align 8, !dbg !3654
  %33 = call i64 @llvm.ctpop.i64(i64 %v.i) #94, !dbg !3657, !range !2422
  %34 = trunc i64 %33 to i32, !dbg !3659
  %35 = sub nsw i32 %30, %34, !dbg !3660
  %36 = icmp slt i32 %35, 0, !dbg !3662
  br i1 %36, label %L107, label %L142, !dbg !3665

L107:                                             ; preds = %L94
  %37 = call i64 @llvm.ctlz.i64(i64 %v.i, i1 noundef false) #94, !dbg !3666, !range !2422
  %38 = trunc i64 %37 to i32, !dbg !3668
  br label %L110, !dbg !3669

L110:                                             ; preds = %L110, %L107
  %iv = phi i64 [ %iv.next, %L110 ], [ 0, %L107 ]
  %value_phi95 = phi i32 [ %38, %L107 ], [ %39, %L110 ]
  %value_phi96 = phi i32 [ %35, %L107 ], [ %48, %L110 ]
  %value_phi97 = phi i64 [ %v.i, %L107 ], [ %44, %L110 ]
  %iv.next = add nuw nsw i64 %iv, 1, !dbg !3670
  %39 = sub i32 %value_phi95, %value_phi96, !dbg !3670
  %40 = sub i32 64, %39, !dbg !3672
  %41 = zext i32 %40 to i64, !dbg !3674
  %42 = icmp ugt i32 %40, 63, !dbg !3674
  %notmask = shl nsw i64 -1, %41, !dbg !3672
  %.op = xor i64 %notmask, -1, !dbg !3672
  %43 = select i1 %42, i64 -1, i64 %.op, !dbg !3672
  %44 = and i64 %43, %value_phi97, !dbg !3675
  %45 = xor i64 %44, %value_phi97, !dbg !3677
  %46 = call i64 @llvm.ctpop.i64(i64 %45) #94, !dbg !3678, !range !2422
  %47 = trunc i64 %46 to i32, !dbg !3680
  %48 = add i32 %value_phi96, %47, !dbg !3681
  %.not176 = icmp eq i32 %48, 0, !dbg !3682
  br i1 %.not176, label %L131, label %L110, !dbg !3683

L131:                                             ; preds = %L110
  %49 = xor i64 %44, -1, !dbg !3684
  %50 = and i64 %v.i, %49, !dbg !3686
  store atomic i64 %50, i64* %p.i release, align 16, !dbg !3687, !noalias !3688
  br label %L142, !dbg !3669

L142:                                             ; preds = %L131, %L94
  %value_phi48 = phi i32 [ %30, %L131 ], [ %34, %L94 ]
  %value_phi49 = phi i64 [ %44, %L131 ], [ %v.i, %L94 ]
  %51 = icmp sgt i32 %value_phi48, 0, !dbg !3689
  br i1 %51, label %L198.lr.ph, label %L383.lr.ph, !dbg !3690

L198.lr.ph:                                       ; preds = %L142
  %52 = zext i32 %value_phi48 to i64, !dbg !3691
  %53 = add nuw nsw i64 %52, 1, !dbg !3708
  %54 = udiv i64 %arraylen35, %53, !dbg !3710
  %55 = mul i64 %54, %53, !dbg !3711
  %56 = sub i64 %arraylen35, %55, !dbg !3713
  %57 = call nonnull "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" {}* @julia.pointer_from_objref({} addrspace(11)* noundef %23) #97, !dbg !3714
  %58 = bitcast {}* %57 to i8**, !dbg !3714
  %arrayptr52 = load i8*, i8** %58, align 8, !dbg !3714, !tbaa !245, !alias.scope !183, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0
  %59 = ptrtoint i8* %arrayptr52 to i64, !dbg !3714
  %arraylen54 = load i64, i64 addrspace(11)* %arraylen_ptr34, align 8, !dbg !3724, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %60 = call nonnull "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" {}* @julia.pointer_from_objref({} addrspace(11)* noundef %25) #97, !dbg !3730
  %61 = bitcast {}* %60 to i8**, !dbg !3730
  %arrayptr59 = load i8*, i8** %61, align 8, !dbg !3730, !tbaa !245, !alias.scope !183, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BSMatrix\7B2\2C\202\2C\20ComplexF64\2C\204\7D\7D !0
  %62 = ptrtoint i8* %arrayptr59 to i64, !dbg !3730
  %arraylen61 = load i64, i64 addrspace(11)* %arraylen_ptr37, align 8, !dbg !3736, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %63 = addrspacecast {} addrspace(10)* %1 to {} addrspace(11)*, !dbg !3714
  %64 = call nonnull "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" {}* @julia.pointer_from_objref({} addrspace(11)* noundef %63) #97, !dbg !3714
  %65 = bitcast {}* %64 to i8**, !dbg !3714
  %arrayptr66 = load i8*, i8** %65, align 8, !dbg !3714, !tbaa !245, !alias.scope !183, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0
  %66 = ptrtoint i8* %arrayptr66 to i64, !dbg !3714
  %67 = addrspacecast {} addrspace(10)* %1 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3724
  %arraylen_ptr67 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %67, i64 0, i32 1, !dbg !3724
  %arraylen68 = load i64, i64 addrspace(11)* %arraylen_ptr67, align 8, !dbg !3724, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %68 = getelementptr inbounds { i64, [1 x i64] }, { i64, [1 x i64] }* %newstruct71, i64 0, i32 1, i64 0, !dbg !3742
  store i64 %arraylen68, i64* %68, align 8, !dbg !3742, !tbaa !94, !alias.scope !96, !noalias !3743
  %69 = getelementptr inbounds { i64, [1 x i64] }, { i64, [1 x i64] }* %newstruct71, i64 0, i32 0, !dbg !3744
  store i64 %66, i64* %69, align 8, !dbg !3744, !tbaa !94, !alias.scope !96, !noalias !3743
  %newstruct72.sroa.4.32..sroa_cast = bitcast { i64, [1 x i64] }* %newstruct71 to i8*, !dbg !3721
  %newstruct72.sroa.0.sroa.0.0.newstruct72.sroa.0.0..sroa_cast.sroa_cast = getelementptr inbounds { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, i64 0, i32 0, i32 0, !dbg !3747
  store i64 %59, i64* %newstruct72.sroa.0.sroa.0.0.newstruct72.sroa.0.0..sroa_cast.sroa_cast, align 16, !dbg !3747, !tbaa !462, !alias.scope !2556, !noalias !3748
  %newstruct72.sroa.0.sroa.2.0.newstruct72.sroa.0.0..sroa_cast.sroa_cast = getelementptr inbounds { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, i64 0, i32 0, i32 1, i64 0, !dbg !3747
  store i64 %arraylen54, i64* %newstruct72.sroa.0.sroa.2.0.newstruct72.sroa.0.0..sroa_cast.sroa_cast, align 8, !dbg !3747, !tbaa !462, !alias.scope !2556, !noalias !3748
  %newstruct72.sroa.2.0..sroa_idx128 = getelementptr inbounds { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, i64 0, i32 1, i32 0, !dbg !3747
  store i64 %62, i64* %newstruct72.sroa.2.0..sroa_idx128, align 16, !dbg !3747, !tbaa !462, !alias.scope !2556, !noalias !3748
  %newstruct72.sroa.3.0..sroa_idx129 = getelementptr inbounds { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, i64 0, i32 1, i32 1, i64 0, !dbg !3747
  store i64 %arraylen61, i64* %newstruct72.sroa.3.0..sroa_idx129, align 8, !dbg !3747, !tbaa !462, !alias.scope !2556, !noalias !3748
  %newstruct72.sroa.4.0..sroa_idx = getelementptr inbounds { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, i64 0, i32 2, !dbg !3747
  %newstruct72.sroa.4.0..sroa_cast = bitcast { i64, [1 x i64] }* %newstruct72.sroa.4.0..sroa_idx to i8*, !dbg !3747
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 16 dereferenceable(16) %newstruct72.sroa.4.0..sroa_cast, i8* noundef nonnull align 8 dereferenceable(16) %newstruct72.sroa.4.32..sroa_cast, i64 noundef 16, i1 noundef false) #94, !dbg !3747
  %70 = call token (...) @llvm.julia.gc_preserve_begin({} addrspace(10)* nonnull %2, {} addrspace(10)* nonnull %0, {} addrspace(10)* nonnull %1) #94, !dbg !3721
  %71 = icmp sgt i64 %56, -1
  br label %L198, !dbg !3749

L264.preheader:                                   ; preds = %L252
  %value_phi84189 = add i64 %83, 1, !dbg !3750
  %.not173190 = icmp sgt i64 %value_phi84189, %arraylen35, !dbg !3751
  br i1 %.not173190, label %L331.preheader, label %L281.lr.ph, !dbg !3752

L281.lr.ph:                                       ; preds = %L264.preheader
  %72 = addrspacecast { i64, [1 x i64] }* %newstruct71 to { i64, [1 x i64] } addrspace(11)*
  %73 = bitcast [1 x [4 x [2 x double]]]* %11 to i8*
  %74 = add i64 %54, %value_phi76194, !dbg !3754
  %umin = call i1 @llvm.umin.i1(i1 %80, i1 %71), !dbg !3752
  %75 = zext i1 %umin to i64, !dbg !3752
  %76 = add i64 %74, %75, !dbg !3754
  br label %L281, !dbg !3752

L198:                                             ; preds = %L252, %L198.lr.ph
  %iv2 = phi i64 [ %iv.next3, %L252 ], [ 0, %L198.lr.ph ]
  %value_phi78196 = phi i64 [ %value_phi49, %L198.lr.ph ], [ %89, %L252 ]
  %value_phi76194 = phi i64 [ 0, %L198.lr.ph ], [ %83, %L252 ]
  %value_phi75193 = phi i32 [ 0, %L198.lr.ph ], [ %85, %L252 ]
  %iv.next3 = add nuw nsw i64 %iv2, 1, !dbg !3760
  %77 = icmp ne i64 %value_phi78196, 0, !dbg !3760
  call void @llvm.assume(i1 noundef %77) #94, !dbg !3763
  %78 = call i64 @llvm.cttz.i64(i64 %value_phi78196, i1 noundef true) #94, !dbg !3764, !range !2422
  %79 = trunc i64 %78 to i32, !dbg !3766
  %80 = icmp ugt i64 %56, %iv2, !dbg !3767
  %not.ifelse_cond79 = and i1 %71, %80, !dbg !3771
  %81 = zext i1 %not.ifelse_cond79 to i64, !dbg !3771
  %82 = add i64 %value_phi76194, %54, !dbg !3771
  %83 = add i64 %82, %81, !dbg !3772
  %84 = add nuw nsw i32 %79, 1, !dbg !3773
  %85 = add i32 %84, %value_phi75193, !dbg !3775
  %86 = zext i32 %84 to i64, !dbg !3777
  %87 = lshr i64 %value_phi78196, %86, !dbg !3777
  %88 = icmp eq i32 %79, 63, !dbg !3777
  %89 = select i1 %88, i64 0, i64 %87, !dbg !3777
  %90 = load i64, i64* inttoptr (i64 123246368836896 to i64*), align 32, !dbg !3779, !tbaa !77, !alias.scope !81, !noalias !84
  %91 = shl i32 %85, 9, !dbg !3785
  %92 = zext i32 %91 to i64, !dbg !3786
  %93 = inttoptr i64 %90 to i8*, !dbg !3790
  %94 = getelementptr i8, i8* %93, i64 %92, !dbg !3790
  %95 = getelementptr i8, i8* %94, i64 8, !dbg !3791
  %coercion = bitcast i8* %95 to i64*, !dbg !3797
  store i64 ptrtoint (void (i64)* @jlcapi_BatchClosure_600 to i64), i64* %coercion, align 1, !dbg !3797, !tbaa !134, !alias.scope !81, !noalias !3612
  %96 = getelementptr i8, i8* %94, i64 16, !dbg !3801
  %97 = bitcast i8* %96 to { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }**, !dbg !3805
  store { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }** %97, align 1, !dbg !3805, !tbaa !134, !alias.scope !81, !noalias !3612
  %98 = getelementptr i8, i8* %94, i64 24, !dbg !3809
  %coercion81 = bitcast i8* %98 to i64*, !dbg !3813
  store i64 %value_phi76194, i64* %coercion81, align 1, !dbg !3813, !tbaa !134, !alias.scope !81, !noalias !3612
  %99 = getelementptr i8, i8* %94, i64 32, !dbg !3817
  %coercion82 = bitcast i8* %99 to i64*, !dbg !3821
  store i64 %83, i64* %coercion82, align 1, !dbg !3821, !tbaa !134, !alias.scope !81, !noalias !3612
  %p.i119 = bitcast i8* %94 to i32*, !dbg !3825
  %v.i120 = atomicrmw xchg i32* %p.i119, i32 0 acq_rel, align 4, !dbg !3825
  %.not172 = icmp eq i32 %v.i120, 1, !dbg !3828
  br i1 %.not172, label %L249, label %L252, !dbg !3829

L249:                                             ; preds = %L198
  call fastcc void @julia_wake_thread__782(i32 zeroext %85) #94, !dbg !3829
  br label %L252, !dbg !3829

L252:                                             ; preds = %L249, %L198
  %100 = icmp eq i64 %iv.next3, %52, !dbg !3830
  br i1 %100, label %L264.preheader, label %L198, !dbg !3749

L331.preheader.loopexit:                          ; preds = %L281
  br label %L331.preheader, !dbg !3832

L331.preheader:                                   ; preds = %L331.preheader.loopexit, %L264.preheader
  %101 = icmp eq i64 %value_phi49, 0, !dbg !3832
  br i1 %101, label %L368, label %L336.preheader, !dbg !3834

L336.preheader:                                   ; preds = %L331.preheader
  br label %L336, !dbg !3835

L281:                                             ; preds = %L281, %L281.lr.ph
  %iv4 = phi i64 [ %iv.next5, %L281 ], [ 0, %L281.lr.ph ]
  %102 = add i64 %76, %iv4, !dbg !3754
  %iv.next5 = add nuw nsw i64 %iv4, 1, !dbg !3754
  %103 = add i64 %value_phi84189, %iv4, !dbg !3754
  %104 = shl i64 %102, 3, !dbg !3754
  %105 = getelementptr i8, i8* %arrayptr52, i64 %104, !dbg !3838
  %coercion86 = bitcast i8* %105 to double*, !dbg !3839
  %pointerref = load double, double* %coercion86, align 1, !dbg !3839, !tbaa !134, !alias.scope !81, !noalias !84
  call void @llvm.lifetime.end.p0i8(i64 noundef 48, i8* noundef nonnull %.sub) #94
  call fastcc void @julia_cascade_770([1 x [4 x [2 x double]]]* noalias nocapture nofree noundef nonnull writeonly sret([1 x [4 x [2 x double]]]) align 8 dereferenceable(64) %11, { i64, [1 x i64] } addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(16) %72, double %pointerref) #94, !dbg !3843
  %106 = shl i64 %102, 6, !dbg !3844
  %coercion88 = getelementptr i8, i8* %arrayptr59, i64 %106, !dbg !3848
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 1 dereferenceable(64) %coercion88, i8* noundef nonnull align 8 dereferenceable(64) %73, i64 noundef 64, i1 noundef false) #94, !dbg !3849, !noalias !3688
  %value_phi84 = add i64 %103, 1, !dbg !3750
  %exitcond200 = icmp eq i64 %103, %arraylen35, !dbg !3751
  br i1 %exitcond200, label %L331.preheader.loopexit, label %L281, !dbg !3752

L336:                                             ; preds = %L336.preheader, %L366
  %iv6 = phi i64 [ 0, %L336.preheader ], [ %iv.next7, %L366 ]
  %value_phi92188 = phi i64 [ %111, %L366 ], [ %value_phi49, %L336.preheader ]
  %value_phi91187 = phi i32 [ %113, %L366 ], [ 0, %L336.preheader ]
  %iv.next7 = add nuw nsw i64 %iv6, 1, !dbg !3853
  %107 = call i64 @llvm.cttz.i64(i64 %value_phi92188, i1 noundef true) #94, !dbg !3853, !range !2422
  %108 = trunc i64 %107 to i32, !dbg !3855
  %109 = add nuw nsw i32 %108, 1, !dbg !3856
  %110 = zext i32 %109 to i64, !dbg !3858
  %111 = lshr i64 %value_phi92188, %110, !dbg !3858
  %112 = icmp eq i32 %108, 63, !dbg !3858
  %113 = add i32 %109, %value_phi91187, !dbg !3860
  %114 = load i64, i64* inttoptr (i64 123246368836896 to i64*), align 32, !dbg !3862, !tbaa !77, !alias.scope !81, !noalias !84
  %115 = shl i32 %113, 9, !dbg !3865
  %116 = zext i32 %115 to i64, !dbg !3866
  %117 = inttoptr i64 %114 to i8*, !dbg !3870
  %118 = getelementptr i8, i8* %117, i64 %116, !dbg !3870
  %p.i121 = bitcast i8* %118 to i32*, !dbg !3871
  %v.i122184 = load atomic i32, i32* %p.i121 acquire, align 16, !dbg !3871
  %.not174185 = icmp eq i32 %v.i122184, 0, !dbg !3873
  br i1 %.not174185, label %L356.preheader, label %L366, !dbg !3835

L356.preheader:                                   ; preds = %L336
  br label %L356, !dbg !3874

L356:                                             ; preds = %L356.preheader, %L363
  %iv8 = phi i64 [ 0, %L356.preheader ], [ %iv.next9, %L363 ]
  %119 = trunc i64 %iv8 to i32
  %iv.next9 = add nuw nsw i64 %iv8, 1
  call void @llvm.lifetime.end.p0i8(i64 noundef 48, i8* noundef nonnull %.sub) #94
  call void asm sideeffect "pause", "~{memory}"() #98, !dbg !3875
  %120 = add i32 %119, 1, !dbg !3877
  %121 = icmp ult i32 %120, 65537, !dbg !3878
  br i1 %121, label %L363, label %L360, !dbg !3874

L360:                                             ; preds = %L356
  %122 = call fastcc i8 @julia_checktask_612(i32 zeroext %113) #94, !dbg !3880
  %123 = and i8 %122, 1, !dbg !3880
  %.not175 = icmp eq i8 %123, 0, !dbg !3880
  br i1 %.not175, label %L363, label %L366.loopexit, !dbg !3880

L363:                                             ; preds = %L360, %L356
  %v.i122 = load atomic i32, i32* %p.i121 acquire, align 16, !dbg !3871
  %.not174 = icmp eq i32 %v.i122, 0, !dbg !3873
  br i1 %.not174, label %L356, label %L366.loopexit, !dbg !3835

L366.loopexit:                                    ; preds = %L360, %L363
  br label %L366, !dbg !3832

L366:                                             ; preds = %L366.loopexit, %L336
  %124 = icmp eq i64 %111, 0, !dbg !3832
  %125 = select i1 %112, i1 true, i1 %124, !dbg !3832
  br i1 %125, label %L368.loopexit, label %L336, !dbg !3834

L368.loopexit:                                    ; preds = %L366
  br label %L368, !dbg !3881

L368:                                             ; preds = %L368.loopexit, %L331.preheader
  %v.i118 = atomicrmw or i64* %p.i, i64 %value_phi49 acq_rel, align 8, !dbg !3881
  call void @llvm.julia.gc_preserve_end(token %70) #94, !dbg !3884
  br label %L392, !dbg !3884

L383.lr.ph:                                       ; preds = %L142, %L84, %L76
  %126 = addrspacecast {} addrspace(10)* %2 to double addrspace(13)* addrspace(11)*
  %127 = addrspacecast {} addrspace(10)* %0 to [1 x [4 x [2 x double]]] addrspace(13)* addrspace(11)*
  %128 = bitcast [1 x [4 x [2 x double]]]* %6 to i8*
  %umax = call i64 @llvm.umax.i64(i64 %arraylen35, i64 noundef 1) #94, !dbg !3885
  br label %L383, !dbg !3885

L383:                                             ; preds = %L383, %L383.lr.ph
  %iv10 = phi i64 [ %iv.next11, %L383 ], [ 0, %L383.lr.ph ]
  %iv.next11 = add nuw nsw i64 %iv10, 1, !dbg !3887
  %129 = add nsw i64 %iv.next11, -1, !dbg !3887
  %arrayptr44170 = load double addrspace(13)*, double addrspace(13)* addrspace(11)* %126, align 16, !dbg !3887, !tbaa !245, !alias.scope !3890, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0
  %130 = getelementptr inbounds double, double addrspace(13)* %arrayptr44170, i64 %129, !dbg !3887
  %arrayref45 = load double, double addrspace(13)* %130, align 8, !dbg !3887, !tbaa !325, !alias.scope !81, !noalias !84
  call fastcc void @julia_cascade_787([1 x [4 x [2 x double]]]* noalias nocapture nofree noundef nonnull writeonly sret([1 x [4 x [2 x double]]]) align 8 dereferenceable(64) %6, {} addrspace(10)* noundef nonnull align 16 dereferenceable(40) %1, double %arrayref45) #94, !dbg !3891
  %arrayptr47171 = load [1 x [4 x [2 x double]]] addrspace(13)*, [1 x [4 x [2 x double]]] addrspace(13)* addrspace(11)* %127, align 16, !dbg !3892, !tbaa !245, !alias.scope !3890, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BSMatrix\7B2\2C\202\2C\20ComplexF64\2C\204\7D\7D !0
  %131 = getelementptr inbounds [1 x [4 x [2 x double]]], [1 x [4 x [2 x double]]] addrspace(13)* %arrayptr47171, i64 %129, i64 0, !dbg !3892
  %132 = bitcast [4 x [2 x double]] addrspace(13)* %131 to i8 addrspace(13)*, !dbg !3892
  call void @llvm.memcpy.p13i8.p0i8.i64(i8 addrspace(13)* noundef align 8 dereferenceable(64) %132, i8* noundef nonnull align 8 dereferenceable(64) %128, i64 noundef 64, i1 noundef false) #94, !dbg !3892, !tbaa !462, !alias.scope !2556, !noalias !3748
  %133 = add nuw nsw i64 %iv.next11, 1, !dbg !3893
  %exitcond.not = icmp eq i64 %iv.next11, %umax, !dbg !3896
  br i1 %exitcond.not, label %L392.loopexit1, label %L383, !dbg !3885

L392.loopexit:                                    ; preds = %idxend16
  br label %L392

L392.loopexit1:                                   ; preds = %L383
  br label %L392

L392:                                             ; preds = %L392.loopexit1, %L392.loopexit, %L368, %L72, %L25
  call void @llvm.lifetime.end.p0i8(i64 noundef 48, i8* noundef nonnull %.sub) #94
  ret void, !dbg !3602

oob:                                              ; preds = %L34
  %errorbox = alloca i64, align 8, !dbg !3619
  store i64 %iv.next13, i64* %errorbox, align 8, !dbg !3619, !noalias !3688
  %134 = addrspacecast {} addrspace(10)* %2 to {} addrspace(12)*, !dbg !3619
  call void @ijl_bounds_error_ints({} addrspace(12)* noundef %134, i64* noundef nonnull align 8 %errorbox, i64 noundef 1) #99, !dbg !3619
  unreachable, !dbg !3619

idxend:                                           ; preds = %L34
  %arrayptr162 = load double addrspace(13)*, double addrspace(13)* addrspace(11)* %19, align 16, !dbg !3619, !tbaa !245, !alias.scope !3890, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0
  %135 = getelementptr inbounds double, double addrspace(13)* %arrayptr162, i64 %22, !dbg !3619
  %arrayref = load double, double addrspace(13)* %135, align 8, !dbg !3619, !tbaa !325, !alias.scope !81, !noalias !84
  call fastcc void @julia_cascade_787([1 x [4 x [2 x double]]]* noalias nocapture nofree noundef nonnull writeonly sret([1 x [4 x [2 x double]]]) align 8 dereferenceable(64) %9, {} addrspace(10)* noundef nonnull align 16 dereferenceable(40) %1, double %arrayref) #94, !dbg !3897
  %arraylen12 = load i64, i64 addrspace(11)* %arraylen_ptr2, align 8, !dbg !3898, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %inbounds13 = icmp ult i64 %22, %arraylen12, !dbg !3898
  br i1 %inbounds13, label %idxend16, label %oob14, !dbg !3898

oob14:                                            ; preds = %idxend
  %errorbox15 = alloca i64, align 8, !dbg !3898
  store i64 %iv.next13, i64* %errorbox15, align 8, !dbg !3898, !noalias !3688
  %136 = addrspacecast {} addrspace(10)* %0 to {} addrspace(12)*, !dbg !3898
  call void @ijl_bounds_error_ints({} addrspace(12)* noundef %136, i64* noundef nonnull align 8 %errorbox15, i64 noundef 1) #99, !dbg !3898
  unreachable, !dbg !3898

idxend16:                                         ; preds = %idxend
  %arrayptr18163 = load [1 x [4 x [2 x double]]] addrspace(13)*, [1 x [4 x [2 x double]]] addrspace(13)* addrspace(11)* %20, align 16, !dbg !3898, !tbaa !245, !alias.scope !3890, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BSMatrix\7B2\2C\202\2C\20ComplexF64\2C\204\7D\7D !0
  %137 = getelementptr inbounds [1 x [4 x [2 x double]]], [1 x [4 x [2 x double]]] addrspace(13)* %arrayptr18163, i64 %22, i64 0, !dbg !3898
  %138 = bitcast [4 x [2 x double]] addrspace(13)* %137 to i8 addrspace(13)*, !dbg !3898
  call void @llvm.memcpy.p13i8.p0i8.i64(i8 addrspace(13)* noundef align 8 dereferenceable(64) %138, i8* noundef nonnull align 8 dereferenceable(64) %21, i64 noundef 64, i1 noundef false) #94, !dbg !3898, !tbaa !462, !alias.scope !2556, !noalias !3748
  %.not164 = icmp eq i64 %iv.next13, %arraylen, !dbg !3899
  %139 = add nuw nsw i64 %iv.next13, 1, !dbg !3669
  br i1 %.not164, label %L392.loopexit, label %L34, !dbg !3602
}

; Function Attrs: mustprogress willreturn
define internal fastcc void @preprocess_julia_cascade__596({} addrspace(10)* noundef nonnull align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="123247878569488" "enzymejl_parmtype_ref"="2" %0, {} addrspace(10)* noundef nonnull align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="123247705933040" "enzymejl_parmtype_ref"="2" %1, {} addrspace(10)* noundef nonnull align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, 
[-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="123247705933040" "enzymejl_parmtype_ref"="2" %2) unnamed_addr #93 !dbg !3590 {
top:
  %3 = call noalias nonnull dereferenceable(48) dereferenceable_or_null(48) i8* @malloc(i64 48), !enzyme_fromstack !142
  %4 = bitcast i8* %3 to { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }*, !enzyme_caststack !0
  %.sub = bitcast { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4 to i8*
  %5 = call noalias nonnull dereferenceable(64) dereferenceable_or_null(64) i8* @malloc(i64 64), !enzymejl_allocart !2322, !enzyme_type !329, !enzyme_fromstack !89
  %6 = bitcast i8* %5 to [1 x [4 x [2 x double]]]*, !enzyme_caststack !0
  %7 = call noalias nonnull dereferenceable(16) dereferenceable_or_null(16) i8* @malloc(i64 16), !enzyme_fromstack !89
  %newstruct71 = bitcast i8* %7 to { i64, [1 x i64] }*, !enzyme_caststack !0
  %8 = call noalias nonnull dereferenceable(64) dereferenceable_or_null(64) i8* @malloc(i64 64), !enzymejl_allocart !2322, !enzyme_type !329, !enzyme_fromstack !89
  %9 = bitcast i8* %8 to [1 x [4 x [2 x double]]]*, !enzyme_caststack !0
  %10 = call noalias nonnull dereferenceable(64) dereferenceable_or_null(64) i8* @malloc(i64 64), !enzymejl_allocart !2322, !enzyme_type !329, !enzyme_fromstack !89
  %11 = bitcast i8* %10 to [1 x [4 x [2 x double]]]*, !enzyme_caststack !0
  %12 = call {}*** @julia.get_pgcstack() #94
  %current_task1156 = getelementptr inbounds {}**, {}*** %12, i64 -14
  %current_task1 = bitcast {}*** %current_task1156 to {}**
  %ptls_field157 = getelementptr inbounds {}**, {}*** %12, i64 2
  %13 = bitcast {}*** %ptls_field157 to i64***
  %ptls_load158159 = load i64**, i64*** %13, align 8, !tbaa !66
  %14 = getelementptr inbounds i64*, i64** %ptls_load158159, i64 2
  %safepoint = load i64*, i64** %14, align 8, !tbaa !70
  fence syncscope("singlethread") seq_cst
  call void @julia.safepoint(i64* %safepoint) #94, !dbg !3591
  fence syncscope("singlethread") seq_cst
  %15 = call i64 @julia_nthreads_790() #95, !dbg !3592
  %.not = icmp eq i64 %15, 1, !dbg !3594
  br i1 %.not, label %L4, label %L51, !dbg !3595

L4:                                               ; preds = %top
  %16 = addrspacecast {} addrspace(10)* %2 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3596
  %arraylen_ptr = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %16, i64 0, i32 1, !dbg !3596
  %arraylen = load i64, i64 addrspace(11)* %arraylen_ptr, align 8, !dbg !3596, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %17 = addrspacecast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3603
  %arraylen_ptr2 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %17, i64 0, i32 1, !dbg !3603
  %arraylen3 = load i64, i64 addrspace(11)* %arraylen_ptr2, align 8, !dbg !3603, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %.not160 = icmp eq i64 %arraylen, %arraylen3, !dbg !3610
  br i1 %.not160, label %L25, label %L14, !dbg !3609

L14:                                              ; preds = %L4
  %box = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247752959712 to {}*) to {} addrspace(10)*)) #96, !dbg !3609
  %memcpy_refined_dst = bitcast {} addrspace(10)* %box to i64 addrspace(10)*, !dbg !3609, !enzyme_inactive !0
  store i64 %arraylen, i64 addrspace(10)* %memcpy_refined_dst, align 8, !dbg !3609, !tbaa !230, !alias.scope !81, !noalias !3612
  %box31 = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247752959712 to {}*) to {} addrspace(10)*)) #96, !dbg !3609
  %memcpy_refined_dst33 = bitcast {} addrspace(10)* %box31 to i64 addrspace(10)*, !dbg !3609, !enzyme_inactive !0
  store i64 %arraylen3, i64 addrspace(10)* %memcpy_refined_dst33, align 8, !dbg !3609, !tbaa !230, !alias.scope !81, !noalias !3612
  %18 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* noundef nonnull @ijl_invoke, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247893132112 to {}*) to {} addrspace(10)*), {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247774899088 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 123247744295344 to {}*) to {} addrspace(10)*), {} addrspace(10)* nofree nonnull %box, {} addrspace(10)* nofree nonnull %box31) #95, !dbg !3609
  unreachable, !dbg !3609

L25:                                              ; preds = %L4
  %.not161 = icmp eq i64 %arraylen, 0, !dbg !3615
  br i1 %.not161, label %L392, label %L34.preheader, !dbg !3602

L34.preheader:                                    ; preds = %L25
  %19 = addrspacecast {} addrspace(10)* %2 to double addrspace(13)* addrspace(11)*
  %20 = addrspacecast {} addrspace(10)* %0 to [1 x [4 x [2 x double]]] addrspace(13)* addrspace(11)*
  %21 = bitcast [1 x [4 x [2 x double]]]* %9 to i8*
  br label %L34, !dbg !3619

L34:                                              ; preds = %idxend16, %L34.preheader
  %iv12 = phi i64 [ %iv.next13, %idxend16 ], [ 0, %L34.preheader ]
  %iv.next13 = add nuw nsw i64 %iv12, 1, !dbg !3619
  %22 = add nsw i64 %iv.next13, -1, !dbg !3619
  %arraylen10 = load i64, i64 addrspace(11)* %arraylen_ptr, align 8, !dbg !3619, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %inbounds = icmp ult i64 %22, %arraylen10, !dbg !3619
  br i1 %inbounds, label %idxend, label %oob, !dbg !3619

L51:                                              ; preds = %top
  %23 = addrspacecast {} addrspace(10)* %2 to {} addrspace(11)*, !dbg !3621
  %24 = addrspacecast {} addrspace(10)* %2 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3621
  %arraylen_ptr34 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %24, i64 0, i32 1, !dbg !3621
  %arraylen35 = load i64, i64 addrspace(11)* %arraylen_ptr34, align 8, !dbg !3621, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %25 = addrspacecast {} addrspace(10)* %0 to {} addrspace(11)*, !dbg !3628
  %26 = addrspacecast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3628
  %arraylen_ptr37 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %26, i64 0, i32 1, !dbg !3628
  %arraylen38 = load i64, i64 addrspace(11)* %arraylen_ptr37, align 8, !dbg !3628, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %.not165 = icmp eq i64 %arraylen35, %arraylen38, !dbg !3635
  br i1 %.not165, label %L72, label %L61, !dbg !3634

L61:                                              ; preds = %L51
  %box108 = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247752959712 to {}*) to {} addrspace(10)*)) #96, !dbg !3634
  %memcpy_refined_dst110 = bitcast {} addrspace(10)* %box108 to i64 addrspace(10)*, !dbg !3634, !enzyme_inactive !0
  store i64 %arraylen35, i64 addrspace(10)* %memcpy_refined_dst110, align 8, !dbg !3634, !tbaa !230, !alias.scope !81, !noalias !3612
  %box112 = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247752959712 to {}*) to {} addrspace(10)*)) #96, !dbg !3634
  %memcpy_refined_dst114 = bitcast {} addrspace(10)* %box112 to i64 addrspace(10)*, !dbg !3634, !enzyme_inactive !0
  store i64 %arraylen38, i64 addrspace(10)* %memcpy_refined_dst114, align 8, !dbg !3634, !tbaa !230, !alias.scope !81, !noalias !3612
  %27 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* noundef nonnull @ijl_invoke, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247893132112 to {}*) to {} addrspace(10)*), {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 123247774899088 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 123247744295344 to {}*) to {} addrspace(10)*), {} addrspace(10)* nofree nonnull %box108, {} addrspace(10)* nofree nonnull %box112) #95, !dbg !3634
  unreachable, !dbg !3634

L72:                                              ; preds = %L51
  %.not167 = icmp eq i64 %arraylen35, 0, !dbg !3637
  br i1 %.not167, label %L392, label %L76, !dbg !3639

L76:                                              ; preds = %L72
  %28 = call i64 @llvm.smin.i64(i64 %15, i64 %arraylen35) #94, !dbg !3641
  %.not168 = icmp eq i64 %28, 0, !dbg !3643
  br i1 %.not168, label %L383.lr.ph, label %L84, !dbg !3644

L84:                                              ; preds = %L76
  %29 = trunc i64 %28 to i32, !dbg !3645
  %30 = add i32 %29, -1, !dbg !3645
  %31 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef addrspacecast ({}* inttoptr (i64 123247601534016 to {}*) to {} addrspace(11)*)) #97, !dbg !3649
  %32 = icmp sgt i32 %30, 0, !dbg !3651
  br i1 %32, label %L94, label %L383.lr.ph, !dbg !3652

L94:                                              ; preds = %L84
  %p.i = bitcast {}* %31 to i64*, !dbg !3654
  %v.i = atomicrmw xchg i64* %p.i, i64 0 acq_rel, align 8, !dbg !3654
  %33 = call i64 @llvm.ctpop.i64(i64 %v.i) #94, !dbg !3657, !range !2422
  %34 = trunc i64 %33 to i32, !dbg !3659
  %35 = sub nsw i32 %30, %34, !dbg !3660
  %36 = icmp slt i32 %35, 0, !dbg !3662
  br i1 %36, label %L107, label %L142, !dbg !3665

L107:                                             ; preds = %L94
  %37 = call i64 @llvm.ctlz.i64(i64 %v.i, i1 noundef false) #94, !dbg !3666, !range !2422
  %38 = trunc i64 %37 to i32, !dbg !3668
  br label %L110, !dbg !3669

L110:                                             ; preds = %L110, %L107
  %iv = phi i64 [ %iv.next, %L110 ], [ 0, %L107 ]
  %value_phi95 = phi i32 [ %38, %L107 ], [ %39, %L110 ]
  %value_phi96 = phi i32 [ %35, %L107 ], [ %48, %L110 ]
  %value_phi97 = phi i64 [ %v.i, %L107 ], [ %44, %L110 ]
  %iv.next = add nuw nsw i64 %iv, 1, !dbg !3670
  %39 = sub i32 %value_phi95, %value_phi96, !dbg !3670
  %40 = sub i32 64, %39, !dbg !3672
  %41 = zext i32 %40 to i64, !dbg !3674
  %42 = icmp ugt i32 %40, 63, !dbg !3674
  %notmask = shl nsw i64 -1, %41, !dbg !3672
  %.op = xor i64 %notmask, -1, !dbg !3672
  %43 = select i1 %42, i64 -1, i64 %.op, !dbg !3672
  %44 = and i64 %43, %value_phi97, !dbg !3675
  %45 = xor i64 %44, %value_phi97, !dbg !3677
  %46 = call i64 @llvm.ctpop.i64(i64 %45) #94, !dbg !3678, !range !2422
  %47 = trunc i64 %46 to i32, !dbg !3680
  %48 = add i32 %value_phi96, %47, !dbg !3681
  %.not176 = icmp eq i32 %48, 0, !dbg !3682
  br i1 %.not176, label %L131, label %L110, !dbg !3683

L131:                                             ; preds = %L110
  %49 = xor i64 %44, -1, !dbg !3684
  %50 = and i64 %v.i, %49, !dbg !3686
  store atomic i64 %50, i64* %p.i release, align 16, !dbg !3687, !noalias !3688
  br label %L142, !dbg !3669

L142:                                             ; preds = %L131, %L94
  %value_phi48 = phi i32 [ %30, %L131 ], [ %34, %L94 ]
  %value_phi49 = phi i64 [ %44, %L131 ], [ %v.i, %L94 ]
  %51 = icmp sgt i32 %value_phi48, 0, !dbg !3689
  br i1 %51, label %L198.lr.ph, label %L383.lr.ph, !dbg !3690

L198.lr.ph:                                       ; preds = %L142
  %52 = zext i32 %value_phi48 to i64, !dbg !3691
  %53 = add nuw nsw i64 %52, 1, !dbg !3708
  %54 = udiv i64 %arraylen35, %53, !dbg !3710
  %55 = mul i64 %54, %53, !dbg !3711
  %56 = sub i64 %arraylen35, %55, !dbg !3713
  %57 = call nonnull "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" {}* @julia.pointer_from_objref({} addrspace(11)* noundef %23) #97, !dbg !3714
  %58 = bitcast {}* %57 to i8**, !dbg !3714
  %arrayptr52 = load i8*, i8** %58, align 8, !dbg !3714, !tbaa !245, !alias.scope !183, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0
  %59 = ptrtoint i8* %arrayptr52 to i64, !dbg !3714
  %arraylen54 = load i64, i64 addrspace(11)* %arraylen_ptr34, align 8, !dbg !3724, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %60 = call nonnull "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" {}* @julia.pointer_from_objref({} addrspace(11)* noundef %25) #97, !dbg !3730
  %61 = bitcast {}* %60 to i8**, !dbg !3730
  %arrayptr59 = load i8*, i8** %61, align 8, !dbg !3730, !tbaa !245, !alias.scope !183, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BSMatrix\7B2\2C\202\2C\20ComplexF64\2C\204\7D\7D !0
  %62 = ptrtoint i8* %arrayptr59 to i64, !dbg !3730
  %arraylen61 = load i64, i64 addrspace(11)* %arraylen_ptr37, align 8, !dbg !3736, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %63 = addrspacecast {} addrspace(10)* %1 to {} addrspace(11)*, !dbg !3714
  %64 = call nonnull "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" {}* @julia.pointer_from_objref({} addrspace(11)* noundef %63) #97, !dbg !3714
  %65 = bitcast {}* %64 to i8**, !dbg !3714
  %arrayptr66 = load i8*, i8** %65, align 8, !dbg !3714, !tbaa !245, !alias.scope !183, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0
  %66 = ptrtoint i8* %arrayptr66 to i64, !dbg !3714
  %67 = addrspacecast {} addrspace(10)* %1 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3724
  %arraylen_ptr67 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %67, i64 0, i32 1, !dbg !3724
  %arraylen68 = load i64, i64 addrspace(11)* %arraylen_ptr67, align 8, !dbg !3724, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %68 = getelementptr inbounds { i64, [1 x i64] }, { i64, [1 x i64] }* %newstruct71, i64 0, i32 1, i64 0, !dbg !3742
  store i64 %arraylen68, i64* %68, align 8, !dbg !3742, !tbaa !94, !alias.scope !96, !noalias !3743
  %69 = getelementptr inbounds { i64, [1 x i64] }, { i64, [1 x i64] }* %newstruct71, i64 0, i32 0, !dbg !3744
  store i64 %66, i64* %69, align 8, !dbg !3744, !tbaa !94, !alias.scope !96, !noalias !3743
  %newstruct72.sroa.4.32..sroa_cast = bitcast { i64, [1 x i64] }* %newstruct71 to i8*, !dbg !3721
  %newstruct72.sroa.0.sroa.0.0.newstruct72.sroa.0.0..sroa_cast.sroa_cast = getelementptr inbounds { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, i64 0, i32 0, i32 0, !dbg !3747
  store i64 %59, i64* %newstruct72.sroa.0.sroa.0.0.newstruct72.sroa.0.0..sroa_cast.sroa_cast, align 16, !dbg !3747, !tbaa !462, !alias.scope !2556, !noalias !3748
  %newstruct72.sroa.0.sroa.2.0.newstruct72.sroa.0.0..sroa_cast.sroa_cast = getelementptr inbounds { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, i64 0, i32 0, i32 1, i64 0, !dbg !3747
  store i64 %arraylen54, i64* %newstruct72.sroa.0.sroa.2.0.newstruct72.sroa.0.0..sroa_cast.sroa_cast, align 8, !dbg !3747, !tbaa !462, !alias.scope !2556, !noalias !3748
  %newstruct72.sroa.2.0..sroa_idx128 = getelementptr inbounds { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, i64 0, i32 1, i32 0, !dbg !3747
  store i64 %62, i64* %newstruct72.sroa.2.0..sroa_idx128, align 16, !dbg !3747, !tbaa !462, !alias.scope !2556, !noalias !3748
  %newstruct72.sroa.3.0..sroa_idx129 = getelementptr inbounds { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, i64 0, i32 1, i32 1, i64 0, !dbg !3747
  store i64 %arraylen61, i64* %newstruct72.sroa.3.0..sroa_idx129, align 8, !dbg !3747, !tbaa !462, !alias.scope !2556, !noalias !3748
  %newstruct72.sroa.4.0..sroa_idx = getelementptr inbounds { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, i64 0, i32 2, !dbg !3747
  %newstruct72.sroa.4.0..sroa_cast = bitcast { i64, [1 x i64] }* %newstruct72.sroa.4.0..sroa_idx to i8*, !dbg !3747
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 16 dereferenceable(16) %newstruct72.sroa.4.0..sroa_cast, i8* noundef nonnull align 8 dereferenceable(16) %newstruct72.sroa.4.32..sroa_cast, i64 noundef 16, i1 noundef false) #94, !dbg !3747
  %70 = call token (...) @llvm.julia.gc_preserve_begin({} addrspace(10)* nonnull %2, {} addrspace(10)* nonnull %0, {} addrspace(10)* nonnull %1) #94, !dbg !3721
  %71 = icmp sgt i64 %56, -1
  br label %L198, !dbg !3749

L264.preheader:                                   ; preds = %L252
  %value_phi84189 = add i64 %83, 1, !dbg !3750
  %.not173190 = icmp sgt i64 %value_phi84189, %arraylen35, !dbg !3751
  br i1 %.not173190, label %L331.preheader, label %L281.lr.ph, !dbg !3752

L281.lr.ph:                                       ; preds = %L264.preheader
  %72 = addrspacecast { i64, [1 x i64] }* %newstruct71 to { i64, [1 x i64] } addrspace(11)*
  %73 = bitcast [1 x [4 x [2 x double]]]* %11 to i8*
  %74 = add i64 %54, %value_phi76194, !dbg !3754
  %umin = call i1 @llvm.umin.i1(i1 %80, i1 %71), !dbg !3752
  %75 = zext i1 %umin to i64, !dbg !3752
  %76 = add i64 %74, %75, !dbg !3754
  br label %L281, !dbg !3752

L198:                                             ; preds = %L252, %L198.lr.ph
  %iv2 = phi i64 [ %iv.next3, %L252 ], [ 0, %L198.lr.ph ]
  %value_phi78196 = phi i64 [ %value_phi49, %L198.lr.ph ], [ %89, %L252 ]
  %value_phi76194 = phi i64 [ 0, %L198.lr.ph ], [ %83, %L252 ]
  %value_phi75193 = phi i32 [ 0, %L198.lr.ph ], [ %85, %L252 ]
  %iv.next3 = add nuw nsw i64 %iv2, 1, !dbg !3760
  %77 = icmp ne i64 %value_phi78196, 0, !dbg !3760
  call void @llvm.assume(i1 noundef %77) #94, !dbg !3763
  %78 = call i64 @llvm.cttz.i64(i64 %value_phi78196, i1 noundef true) #94, !dbg !3764, !range !2422
  %79 = trunc i64 %78 to i32, !dbg !3766
  %80 = icmp ugt i64 %56, %iv2, !dbg !3767
  %not.ifelse_cond79 = and i1 %71, %80, !dbg !3771
  %81 = zext i1 %not.ifelse_cond79 to i64, !dbg !3771
  %82 = add i64 %value_phi76194, %54, !dbg !3771
  %83 = add i64 %82, %81, !dbg !3772
  %84 = add nuw nsw i32 %79, 1, !dbg !3773
  %85 = add i32 %84, %value_phi75193, !dbg !3775
  %86 = zext i32 %84 to i64, !dbg !3777
  %87 = lshr i64 %value_phi78196, %86, !dbg !3777
  %88 = icmp eq i32 %79, 63, !dbg !3777
  %89 = select i1 %88, i64 0, i64 %87, !dbg !3777
  %90 = load i64, i64* inttoptr (i64 123246368836896 to i64*), align 32, !dbg !3779, !tbaa !77, !alias.scope !81, !noalias !84
  %91 = shl i32 %85, 9, !dbg !3785
  %92 = zext i32 %91 to i64, !dbg !3786
  %93 = inttoptr i64 %90 to i8*, !dbg !3790
  %94 = getelementptr i8, i8* %93, i64 %92, !dbg !3790
  %95 = getelementptr i8, i8* %94, i64 8, !dbg !3791
  %coercion = bitcast i8* %95 to i64*, !dbg !3797
  store i64 ptrtoint (void (i64)* @jlcapi_BatchClosure_600 to i64), i64* %coercion, align 1, !dbg !3797, !tbaa !134, !alias.scope !81, !noalias !3612
  %96 = getelementptr i8, i8* %94, i64 16, !dbg !3801
  %97 = bitcast i8* %96 to { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }**, !dbg !3805
  store { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }* %4, { { i64, [1 x i64] }, { i64, [1 x i64] }, { i64, [1 x i64] } }** %97, align 1, !dbg !3805, !tbaa !134, !alias.scope !81, !noalias !3612
  %98 = getelementptr i8, i8* %94, i64 24, !dbg !3809
  %coercion81 = bitcast i8* %98 to i64*, !dbg !3813
  store i64 %value_phi76194, i64* %coercion81, align 1, !dbg !3813, !tbaa !134, !alias.scope !81, !noalias !3612
  %99 = getelementptr i8, i8* %94, i64 32, !dbg !3817
  %coercion82 = bitcast i8* %99 to i64*, !dbg !3821
  store i64 %83, i64* %coercion82, align 1, !dbg !3821, !tbaa !134, !alias.scope !81, !noalias !3612
  %p.i119 = bitcast i8* %94 to i32*, !dbg !3825
  %v.i120 = atomicrmw xchg i32* %p.i119, i32 0 acq_rel, align 4, !dbg !3825
  %.not172 = icmp eq i32 %v.i120, 1, !dbg !3828
  br i1 %.not172, label %L249, label %L252, !dbg !3829

L249:                                             ; preds = %L198
  call fastcc void @julia_wake_thread__782(i32 zeroext %85) #94, !dbg !3829
  br label %L252, !dbg !3829

L252:                                             ; preds = %L249, %L198
  %100 = icmp eq i64 %iv.next3, %52, !dbg !3830
  br i1 %100, label %L264.preheader, label %L198, !dbg !3749

L331.preheader.loopexit:                          ; preds = %L281
  br label %L331.preheader, !dbg !3832

L331.preheader:                                   ; preds = %L331.preheader.loopexit, %L264.preheader
  %101 = icmp eq i64 %value_phi49, 0, !dbg !3832
  br i1 %101, label %L368, label %L336.preheader, !dbg !3834

L336.preheader:                                   ; preds = %L331.preheader
  br label %L336, !dbg !3835

L281:                                             ; preds = %L281, %L281.lr.ph
  %iv4 = phi i64 [ %iv.next5, %L281 ], [ 0, %L281.lr.ph ]
  %102 = add i64 %76, %iv4, !dbg !3754
  %iv.next5 = add nuw nsw i64 %iv4, 1, !dbg !3754
  %103 = add i64 %value_phi84189, %iv4, !dbg !3754
  %104 = shl i64 %102, 3, !dbg !3754
  %105 = getelementptr i8, i8* %arrayptr52, i64 %104, !dbg !3838
  %coercion86 = bitcast i8* %105 to double*, !dbg !3839
  %pointerref = load double, double* %coercion86, align 1, !dbg !3839, !tbaa !134, !alias.scope !81, !noalias !84
  call void @llvm.lifetime.end.p0i8(i64 noundef 48, i8* noundef nonnull %.sub) #94
  call fastcc void @julia_cascade_770([1 x [4 x [2 x double]]]* noalias nocapture nofree noundef nonnull writeonly sret([1 x [4 x [2 x double]]]) align 8 dereferenceable(64) %11, { i64, [1 x i64] } addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(16) %72, double %pointerref) #94, !dbg !3843
  %106 = shl i64 %102, 6, !dbg !3844
  %coercion88 = getelementptr i8, i8* %arrayptr59, i64 %106, !dbg !3848
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 1 dereferenceable(64) %coercion88, i8* noundef nonnull align 8 dereferenceable(64) %73, i64 noundef 64, i1 noundef false) #94, !dbg !3849, !noalias !3688
  %value_phi84 = add i64 %103, 1, !dbg !3750
  %exitcond200 = icmp eq i64 %103, %arraylen35, !dbg !3751
  br i1 %exitcond200, label %L331.preheader.loopexit, label %L281, !dbg !3752

L336:                                             ; preds = %L336.preheader, %L366
  %iv6 = phi i64 [ 0, %L336.preheader ], [ %iv.next7, %L366 ]
  %value_phi92188 = phi i64 [ %111, %L366 ], [ %value_phi49, %L336.preheader ]
  %value_phi91187 = phi i32 [ %113, %L366 ], [ 0, %L336.preheader ]
  %iv.next7 = add nuw nsw i64 %iv6, 1, !dbg !3853
  %107 = call i64 @llvm.cttz.i64(i64 %value_phi92188, i1 noundef true) #94, !dbg !3853, !range !2422
  %108 = trunc i64 %107 to i32, !dbg !3855
  %109 = add nuw nsw i32 %108, 1, !dbg !3856
  %110 = zext i32 %109 to i64, !dbg !3858
  %111 = lshr i64 %value_phi92188, %110, !dbg !3858
  %112 = icmp eq i32 %108, 63, !dbg !3858
  %113 = add i32 %109, %value_phi91187, !dbg !3860
  %114 = load i64, i64* inttoptr (i64 123246368836896 to i64*), align 32, !dbg !3862, !tbaa !77, !alias.scope !81, !noalias !84
  %115 = shl i32 %113, 9, !dbg !3865
  %116 = zext i32 %115 to i64, !dbg !3866
  %117 = inttoptr i64 %114 to i8*, !dbg !3870
  %118 = getelementptr i8, i8* %117, i64 %116, !dbg !3870
  %p.i121 = bitcast i8* %118 to i32*, !dbg !3871
  %v.i122184 = load atomic i32, i32* %p.i121 acquire, align 16, !dbg !3871
  %.not174185 = icmp eq i32 %v.i122184, 0, !dbg !3873
  br i1 %.not174185, label %L356.preheader, label %L366, !dbg !3835

L356.preheader:                                   ; preds = %L336
  br label %L356, !dbg !3874

L356:                                             ; preds = %L356.preheader, %L363
  %iv8 = phi i64 [ 0, %L356.preheader ], [ %iv.next9, %L363 ]
  %119 = trunc i64 %iv8 to i32
  %iv.next9 = add nuw nsw i64 %iv8, 1
  call void @llvm.lifetime.end.p0i8(i64 noundef 48, i8* noundef nonnull %.sub) #94
  call void asm sideeffect "pause", "~{memory}"() #98, !dbg !3875
  %120 = add i32 %119, 1, !dbg !3877
  %121 = icmp ult i32 %120, 65537, !dbg !3878
  br i1 %121, label %L363, label %L360, !dbg !3874

L360:                                             ; preds = %L356
  %122 = call fastcc i8 @julia_checktask_612(i32 zeroext %113) #94, !dbg !3880
  %123 = and i8 %122, 1, !dbg !3880
  %.not175 = icmp eq i8 %123, 0, !dbg !3880
  br i1 %.not175, label %L363, label %L366.loopexit, !dbg !3880

L363:                                             ; preds = %L360, %L356
  %v.i122 = load atomic i32, i32* %p.i121 acquire, align 16, !dbg !3871
  %.not174 = icmp eq i32 %v.i122, 0, !dbg !3873
  br i1 %.not174, label %L356, label %L366.loopexit, !dbg !3835

L366.loopexit:                                    ; preds = %L360, %L363
  br label %L366, !dbg !3832

L366:                                             ; preds = %L366.loopexit, %L336
  %124 = icmp eq i64 %111, 0, !dbg !3832
  %125 = select i1 %112, i1 true, i1 %124, !dbg !3832
  br i1 %125, label %L368.loopexit, label %L336, !dbg !3834

L368.loopexit:                                    ; preds = %L366
  br label %L368, !dbg !3881

L368:                                             ; preds = %L368.loopexit, %L331.preheader
  %v.i118 = atomicrmw or i64* %p.i, i64 %value_phi49 acq_rel, align 8, !dbg !3881
  call void @llvm.julia.gc_preserve_end(token %70) #94, !dbg !3884
  br label %L392, !dbg !3884

L383.lr.ph:                                       ; preds = %L142, %L84, %L76
  %126 = addrspacecast {} addrspace(10)* %2 to double addrspace(13)* addrspace(11)*
  %127 = addrspacecast {} addrspace(10)* %0 to [1 x [4 x [2 x double]]] addrspace(13)* addrspace(11)*
  %128 = bitcast [1 x [4 x [2 x double]]]* %6 to i8*
  %umax = call i64 @llvm.umax.i64(i64 %arraylen35, i64 noundef 1) #94, !dbg !3885
  br label %L383, !dbg !3885

L383:                                             ; preds = %L383, %L383.lr.ph
  %iv10 = phi i64 [ %iv.next11, %L383 ], [ 0, %L383.lr.ph ]
  %iv.next11 = add nuw nsw i64 %iv10, 1, !dbg !3887
  %129 = add nsw i64 %iv.next11, -1, !dbg !3887
  %arrayptr44170 = load double addrspace(13)*, double addrspace(13)* addrspace(11)* %126, align 16, !dbg !3887, !tbaa !245, !alias.scope !3890, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0
  %130 = getelementptr inbounds double, double addrspace(13)* %arrayptr44170, i64 %129, !dbg !3887
  %arrayref45 = load double, double addrspace(13)* %130, align 8, !dbg !3887, !tbaa !325, !alias.scope !81, !noalias !84
  call fastcc void @julia_cascade_787([1 x [4 x [2 x double]]]* noalias nocapture nofree noundef nonnull writeonly sret([1 x [4 x [2 x double]]]) align 8 dereferenceable(64) %6, {} addrspace(10)* noundef nonnull align 16 dereferenceable(40) %1, double %arrayref45) #94, !dbg !3891
  %arrayptr47171 = load [1 x [4 x [2 x double]]] addrspace(13)*, [1 x [4 x [2 x double]]] addrspace(13)* addrspace(11)* %127, align 16, !dbg !3892, !tbaa !245, !alias.scope !3890, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BSMatrix\7B2\2C\202\2C\20ComplexF64\2C\204\7D\7D !0
  %131 = getelementptr inbounds [1 x [4 x [2 x double]]], [1 x [4 x [2 x double]]] addrspace(13)* %arrayptr47171, i64 %129, i64 0, !dbg !3892
  %132 = bitcast [4 x [2 x double]] addrspace(13)* %131 to i8 addrspace(13)*, !dbg !3892
  call void @llvm.memcpy.p13i8.p0i8.i64(i8 addrspace(13)* noundef align 8 dereferenceable(64) %132, i8* noundef nonnull align 8 dereferenceable(64) %128, i64 noundef 64, i1 noundef false) #94, !dbg !3892, !tbaa !462, !alias.scope !2556, !noalias !3748
  %133 = add nuw nsw i64 %iv.next11, 1, !dbg !3893
  %exitcond.not = icmp eq i64 %iv.next11, %umax, !dbg !3896
  br i1 %exitcond.not, label %L392.loopexit1, label %L383, !dbg !3885

L392.loopexit:                                    ; preds = %idxend16
  br label %L392

L392.loopexit1:                                   ; preds = %L383
  br label %L392

L392:                                             ; preds = %L392.loopexit1, %L392.loopexit, %L368, %L72, %L25
  call void @llvm.lifetime.end.p0i8(i64 noundef 48, i8* noundef nonnull %.sub) #94
  ret void, !dbg !3602

oob:                                              ; preds = %L34
  %errorbox = alloca i64, align 8, !dbg !3619
  store i64 %iv.next13, i64* %errorbox, align 8, !dbg !3619, !noalias !3688
  %134 = addrspacecast {} addrspace(10)* %2 to {} addrspace(12)*, !dbg !3619
  call void @ijl_bounds_error_ints({} addrspace(12)* noundef %134, i64* noundef nonnull align 8 %errorbox, i64 noundef 1) #99, !dbg !3619
  unreachable, !dbg !3619

idxend:                                           ; preds = %L34
  %arrayptr162 = load double addrspace(13)*, double addrspace(13)* addrspace(11)* %19, align 16, !dbg !3619, !tbaa !245, !alias.scope !3890, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0
  %135 = getelementptr inbounds double, double addrspace(13)* %arrayptr162, i64 %22, !dbg !3619
  %arrayref = load double, double addrspace(13)* %135, align 8, !dbg !3619, !tbaa !325, !alias.scope !81, !noalias !84
  call fastcc void @julia_cascade_787([1 x [4 x [2 x double]]]* noalias nocapture nofree noundef nonnull writeonly sret([1 x [4 x [2 x double]]]) align 8 dereferenceable(64) %9, {} addrspace(10)* noundef nonnull align 16 dereferenceable(40) %1, double %arrayref) #94, !dbg !3897
  %arraylen12 = load i64, i64 addrspace(11)* %arraylen_ptr2, align 8, !dbg !3898, !tbaa !179, !range !182, !alias.scope !183, !noalias !184, !enzyme_type !271, !enzyme_inactive !0, !enzymejl_source_type_UInt64 !0, !enzymejl_byref_BITS_VALUE !0
  %inbounds13 = icmp ult i64 %22, %arraylen12, !dbg !3898
  br i1 %inbounds13, label %idxend16, label %oob14, !dbg !3898

oob14:                                            ; preds = %idxend
  %errorbox15 = alloca i64, align 8, !dbg !3898
  store i64 %iv.next13, i64* %errorbox15, align 8, !dbg !3898, !noalias !3688
  %136 = addrspacecast {} addrspace(10)* %0 to {} addrspace(12)*, !dbg !3898
  call void @ijl_bounds_error_ints({} addrspace(12)* noundef %136, i64* noundef nonnull align 8 %errorbox15, i64 noundef 1) #99, !dbg !3898
  unreachable, !dbg !3898

idxend16:                                         ; preds = %idxend
  %arrayptr18163 = load [1 x [4 x [2 x double]]] addrspace(13)*, [1 x [4 x [2 x double]]] addrspace(13)* addrspace(11)* %20, align 16, !dbg !3898, !tbaa !245, !alias.scope !3890, !noalias !184, !nonnull !0, !enzyme_type !329, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BSMatrix\7B2\2C\202\2C\20ComplexF64\2C\204\7D\7D !0
  %137 = getelementptr inbounds [1 x [4 x [2 x double]]], [1 x [4 x [2 x double]]] addrspace(13)* %arrayptr18163, i64 %22, i64 0, !dbg !3898
  %138 = bitcast [4 x [2 x double]] addrspace(13)* %137 to i8 addrspace(13)*, !dbg !3898
  call void @llvm.memcpy.p13i8.p0i8.i64(i8 addrspace(13)* noundef align 8 dereferenceable(64) %138, i8* noundef nonnull align 8 dereferenceable(64) %21, i64 noundef 64, i1 noundef false) #94, !dbg !3898, !tbaa !462, !alias.scope !2556, !noalias !3748
  %.not164 = icmp eq i64 %iv.next13, %arraylen, !dbg !3899
  %139 = add nuw nsw i64 %iv.next13, 1, !dbg !3669
  br i1 %.not164, label %L392.loopexit, label %L34, !dbg !3602
}

  %v.i = atomicrmw xchg i64* %p.i, i64 0 acq_rel, align 8, !dbg !193
 Active atomic inst not yet handled

Stacktrace:
 [1] _atomic_xchg!
   @ ~/.local/share/julia/packages/ThreadingUtilities/3z3g0/src/atomics.jl:33
 [2] _exchange_mask!
   @ ~/.local/share/julia/packages/PolyesterWeave/E9Wdf/src/request.jl:67
 [3] __request_threads
   @ ~/.local/share/julia/packages/PolyesterWeave/E9Wdf/src/request.jl:89
 [4] _request_threads
   @ ~/.local/share/julia/packages/PolyesterWeave/E9Wdf/src/request.jl:61
 [5] request_threads
   @ ~/.local/share/julia/packages/PolyesterWeave/E9Wdf/src/request.jl:121
 [6] request_threads
   @ ~/.local/share/julia/packages/PolyesterWeave/E9Wdf/src/request.jl:128
 [7] batch
   @ ~/.local/share/julia/packages/Polyester/eqrC9/src/batch.jl:308
 [8] macro expansion
   @ ~/.local/share/julia/packages/Polyester/eqrC9/src/closure.jl:456
 [9] cascade!
   @ ~/Dropbox/Projects/Julia/wblna_optim_code_for_paper/src/enzyme_mwe.jl:19


Stacktrace:
  [1] count_ones
    @ ./int.jl:415 [inlined]
  [2] __request_threads
    @ ~/.local/share/julia/packages/PolyesterWeave/E9Wdf/src/request.jl:90 [inlined]
  [3] _request_threads
    @ ~/.local/share/julia/packages/PolyesterWeave/E9Wdf/src/request.jl:61 [inlined]
  [4] request_threads
    @ ~/.local/share/julia/packages/PolyesterWeave/E9Wdf/src/request.jl:121 [inlined]
  [5] request_threads
    @ ~/.local/share/julia/packages/PolyesterWeave/E9Wdf/src/request.jl:128 [inlined]
  [6] batch
    @ ~/.local/share/julia/packages/Polyester/eqrC9/src/batch.jl:308 [inlined]
  [7] macro expansion
    @ ~/.local/share/julia/packages/Polyester/eqrC9/src/closure.jl:456 [inlined]
  [8] cascade!
    @ ~/Dropbox/Projects/Julia/wblna_optim_code_for_paper/src/enzyme_mwe.jl:19
  [9] objective!
    @ ~/Dropbox/Projects/Julia/wblna_optim_code_for_paper/src/enzyme_mwe.jl:27 [inlined]
 [10] diffejulia_objective__578wrap
    @ ~/Dropbox/Projects/Julia/wblna_optim_code_for_paper/src/enzyme_mwe.jl:0
 [11] macro expansion
    @ ~/.local/share/julia/packages/Enzyme/ydGh2/src/compiler.jl:5218 [inlined]
 [12] enzyme_call
    @ ~/.local/share/julia/packages/Enzyme/ydGh2/src/compiler.jl:4764 [inlined]
 [13] CombinedAdjointThunk
    @ ~/.local/share/julia/packages/Enzyme/ydGh2/src/compiler.jl:4636 [inlined]
 [14] autodiff
    @ ~/.local/share/julia/packages/Enzyme/ydGh2/src/Enzyme.jl:503 [inlined]
 [15] autodiff
    @ ~/.local/share/julia/packages/Enzyme/ydGh2/src/Enzyme.jl:524 [inlined]
 [16] d_objective!(dzs::Vector{Float64}, zs::Vector{Float64}, A::Vector{SMatrix{2, 2, ComplexF64, 4}}, shadow_A::Vector{SMatrix{2, 2, ComplexF64, 4}}, L::Float64, freqs::Vector{Float64})
    @ Main ~/Dropbox/Projects/Julia/wblna_optim_code_for_paper/src/enzyme_mwe.jl:35
 [17] var"##core#233"()
    @ Main ~/.local/share/julia/packages/BenchmarkTools/QNsku/src/execution.jl:561
 [18] var"##sample#234"(::Tuple{}, __params::BenchmarkTools.Parameters)
    @ Main ~/.local/share/julia/packages/BenchmarkTools/QNsku/src/execution.jl:570
 [19] _lineartrial(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; maxevals::Int64, kwargs::@Kwargs{})
    @ BenchmarkTools ~/.local/share/julia/packages/BenchmarkTools/QNsku/src/execution.jl:187
 [20] _lineartrial(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters)
    @ BenchmarkTools ~/.local/share/julia/packages/BenchmarkTools/QNsku/src/execution.jl:182
 [21] #invokelatest#2
    @ ./essentials.jl:892 [inlined]
 [22] invokelatest
    @ ./essentials.jl:889 [inlined]
 [23] #lineartrial#46
    @ ~/.local/share/julia/packages/BenchmarkTools/QNsku/src/execution.jl:51 [inlined]
 [24] lineartrial
    @ ~/.local/share/julia/packages/BenchmarkTools/QNsku/src/execution.jl:50 [inlined]
 [25] tune!(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; progressid::Nothing, nleaves::Float64, ndone::Float64, verbose::Bool, pad::String, kwargs::@Kwargs{})
    @ BenchmarkTools ~/.local/share/julia/packages/BenchmarkTools/QNsku/src/execution.jl:300
 [26] tune!
    @ ~/.local/share/julia/packages/BenchmarkTools/QNsku/src/execution.jl:289 [inlined]
 [27] tune!(b::BenchmarkTools.Benchmark)
    @ BenchmarkTools ~/.local/share/julia/packages/BenchmarkTools/QNsku/src/execution.jl:289
 [28] top-level scope
    @ ~/.local/share/julia/packages/BenchmarkTools/QNsku/src/execution.jl:447
in expression starting at /home/kiran/Dropbox/Projects/Julia/wblna_optim_code_for_paper/src/enzyme_mwe.jl:49

@wsmoses
Member

wsmoses commented Dec 22, 2024

We should definitely make the first error message nicer, but as for the 1.10 error, I think that's equivalent to #2208.

@wsmoses
Member

wsmoses commented Dec 23, 2024

I believe this should now be resolved on main by the jll bump. Please reopen if not!
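
For anyone who wants to check this before a tagged release, here is a minimal sketch of testing against the development branch. It assumes the fix is on the `main` branch of Enzyme.jl, as the comment says, and that the reproducer from the top of this issue is saved as `enzyme_mwe.jl` (the file name that appears in the stack trace). The temporary environment is only a convenience so the development version does not end up in your default environment.

```julia
using Pkg

# Use a throwaway environment so the default one is left untouched.
Pkg.activate(temp=true)

# Assumption: the fix lives on Enzyme.jl's `main` branch, per the comment above.
Pkg.add(name="Enzyme", rev="main")

# The other packages the reproducer loads.
Pkg.add(["Polyester", "StatsBase", "StaticArrays"])

# Re-run the reproducer from the issue description (file name taken from the stack trace).
include("enzyme_mwe.jl")
```

If the segfault or the `Active atomic inst not yet handled` error still appears, include the output of `Pkg.status()` when reporting back so the exact Enzyme and Enzyme_jll versions are visible.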

@wsmoses closed this as completed Dec 23, 2024