Turing crash #2256
Just because GC errors are massively tricky (and often bugs in Julia itself), would you be able to get rid of the Distributions dependency?
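(As an aside, a hypothetical way to check whether this really is GC-related is to toggle collection around the failing gradient loop; `state` and `theta` below refer to the objects set up in the MWEs further down, and the diagnostic itself is only a sketch.)

```julia
# Sketch of a GC-focused check, assuming `state` and `theta` from the MWEs below.
# If the segfault vanishes with collection disabled, or shows up much sooner with
# forced collections, that points at a GC rooting issue rather than a plain
# miscompilation.
GC.enable(false)           # crash should disappear if GC is the culprit
for _ in 1:500
    state.hamiltonian.∂ℓπ∂θ(theta)
end
GC.enable(true)

for _ in 1:500
    state.hamiltonian.∂ℓπ∂θ(theta)
    GC.gc(false)           # frequent incremental collections to provoke it
end
```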
Progress, though still not very minimal. Now it only depends on DPPL and Distributions:

```julia
module MWE
using Distributions
using Enzyme
using Turing: DynamicPPL
using ADTypes
using Random
Random.seed!(23)
mode = Enzyme.WithPrimal(Enzyme.set_runtime_activity(Enzyme.Reverse))
function data_poly(x)
    return hcat([x]...)
end

function model(__model__, __varinfo__, __context__, x::Any;)
    X = data_poly(x)
    (var"##value#229", __varinfo__) = (DynamicPPL.tilde_assume!!)(__context__, Normal(0, 1), DynamicPPL.VarName{:σ}(), __varinfo__)
    var"##retval#231" = for i = 1:1000
        mu = X[i, 1]
        dist = Uniform(mu, mu + 1.0)
        DynamicPPL.tilde_observe!!(__context__, dist, 0.0, __varinfo__)
    end
    return (0.0, __varinfo__)
end

function model(x::Any;)
    return DynamicPPL.Model(model, NamedTuple{(:x,)}((x,));)
end

struct EnzymeGradientLogDensity{L,M<:Union{Enzyme.ForwardMode,Enzyme.ReverseMode},S}
    ℓ::L
    mode::M
    shadow::S # only used in forward mode
end

function logdensity_and_gradient(ldf, x)
    ∂ℓ_∂x = zero(x)
    _, y = Enzyme.autodiff(mode, logdensity, Enzyme.Active,
        Enzyme.Const(ldf), Enzyme.Duplicated(x, ∂ℓ_∂x))
    y, ∂ℓ_∂x
end

struct LogDensityFunction{V,M,C}
    "varinfo used for evaluation"
    varinfo::V
    "model used for evaluation"
    model::M
    "context used for evaluation; if `nothing`, `leafcontext(model.context)` will be used when applicable"
    context::C
end

function logdensity(f::LogDensityFunction, θ::AbstractVector)
    context = f.context
    vi_new = DynamicPPL.unflatten(f.varinfo, context, θ)
    return DynamicPPL.getlogp(last(DynamicPPL.evaluate!!(f.model, vi_new, context)))
end

function initialstep(model, vi)
    ldf = LogDensityFunction(
        vi,
        model,
        DynamicPPL.leafcontext(model.context),
    )
    ∂logπ∂θ(x) = logdensity_and_gradient(ldf, x)
    # function logp(x)
    #     vi = DynamicPPL.unflatten(vi, x)
    #     logp = DynamicPPL.evaluate!!(model, vi, DynamicPPL.SamplingContext(Random.default_rng(), DynamicPPL.SampleFromPrior(), DynamicPPL.leafcontext(model.context)),)[2].logp[]
    #     return logp
    # end
    #
    # function ∂logπ∂θ(x)
    #     return Enzyme.autodiff(mode, Enzyme.Const(logp), Enzyme.Active, Enzyme.Duplicated(x, zero(x)))
    # end
    hamiltonian = (; ∂ℓπ∂θ=∂logπ∂θ)
    return (; hamiltonian=hamiltonian)
end

x = rand(Normal(0, 1.5), 1000)
m = model(x)
vi_original = DynamicPPL.VarInfo(m)
state = initialstep(
    m,
    vi_original;
)
theta = [1.2260841057562286]
for _ in 1:5000
    state.hamiltonian.∂ℓπ∂θ(theta)
end
end
```

Will have to attend to other stuff now.
Bumping here @mhauru, if you have a chance to reduce further?
Okay, finally a version that doesn't depend on anything outside of the stdlib and Enzyme:

```julia
module MWE
using Enzyme
using Random
Random.seed!(23)
struct Uniform{T<:Real}
    a::T
    b::T
    Uniform{T}(a::Real, b::Real) where {T<:Real} = new{T}(a, b)
end

function Uniform(a::T, b::T; check_args::Bool=true) where {T<:Real}
    return Uniform{T}(a, b)
end

function insupport(d::Uniform, x::Real)
    return d.a <= x <= d.b
end

function logpdf(d::Uniform, x::Real)
    a, b, _ = promote(d.a, d.b, x)
    val = -log(b - a)
    return insupport(d, x) ? val : -Inf
end

struct Metadata{TIdcs,TVal}
    idcs::TIdcs
    ranges::Vector{UnitRange{Int}}
    vals::TVal
end

struct VarInfo{Tmeta,Tlogp}
    metadata::Tmeta
    logp::Base.RefValue{Tlogp}
    num_produce::Base.RefValue{Int}
end

function acclogp!!(vi::VarInfo, logp)
    vi.logp[] += logp
    return vi
end

getlogp(vi::VarInfo) = vi.logp[]

mode = Enzyme.WithPrimal(Enzyme.set_runtime_activity(Enzyme.Reverse))

function data_poly(x)
    return hcat([x]...)
end

function model_func(__varinfo__, x::Any;)
    X = data_poly(x)
    var"##retval#231" = for i = 1:1000
        mu = X[i, 1]
        right = Uniform(mu, mu + 1.0)
        logp = logpdf(right, 0.0)
        acclogp!!(__varinfo__, logp)
    end
    return (0.0, __varinfo__)
end

function logdensity_and_gradient(ldf, x)
    ∂ℓ_∂x = zero(x)
    _, y = Enzyme.autodiff(mode, logdensity, Enzyme.Active,
        Enzyme.Const(ldf), Enzyme.Duplicated(x, ∂ℓ_∂x))
    y, ∂ℓ_∂x
end

function logdensity(model, θ::AbstractVector)
    model, varinfo = model
    varinfo.metadata.σ.vals[:] = θ
    return getlogp(last(model(varinfo, x)))
end

function initialstep(model, vi)
    ∂logπ∂θ(x) = logdensity_and_gradient((model, vi), x)
    hamiltonian = (; ∂ℓπ∂θ=∂logπ∂θ)
    return (; hamiltonian=hamiltonian)
end

struct VarName{sym,T}
    optic::T
end

x = (rand(1000) .- 0.5) .* 3
vn = VarName{:σ,typeof(identity)}(identity)
md = Metadata(
    Dict(vn => 1),
    [1:1],
    [0.0],
)
vi_original = VarInfo((; σ=md), Ref(0.0), Ref(0))
state = initialstep(
    model_func,
    vi_original;
)
theta = [1.2260841057562286]
for _ in 1:500
    state.hamiltonian.∂ℓπ∂θ(theta)
end
println("Done")
end
```

This is really finicky, and also nondeterministic. I typically have to run it 10-40 times, sometimes more, for it to segfault once. Confirmed to crash on both
and
with Enzyme v0.13.30.
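Since the failure is nondeterministic, a small driver along these lines can be used to re-run the MWE in fresh Julia processes until one of them dies; `mwe.jl` is an assumed file name for the module above, not something from the original report.

```julia
# Hypothetical driver: keep launching the MWE in a fresh process until it crashes.
# Assumes the `module MWE ... end` block above is saved as `mwe.jl`.
for attempt in 1:50
    proc = run(ignorestatus(`julia --project=. mwe.jl`))
    if !success(proc)
        println("Crashed on attempt $attempt")
        break
    end
    println("Attempt $attempt finished cleanly")
end
```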
If it helps, here's the version that uses `Distributions.Uniform`:

```julia
module MWE
using Distributions: Distributions
using Enzyme
using Random
Random.seed!(23)
struct Metadata{TIdcs,TVal}
    idcs::TIdcs
    ranges::Vector{UnitRange{Int}}
    vals::TVal
end

struct VarInfo{Tmeta,Tlogp}
    metadata::Tmeta
    logp::Base.RefValue{Tlogp}
    num_produce::Base.RefValue{Int}
end

function acclogp!!(vi::VarInfo, logp)
    vi.logp[] += logp
    return vi
end

getlogp(vi::VarInfo) = vi.logp[]

mode = Enzyme.WithPrimal(Enzyme.set_runtime_activity(Enzyme.Reverse))

function data_poly(x)
    return hcat([x]...)
end

function model_func(__varinfo__, x::Any;)
    X = data_poly(x)
    var"##retval#231" = for i = 1:1000
        mu = X[i, 1]
        right = Distributions.Uniform(mu, mu + 1.0)
        logp = Distributions.logpdf(right, 0.0)
        acclogp!!(__varinfo__, logp)
    end
    return (0.0, __varinfo__)
end

function logdensity_and_gradient(ldf, x)
    ∂ℓ_∂x = zero(x)
    _, y = Enzyme.autodiff(mode, logdensity, Enzyme.Active,
        Enzyme.Const(ldf), Enzyme.Duplicated(x, ∂ℓ_∂x))
    y, ∂ℓ_∂x
end

function logdensity(model, θ::AbstractVector)
    model, varinfo = model
    varinfo.metadata.σ.vals[:] = θ
    return getlogp(last(model(varinfo, x)))
end

function initialstep(model, vi)
    ∂logπ∂θ(x) = logdensity_and_gradient((model, vi), x)
    hamiltonian = (; ∂ℓπ∂θ=∂logπ∂θ)
    return (; hamiltonian=hamiltonian)
end

struct VarName{sym,T}
    optic::T
end

x = (rand(1000) .- 0.5) .* 3
vn = VarName{:σ,typeof(identity)}(identity)
md = Metadata(
    Dict(vn => 1),
    [1:1],
    [0.0],
)
vi_original = VarInfo((; σ=md), Ref(0.0), Ref(0))
state = initialstep(
    model_func,
    vi_original;
)
theta = [1.2260841057562286]
for _ in 1:500
    state.hamiltonian.∂ℓπ∂θ(theta)
end
println("Done")
end
```
It does, though by any chance can you get a version without the dependency on Distributions?
See the message above, #2256 (comment). It still crashes, just less frequently.
Ah, understood.
As discussed on Slack, this still crashes:
Example output:
This is another descendant of #1769, but distinct from #2197, which is now fixed.
I'll try to minimise this to get rid of any TuringLang dependencies.