Summary
Enzyme.gradient(set_runtime_activity(Reverse), Const(loss), x) returns an all-zero gradient when all of the following hold:
- A mutable struct instance holding a Vector{Float64} is captured by a closure.
- The loss constructs a new instance of the same struct whose Vector field aliases the captured buffer (i.e. it stores the same Vector object, no copy).
- The loss writes values computed from the active input into that buffer through the new struct.
- The loss reads sum(buffer).
Manually copying the buffer before mutation, which breaks the alias, gives the correct gradient, matching FiniteDiff to more than 8 significant figures. Plain Reverse (no runtime activity) raises EnzymeRuntimeActivityError on the aliasing version; set_runtime_activity(Reverse) runs through and silently returns zeros.
This was reduced from a SciML / ModelingToolkit MTKParameters case where the closure captures iprob.p, a repack callback returns a new MTKParameters whose caches::Tuple{Vector{Float64}} field aliases iprob.p.caches, and solve! then mutates p_new.caches[1]. The standalone version below has no SciML dependencies.
MWE
using Enzyme, FiniteDiff

mutable struct Holder
    v::Vector{Float64}
end

# Shared object; plays the role of the closure-captured struct in the original case.
const captured = Holder([0.0, 0.0, 0.0])

function loss_alias(t::Vector{Float64})
    h = Holder(captured.v)  # NEW struct, .v aliases captured.v
    for i in eachindex(h.v)
        h.v[i] = t[i]^2
    end
    return sum(h.v)
end

function loss_copy(t::Vector{Float64})
    h = Holder(copy(captured.v))  # break the alias
    for i in eachindex(h.v)
        h.v[i] = t[i]^2
    end
    return sum(h.v)
end

t0 = [1.0, 2.0, 3.0]
mode = set_runtime_activity(Reverse)

# Labels match the Output section below.
println("loss_alias(t0) = ", loss_alias(t0))
println("loss_copy(t0) = ", loss_copy(t0))
println("FD grad (alias) = ", FiniteDiff.finite_difference_gradient(loss_alias, t0))
println("FD grad (copy) = ", FiniteDiff.finite_difference_gradient(loss_copy, t0))
println("Enz grad (alias) = ", Enzyme.gradient(mode, Const(loss_alias), t0))
println("Enz grad (copy) = ", Enzyme.gradient(mode, Const(loss_copy), t0))
Output
loss_alias(t0) = 14.0
loss_copy(t0) = 14.0
FD grad (alias) = [2.0000000000471077, 3.9999999999475415, 5.999999999994649]
FD grad (copy) = [2.0000000000471077, 3.9999999999475415, 5.999999999994649]
Enz grad (alias) = ([0.0, 0.0, 0.0],) # WRONG
Enz grad (copy) = ([2.0, 4.0, 6.0],) # correct, matches FD
With plain Reverse (no runtime activity), the loss_alias call instead raises:
EnzymeRuntimeActivityError: Detected potential need for runtime activity.
... Failure within method: getproperty(::Holder, ::Symbol) ...
so the issue manifests as either a hard error (plain Reverse) or a silently wrong zero gradient (set_runtime_activity(Reverse)).
Expected
Enzyme.gradient(mode, Const(loss_alias), t0) should return ([2.0, 4.0, 6.0],), the same as the loss_copy version and in agreement with FiniteDiff. The captured Holder is Const from Enzyme's perspective, but the buffer inside it is being treated as the active storage for the gradient computation in this call, so Enzyme should follow the alias and accumulate into it.
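For reference, the expected value can be checked by hand. Both loss functions overwrite every entry of the buffer with t[i]^2 before summing, so the initial buffer contents do not contribute and the loss reduces to a sum of squares (writing L for the loss is notation introduced here, not taken from the MWE):

$$
L(t) = \sum_{i=1}^{3} t_i^2, \qquad \frac{\partial L}{\partial t_i} = 2 t_i, \qquad \nabla L(t_0) = (2,\ 4,\ 6) \quad \text{at } t_0 = (1,\ 2,\ 3).
$$

This also agrees with the reported loss value, L(t0) = 1 + 4 + 9 = 14.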
Workaround
Manually copy the buffer before storing it in the new struct, so the new struct does not alias any closure-captured storage.
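To confirm that a given construction really breaks the alias, the two buffers can be compared by identity. The snippet below is a minimal sketch in plain Julia with no Enzyme dependency; HolderDemo and shared are hypothetical names chosen here so the snippet does not clash with the MWE definitions.

```julia
# Hypothetical stand-in for the Holder struct from the MWE.
mutable struct HolderDemo
    v::Vector{Float64}
end

shared = HolderDemo([0.0, 0.0, 0.0])

aliased = HolderDemo(shared.v)        # stores the same Vector object
copied  = HolderDemo(copy(shared.v))  # stores a fresh Vector

# `===` on arrays tests object identity, not element equality.
println("aliased shares storage: ", aliased.v === shared.v)
println("copied shares storage:  ", copied.v === shared.v)

# A write through the aliased struct is visible in the shared buffer;
# a write through the copied struct is not.
aliased.v[1] = 1.0
copied.v[2] = 5.0
println("shared.v after writes:  ", shared.v)
```

The first check prints true and the second prints false, and shared.v ends up as [1.0, 0.0, 0.0]: only the write through the aliased struct reaches the shared buffer.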
Versions
- Julia: 1.12.6
- Enzyme: v0.13.150
- FiniteDiff: v2.31.0
- OS: Linux x86_64