Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly forming nested dual numbers. #671

Closed
vpuri3 opened this issue Nov 15, 2023 · 8 comments
Closed

Correctly forming nested dual numbers. #671

vpuri3 opened this issue Nov 15, 2023 · 8 comments

Comments

@vpuri3
Copy link

vpuri3 commented Nov 15, 2023

MWE

using ForwardDiff
f = x -> x .^ 2                                                                               
x = [1.0, 2.0, 3.0, 4.0]                                                                               
v = ones(4)                                                                     
# 1st order (works)                                                                     
y = ForwardDiff.Dual.(x, v)                                                     
fy = f(y)                                                                       
@assert ForwardDiff.value.(fy) == [1.0, 4.0, 9.0, 16.0]                         
@assert ForwardDiff.partials.(fy, 1) == [2.0, 4.0, 6.0, 8.0]
# 2nd order (incorrect 2nd derv)
using ForwardDiff: Dual, partials, value
z = Dual.(Dual.(x, v), Dual.(v, v))
fz = f(z)
4-element Vector{Dual{Nothing, Dual{Nothing, Float64, 1}, 1}}:
  Dual{Nothing}(Dual{Nothing}(1.0,2.0),Dual{Nothing}(2.0,4.0))
  Dual{Nothing}(Dual{Nothing}(4.0,4.0),Dual{Nothing}(4.0,6.0))
  Dual{Nothing}(Dual{Nothing}(9.0,6.0),Dual{Nothing}(6.0,8.0))
 Dual{Nothing}(Dual{Nothing}(16.0,8.0),Dual{Nothing}(8.0,10.0))

Notice that the 2nd derivative (which should be [2.0, 2.0, 2.0, 2.0]

julia> partials.(partials.(fz, 1), 1)
4-element Vector{Float64}:
  4.0
  6.0
  8.0
 10.0

is off exactly by the first derivative.

julia> partials.(partials.(fz, 1), 1) - partials.(value.(fz), 1)
4-element Vector{Float64}:
 2.0
 2.0
 2.0
 2.0

I noticed this pattern in another function: f = x -> exp.(x):

4-element Vector{Dual{Nothing, Dual{Nothing, Float64, 1}, 1}}:
  Dual{Nothing}(Dual{Nothing}(2.718281828459045,2.718281828459045),Dual{Nothing}(2.718281828459045,5.43656365691809))
  Dual{Nothing}(Dual{Nothing}(7.38905609893065,7.38905609893065),Dual{Nothing}(7.38905609893065,14.7781121978613))
 Dual{Nothing}(Dual{Nothing}(20.085536923187668,20.085536923187668),Dual{Nothing}(20.085536923187668,40.171073846375336))
 Dual{Nothing}(Dual{Nothing}(54.598150033144236,54.598150033144236),Dual{Nothing}(54.598150033144236,109.19630006628847))
@vpuri3
Copy link
Author

vpuri3 commented Nov 15, 2023

Update: z = Dual.(Dual.(x, 1v), Dual.(1v, 0v)) gives the right results.

For x -> x .^ 2

4-element Vector{Dual{Nothing, Dual{Nothing, Float64, 1}, 1}}:
  Dual{Nothing}(Dual{Nothing}(1.0,2.0),Dual{Nothing}(2.0,2.0))
  Dual{Nothing}(Dual{Nothing}(4.0,4.0),Dual{Nothing}(4.0,2.0))
  Dual{Nothing}(Dual{Nothing}(9.0,6.0),Dual{Nothing}(6.0,2.0))
 Dual{Nothing}(Dual{Nothing}(16.0,8.0),Dual{Nothing}(8.0,2.0))

For x -> exp.(x)

julia> include("examples/burgers_fourier/visc_burg_param_ic/autodecode.jl")
4-element Vector{Dual{Nothing, Dual{Nothing, Float64, 1}, 1}}:
  Dual{Nothing}(Dual{Nothing}(2.718281828459045,2.718281828459045),Dual{Nothing}(2.718281828459045,2.718281828459045))
  Dual{Nothing}(Dual{Nothing}(7.38905609893065,7.38905609893065),Dual{Nothing}(7.38905609893065,7.38905609893065))
 Dual{Nothing}(Dual{Nothing}(20.085536923187668,20.085536923187668),Dual{Nothing}(20.085536923187668,20.085536923187668))
 Dual{Nothing}(Dual{Nothing}(54.598150033144236,54.598150033144236),Dual{Nothing}(54.598150033144236,54.598150033144236))

@mcabbott
Copy link
Member

I think this is perturbation confusion. Making Dual{Nothing} means you are circumventing this package's mechanisms for keeping track of multiple perturbations. The user-facing functions seem to work fine:

julia> ForwardDiff.derivative(x -> ForwardDiff.derivative(f, x), 3.0)
2.0

julia> ForwardDiff.derivative(x -> ForwardDiff.derivative(f, x), 4.0)
2.0

@vpuri3
Copy link
Author

vpuri3 commented Nov 15, 2023

@mcabbott I am getting similar behavior with different tags.

using ForwardDiff                                                               
using ForwardDiff: Dual, value, partials                                        

f = x -> x .^ 2                                                                 
x = [1.0, 2.0, 3.0, 4.0]                                                        
v = ones(4)                                                                     

# 2st order                                                                     
z = Dual{:FD_D2Tag}.(                                                           
    Dual{:FD_D2TagInt}.(x, 1v),                                                 
    Dual{:FD_D2TagInt}.(1v, 1v)                                                 
)                                                                               
fz = f(z)                                                                       
fx = value.(value.(fz))                                                         
df = value.(partials.(fz, 1))                                                   
d2f = partials.(partials.(fz, 1), 1)                                            

display(fz)                                                                              
4-element Vector{Dual{:FD_D2Tag, Dual{:FD_D2TagInt, Float64, 1}, 1}}:
 Dual{:FD_D2Tag}(Dual{:FD_D2TagInt}(1.0,2.0),Dual{:FD_D2TagInt}(2.0,4.0))
 Dual{:FD_D2Tag}(Dual{:FD_D2TagInt}(4.0,4.0),Dual{:FD_D2TagInt}(4.0,6.0))
 Dual{:FD_D2Tag}(Dual{:FD_D2TagInt}(9.0,6.0),Dual{:FD_D2TagInt}(6.0,8.0))
 Dual{:FD_D2Tag}(Dual{:FD_D2TagInt}(16.0,8.0),Dual{:FD_D2TagInt}(8.0,10.0))

@mcabbott
Copy link
Member

mcabbott commented Nov 15, 2023

Maybe it's not perturbation confusion, just wrong inputs? If you print out what the perturbations created by the user-facing function, you get this:

julia> f(x) = x .^ 2; f(x::Real) = @show(x) ^ 2;

julia> ForwardDiff.derivative(x -> ForwardDiff.derivative(f, x), 3.0)
x = Dual{ForwardDiff.Tag{typeof(f), Dual{ForwardDiff.Tag{var"#41#42", Float64}, Float64, 1}}}(Dual{ForwardDiff.Tag{var"#41#42", Float64}}(3.0,1.0),Dual{ForwardDiff.Tag{var"#41#42", Float64}}(1.0,0.0))
2.0

julia> Dual{:b}.(Dual{:a}.(x, 1), Dual{:a}.(1, 0)) |> f
4-element Vector{Dual{:b, Dual{:a, Float64, 1}, 1}}:
  Dual{:b}(Dual{:a}(1.0,2.0),Dual{:a}(2.0,2.0))
  Dual{:b}(Dual{:a}(4.0,4.0),Dual{:a}(4.0,2.0))
  Dual{:b}(Dual{:a}(9.0,6.0),Dual{:a}(6.0,2.0))
 Dual{:b}(Dual{:a}(16.0,8.0),Dual{:a}(8.0,2.0))

julia> Dual.(Dual.(x, 1), 1) |> f
4-element Vector{Dual{Nothing, Dual{Nothing, Float64, 1}, 1}}:
  Dual{Nothing}(Dual{Nothing}(1.0,2.0),Dual{Nothing}(2.0,2.0))
  Dual{Nothing}(Dual{Nothing}(4.0,4.0),Dual{Nothing}(4.0,2.0))
  Dual{Nothing}(Dual{Nothing}(9.0,6.0),Dual{Nothing}(6.0,2.0))
 Dual{Nothing}(Dual{Nothing}(16.0,8.0),Dual{Nothing}(8.0,2.0))

@vpuri3
Copy link
Author

vpuri3 commented Nov 15, 2023

I see. So I was forming z incorrectly. It should be done like Dual.(Dual.(x, true), true) which is equivalent to what I was doing.

julia> Dual.(Dual.(x, true), true) == Dual.(Dual.(x, v), Dual.(v, v))
true

julia> Dual.(Dual.(x, true), true)
4-element Vector{Dual{Nothing, Dual{Nothing, Float64, 1}, 1}}:
 Dual{Nothing}(Dual{Nothing}(1.0,1.0),Dual{Nothing}(1.0,0.0))
 Dual{Nothing}(Dual{Nothing}(2.0,1.0),Dual{Nothing}(1.0,0.0))
 Dual{Nothing}(Dual{Nothing}(3.0,1.0),Dual{Nothing}(1.0,0.0))
 Dual{Nothing}(Dual{Nothing}(4.0,1.0),Dual{Nothing}(1.0,0.0))

Thanks @mcabbott for figuring this out

@vpuri3 vpuri3 changed the title Nested duals returns wrong second derivative Correctly forming nested dual numbers. Nov 15, 2023
vpuri3 added a commit to vpuri3/ForwardDiff.jl that referenced this issue Nov 15, 2023
Adding code for forming nested duals in the docs so folks don't make the mistakes I made in JuliaDiff#671
@vpuri3
Copy link
Author

vpuri3 commented Nov 15, 2023

made a PR to add docs describing how to form nested duals.

@mcabbott
Copy link
Member

which is equivalent to what I was doing.

No, this is not true. It's just that == on the tagged version of this package ignores duals. On master it does not:

julia> Dual.(Dual.(x, true), true)
4-element Vector{Dual{Nothing, Dual{Nothing, Float64, 1}, 1}}:
 Dual{Nothing}(Dual{Nothing}(1.0,1.0),Dual{Nothing}(1.0,0.0))
 Dual{Nothing}(Dual{Nothing}(2.0,1.0),Dual{Nothing}(1.0,0.0))
 Dual{Nothing}(Dual{Nothing}(3.0,1.0),Dual{Nothing}(1.0,0.0))
 Dual{Nothing}(Dual{Nothing}(4.0,1.0),Dual{Nothing}(1.0,0.0))

julia> Dual.(Dual.(x, v), Dual.(v, v))  # original guess above
4-element Vector{Dual{Nothing, Dual{Nothing, Float64, 1}, 1}}:
 Dual{Nothing}(Dual{Nothing}(1.0,1.0),Dual{Nothing}(1.0,1.0))
 Dual{Nothing}(Dual{Nothing}(2.0,1.0),Dual{Nothing}(1.0,1.0))
 Dual{Nothing}(Dual{Nothing}(3.0,1.0),Dual{Nothing}(1.0,1.0))
 Dual{Nothing}(Dual{Nothing}(4.0,1.0),Dual{Nothing}(1.0,1.0))

julia> Dual.(Dual.(x, true), true) == Dual.(Dual.(x, v), Dual.(v, v))  # on master
false

@vpuri3
Copy link
Author

vpuri3 commented Nov 15, 2023

I see. I've changed my code to use Dual.(Dual.(x, true), true). Thank you.

@vpuri3 vpuri3 closed this as completed Nov 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants