Dueling dqn equation #1

Open
HencyChen opened this issue Jan 15, 2018 · 1 comment

Comments

@HencyChen

Thanks for offering this wonderful code. But I have a question.

  1. Why, in the combination part of the equation, does the advantage A need to have its average subtracted? I've already referred to the paper but still don't understand.
@HareshKarnan

^ Because many different pairs V(s) and A(s,a) can satisfy the equation Q(s,a) = V(s) + A(s,a). For example, for any constant c:

Q(s,a) = V(s) + A(s,a) = (V(s)+c) + (A(s,a)-c)

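So, to learn a unique V and A, you subtract the mean of the advantage over actions, which forces the advantages to average to zero in each state. V(s) then equals the mean of Q(s,a) over actions. (Subtracting the max instead of the mean, the other variant in the paper, is what makes the advantage of the optimal action 0.)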
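
The argument above can be checked numerically. The following is a minimal sketch in plain Python, not the code from this repository: the function name `dueling_q` and all the numbers are made up for illustration. It shows that the naive sum cannot tell a shifted (V, A) pair from the original, while the mean-subtracted form cancels any constant shift in A.

```python
def dueling_q(value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) using mean subtraction."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


# Hypothetical values for a single state with three actions.
v = 2.0
adv = [1.0, -1.0, 1.5]
c = 10.0  # arbitrary constant shift

# Naive combination Q = V + A: moving c from A into V leaves Q unchanged,
# so V and A cannot be recovered uniquely from Q.
naive_q = [v + a for a in adv]
shifted_naive_q = [(v + c) + (a - c) for a in adv]

# Mean-subtracted combination: a constant shift of A cancels out,
# so only V can account for a change in the overall level of Q.
q = dueling_q(v, adv)
q_shifted_adv = dueling_q(v, [a - c for a in adv])

print(naive_q == shifted_naive_q)  # same Q from two different (V, A) pairs
print(q == q_shifted_adv)          # shift in A has no effect on Q
print(sum(q) / len(q) == v)        # V(s) is the mean of Q(s, a) over actions
```

With mean subtraction, the network output for A is only meaningful up to its deviation from the mean, which is exactly the part that V cannot represent.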
