Hi and thanks for sharing the code.
I've tried running the training process on a different environment, such as BipedalWalkerHardcore-v2, but it does not seem to learn anything. I even tried different shift values, as noted in the code comments, but I still end up with a negative reward. Should we train for longer, or are there any hyperparameters that we are missing?