Hi,
First and foremost, thanks for sharing the code. This is greatly appreciated.
I'm currently testing ARS in other learning environments and found that in very difficult environments users of the code may hit a divide-by-zero error, particularly early in the learning process (i.e., when every initial rollout returns zero reward). In that case the standard deviation of the rollout rewards is zero, so the normalization step divides by zero:
    # normalize rewards by their standard deviation
    rollout_rewards /= np.std(rollout_rewards)
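One possible guard, as a minimal sketch rather than a proposed patch: skip the division when the standard deviation is (near) zero. The helper name `normalize_rewards` and the `eps` threshold are hypothetical and not part of the repository; with NumPy's default error settings the unguarded division typically shows up as a `RuntimeWarning` and NaN values rather than an exception.

```python
import numpy as np


def normalize_rewards(rollout_rewards, eps=1e-8):
    """Divide rewards by their standard deviation, unless it is ~0.

    `normalize_rewards` and `eps` are illustrative names (assumptions),
    not identifiers from the ARS code base.
    """
    rewards = np.asarray(rollout_rewards, dtype=np.float64)
    std = np.std(rewards)
    if std < eps:
        # All rollouts returned (almost) the same reward, e.g. all zeros:
        # leave the rewards unscaled instead of dividing by zero.
        return rewards
    return rewards / std
```

With a helper like this, the line above would become `rollout_rewards = normalize_rewards(rollout_rewards)`. Leaving the rewards unscaled is only one choice; skipping the update step entirely for that iteration would be another.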
Thanks,