Divide by zero #1

pedronahum · 2018-03-28T19:12:07Z

Hi,
First and foremost, thanks for sharing the code. This is greatly appreciated.

I am currently testing ARS in other learning environments and found that in very difficult environments users of the code may hit a divide-by-zero error, particularly in the early stages of learning (i.e., zero reward in all the initial rollouts). The division in question:

# normalize rewards by their standard deviation
rollout_rewards /= np.std(rollout_rewards)

Thanks,
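For readers who want to reproduce the failure mode described above, here is a minimal sketch. The array shape and the all-zero rewards are assumptions chosen for illustration, not values taken from the ARS code. With NumPy float arrays the division does not raise an exception; it emits a RuntimeWarning and fills the result with NaN.

```python
import numpy as np

# Hypothetical batch of rollout rewards: every rollout returned zero reward.
rollout_rewards = np.zeros((4, 2))

# The standard deviation is exactly 0.0 when all rewards are identical.
std = np.std(rollout_rewards)

# Suppress the RuntimeWarning here so the effect can be inspected directly.
with np.errstate(divide="ignore", invalid="ignore"):
    normalized = rollout_rewards / std  # 0 / 0 -> NaN in every entry

print(std, np.isnan(normalized).all())
```

If NaN rewards like these are fed into a parameter update, the NaNs will typically propagate into the updated parameters.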

hari-sikchi · 2019-02-03T11:13:01Z

I have run into this kind of difficulty in every sparse-reward setting. Is ARS a good way to go for these optimization landscapes?

ashutoshtiwari13 · 2019-02-18T18:08:59Z

Can we use .clip(min=1e-2) to avoid that?
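A minimal sketch of the clipping idea, assuming the clip is meant to be applied to the standard deviation before dividing. The function name `normalize_rewards_clipped` and the `floor` parameter are hypothetical and do not come from the ARS code.

```python
import numpy as np

def normalize_rewards_clipped(rollout_rewards, floor=1e-2):
    """Divide rewards by their standard deviation, floored at `floor`.

    Hypothetical helper: the floor keeps the divisor strictly positive.
    """
    rewards = np.asarray(rollout_rewards, dtype=np.float64)
    # np.clip with a_max=None only enforces the lower bound.
    std = np.clip(np.std(rewards), floor, None)
    return rewards / std

# All-zero rewards now stay at zero instead of becoming NaN.
zeros_out = normalize_rewards_clipped(np.zeros((4, 2)))
# Rewards with std >= floor are normalized exactly as before.
normal_out = normalize_rewards_clipped(np.array([1.0, 3.0]))
```

One consequence of this choice: whenever the true standard deviation falls below the floor, the rewards are scaled by a fixed factor of 1/floor rather than normalized to unit spread.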

pedronahum · 2019-02-18T18:21:27Z

In my case, adding 1e-8 to the divisor did the trick...
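The epsilon fix might look like the following sketch. The function name `normalize_rewards_eps` and the `eps` parameter are hypothetical; only the value 1e-8 and the division by `np.std(rollout_rewards)` come from this thread.

```python
import numpy as np

def normalize_rewards_eps(rollout_rewards, eps=1e-8):
    """Divide rewards by (standard deviation + eps).

    Hypothetical helper: eps keeps the divisor nonzero when all
    rewards are identical.
    """
    rewards = np.asarray(rollout_rewards, dtype=np.float64)
    return rewards / (np.std(rewards) + eps)

# All-zero rewards: 0 / 1e-8 == 0, so no NaN is produced.
zeros_eps = normalize_rewards_eps(np.zeros((4, 2)))
# With a non-degenerate batch the result is almost unchanged.
normal_eps = normalize_rewards_eps(np.array([1.0, 3.0]))
```

Because 1e-8 is tiny relative to typical reward scales, normalization is effectively unchanged whenever the standard deviation is not close to zero.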

ashutoshtiwari13 · 2019-02-18T18:24:10Z

Yeah @pedronahum, that would do it too!
