Learning with Policy Gradients in a Full Reinforcement Learning Environment Provided by OpenAI Gym

Overview

Policy gradients are a family of reinforcement learning algorithms. This experiment considers a full RL problem: the environment has several states, and each action affects not only the immediate reward but also the rewards received later. An optimal policy must therefore account for the temporal dynamics of the environment.
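
One common way to make an agent weigh long-run rewards is to replace each step's reward with the discounted sum of all rewards that follow it. The sketch below illustrates that idea with NumPy only. It is not taken from this repository's notebook: the function name `discount_rewards`, the discount factor `gamma`, and the sample rewards are all assumptions chosen for illustration.

```python
import numpy as np


def discount_rewards(rewards, gamma=0.99):
    """Compute the discounted return G_t = r_t + gamma * G_{t+1} for each step.

    `rewards` is the list of per-step rewards from one episode.
    `gamma` (assumed value) controls how strongly future rewards count.
    """
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical episode: a reward of 1.0 at each of three steps.
episode_rewards = [1.0, 1.0, 1.0]
returns = discount_rewards(episode_rewards, gamma=0.5)
print(returns)
```

In a policy gradient update, returns like these typically weight the log-probability of each action taken, so actions followed by higher long-run reward become more likely.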

Dependencies

Jupyter Notebook
NumPy
OpenAI Gym (https://gym.openai.com/docs/)
TensorFlow (https://www.tensorflow.org/install/)

Credits

Most of the concepts and code follow the Reinforcement Learning series by Arthur Juliani.
