Learning with Policy Gradients in a Full Reinforcement Learning Environment Provided by OpenAI Gym

Overview

Policy gradients are a family of reinforcement learning algorithms. In this experiment we consider a full RL problem: the environment has several states, and each action must be chosen with regard not only to the immediate reward but also to the rewards it leads to in the long run. To learn an optimal policy, we therefore have to account for the temporal dynamics of the environment.
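One common way to account for long-run rewards is to weight each step of an episode by its discounted return rather than by its immediate reward. The snippet below is a minimal NumPy sketch of that idea, not the notebook's actual code: the function name discount_rewards and the default discount factor gamma=0.99 are assumptions made for illustration.

```python
import numpy as np


def discount_rewards(rewards, gamma=0.99):
    """Compute the discounted return G_t = r_t + gamma * G_{t+1} for every step.

    `gamma` is a hypothetical default chosen for this sketch.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    returns = np.zeros_like(rewards)
    running = 0.0
    # Walk backwards through the episode so each step accumulates
    # the (discounted) rewards that follow it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# A three-step episode with a reward of 1 per step.
episode_rewards = [1.0, 1.0, 1.0]
# With gamma=0.5 the returns are 1.75, 1.5 and 1.0.
print(discount_rewards(episode_rewards, gamma=0.5))
```

Earlier steps receive larger returns because more future reward follows them, which is what lets a policy gradient update credit an action for its long-term consequences.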

Dependencies

Jupyter Notebook
NumPy
OpenAI Gym (https://gym.openai.com/docs/)
TensorFlow (https://www.tensorflow.org/install/)

Credits

Most of the conceptual and programmatic understanding is borrowed from the Reinforcement Learning series by Arthur Juliani.

withai/Policy-Gradients-Full-RL-CartPole