This experiment learns an optimal policy with Policy Gradients for the full reinforcement learning problem in the "CartPole" environment from OpenAI Gym.

withai/Policy-Gradients-Full-RL-CartPole

Learning with Policy Gradients in a Full Reinforcement Learning environment provided by OpenAI Gym

Overview

Policy Gradients is a family of Reinforcement Learning algorithms. In this experiment we consider a full RL problem: the environment has many states, and each action must be chosen with regard not only to its immediate reward but also to the rewards it leads to in the long run. To learn an optimal policy, the agent therefore has to account for the temporal dynamics of the environment.
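One common way to make an action account for long-run reward is to credit each time step with the discounted sum of all rewards that follow it. The sketch below is illustrative only and is not taken from this repository's code: the function name `discount_rewards` and the discount factor `gamma` are assumptions chosen for the example.

```python
def discount_rewards(rewards, gamma=0.99):
    """Return the discounted return for every time step of one episode.

    `rewards` is the list of per-step rewards from a single episode.
    `gamma` is a hypothetical discount factor in [0, 1]; it is not a
    value taken from this repository.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates all later rewards.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# CartPole gives a reward of 1 per step, so earlier steps in a
# surviving episode receive more credit than later ones.
episode_returns = discount_rewards([1.0, 1.0, 1.0], gamma=0.5)
print(episode_returns)
```

In a policy-gradient update, these per-step returns would typically weight the log-probability of the action taken at each step, so that actions followed by longer survival are reinforced more strongly.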

Dependencies

Credits

Most of the conceptual and programmatic understanding is borrowed from the Reinforcement Learning Series by Arthur Juliani.
