RL-car

Training a reinforcement learning agent for OpenAI's Car Racing environment.

Algorithms used

Deep Q-Learning[1]

We implement a Deep Q-Network and its forward pass in the DQN class in model.py. Our network takes a single frame as input.

The training loop for the Deep Q-Network is defined in deepq.py. The target network updates and the Deep Q-Learning step are defined in learning.py.
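As a rough illustration of what a Deep Q-Learning step and a target network update compute, here is a minimal, framework-free sketch. The function names, the plain-list representation of Q-values, the discount factor, and the dictionary of parameters are assumptions made for this example; the actual code in learning.py may be organised differently.

```python
import copy


def dqn_targets(rewards, dones, next_q_target, gamma=0.99):
    """Compute y = r + gamma * max_a' Q_target(s', a') for a batch.

    next_q_target holds one list of Q-values per transition, as produced by
    the target network for the next state. Terminal transitions do not
    bootstrap, so their target is just the reward.
    """
    targets = []
    for reward, done, q_next in zip(rewards, dones, next_q_target):
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets


def hard_update(online_params, target_params):
    """Overwrite the target network's parameters with a copy of the online ones.

    Parameters are modelled as a plain dict here (an assumption); with a deep
    learning framework this would be a state-dict copy instead.
    """
    target_params.clear()
    target_params.update(copy.deepcopy(online_params))
```

The online network is then regressed towards these targets, and `hard_update` would typically be called every fixed number of training steps so that the targets change slowly.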

The action space is defined in action.py. We experimented with several action sets and settled on the 7 discrete actions defined in that file.
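To show how a discrete action set can sit on top of the environment's continuous controls, here is a hypothetical sketch. CarRacing expects a control vector of steering, gas and brake; the seven vectors below are illustrative values only and are not taken from action.py, and `action_from_index` is a name invented for this example.

```python
# Each action is [steering, gas, brake], following CarRacing's continuous
# control vector: steering in [-1, 1], gas and brake in [0, 1].
# HYPOTHETICAL values: the real set is whatever action.py defines.
ACTIONS = [
    [0.0, 0.0, 0.0],   # coast
    [-1.0, 0.0, 0.0],  # steer left
    [1.0, 0.0, 0.0],   # steer right
    [0.0, 1.0, 0.0],   # accelerate
    [0.0, 0.0, 0.8],   # brake
    [-1.0, 0.5, 0.0],  # steer left with some gas
    [1.0, 0.5, 0.0],   # steer right with some gas
]


def action_from_index(index):
    """Map the index of the best Q-value to a control vector for the environment."""
    return list(ACTIONS[index])
```

With a set like this, the Q-network needs one output per entry in `ACTIONS`, and the chosen index is translated back into a control vector before it is passed to the environment.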

schedule.py defines the exploration-exploitation schedule. We start with a p_initial value of 1 so that the agent focuses on exploration early in training.
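One common way to realise such a schedule is a linear decay of the exploration probability combined with epsilon-greedy action selection. The sketch below assumes that shape; only p_initial = 1 comes from this README, while `p_final`, `total_steps`, and both function names are assumptions for illustration and may not match schedule.py.

```python
import random


def linear_schedule(step, total_steps, p_initial=1.0, p_final=0.1):
    """Linearly anneal the exploration probability from p_initial to p_final.

    After total_steps the value stays at p_final.
    """
    fraction = min(max(step / total_steps, 0.0), 1.0)
    return p_initial + fraction * (p_final - p_initial)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

At step 0 every action is random; as training progresses the agent increasingly picks the action with the highest predicted Q-value.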

Double Deep Q-Learning[2]

We implement a Double Deep Q-Network and its forward pass in the DQN class in model.py. As in the Deep Q-Learning experiment, the network takes a single frame as input.

The training loop for the Double Deep Q-Network is defined in deepq_double.py. The target network update and the Double Deep Q-Learning step are defined in learning_double.py.
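The difference from the plain Deep Q-Learning step lies in how the bootstrap value is chosen: the online network selects the next action and the target network evaluates it [2]. The sketch below shows that target computation in a framework-free form; the function name, the list-based inputs, and the discount factor are assumptions for this example and are not taken from learning_double.py.

```python
def double_dqn_targets(rewards, dones, next_q_online, next_q_target, gamma=0.99):
    """Compute y = r + gamma * Q_target(s', argmax_a' Q_online(s', a')).

    next_q_online and next_q_target hold one list of Q-values per transition,
    produced by the online and target networks for the same next state.
    """
    targets = []
    for reward, done, q_online, q_target in zip(
        rewards, dones, next_q_online, next_q_target
    ):
        if done:
            # Terminal transition: no bootstrapping.
            targets.append(reward)
            continue
        # Action selection uses the online network ...
        best_action = max(range(len(q_online)), key=q_online.__getitem__)
        # ... while its value is read from the target network.
        targets.append(reward + gamma * q_target[best_action])
    return targets
```

Decoupling selection from evaluation in this way is what reduces the overestimation that a single max over the target network's Q-values tends to produce.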

For this experiment, we use the same action space as in the Deep Q-Learning experiment.

We use the same exploration-exploitation schedule as in the Deep Q-Learning experiment.

Notable techniques

Replay buffer for storing the agent's past transitions
Target Q-network to stabilize Q-learning
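A replay buffer of the kind listed above can be sketched with the standard library alone. The class name, the `Transition` fields, and the uniform sampling strategy are assumptions for illustration; the repository's own buffer may store and sample transitions differently.

```python
import random
from collections import deque, namedtuple

# One stored step of interaction with the environment.
Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayBuffer:
    """Fixed-size memory that discards the oldest transitions when full."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        """Store one transition."""
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a uniformly random batch without replacement."""
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)
```

Sampling random batches from this memory, rather than training on consecutive frames, breaks the correlation between successive transitions.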

To install the gym environment

extract sdc_gym.zip
cd sdc_gym
pip install -e ".[box2d]"

To run the evaluation code

python evaluate_racing.py score

References

[1] Mnih et al., Human-level control through deep reinforcement learning, Nature, 2015.
[2] van Hasselt, Guez, and Silver, Deep Reinforcement Learning with Double Q-learning, AAAI, 2016.
