In this project, I trained two multi-head agents to solve the Platform Domain environment.
See Platform Domain for the details.
The first is multi-head DQN (MH-DQN for short). It is inspired by "Multi-Pass Q-Networks for Deep Reinforcement Learning with Parameterised Action Spaces".
The second is multi-head TD3 (MH-TD3 for short). It is a fusion of MH-DQN and TD3.
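As a rough illustration of what acting in a parameterised (hybrid) action space involves, the sketch below picks the discrete action with the highest estimated value and pairs it with that action's continuous parameter. The function name, the three-action layout, and the numbers are assumptions made for illustration; they are not taken from this repository's code.

```python
from typing import Sequence, Tuple


def select_hybrid_action(q_values: Sequence[float],
                         params: Sequence[float]) -> Tuple[int, float]:
    """Return (discrete action index, its continuous parameter).

    Hypothetical helper: q_values[i] is the estimated value of discrete
    action i, and params[i] is the continuous parameter proposed for it.
    """
    if len(q_values) != len(params):
        raise ValueError("q_values and params must have the same length")
    if len(q_values) == 0:
        raise ValueError("at least one action is required")
    # Greedy choice over the discrete actions.
    best = max(range(len(q_values)), key=lambda i: q_values[i])
    return best, params[best]


# Made-up numbers for three discrete actions.
action, parameter = select_hybrid_action([0.2, 0.7, 0.1], [0.5, -0.3, 0.9])
print(action, parameter)
```

Only the parameter belonging to the chosen discrete action is returned; the parameters proposed for the other actions are ignored at execution time.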
To set up your Python environment to run the code in this repository, follow the instructions below.
- Create (and activate) a new environment with Python 3.9.
  - Linux or Mac:
    conda create --name drlnd python=3.9
    source activate drlnd
  - Windows:
    conda create --name drlnd python=3.9
    activate drlnd
- Follow the instructions on the PyTorch web page to install PyTorch and its dependencies (PIL, numpy, ...). For Windows or Linux with CUDA 11.7:
  conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
- Install matplotlib:
  conda install -c conda-forge matplotlib
- Follow the instructions in Platform Domain to install the environment.
- Clone the repository, and navigate to your working folder. Then, install several dependencies.
  git clone https://github.com/eljandoubi/hybrid-RL-platfrom.git
  cd hybrid-RL-platfrom
  (Make sure that your path is different from saved/MH-DQN and saved/MH-TD3.)
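The check in the note above can be automated before launching a run. The following is a minimal sketch, assuming the two folder names are interpreted relative to the current working directory; the helper name and the example folder `runs/my_agent` are hypothetical and not part of the repository.

```python
from pathlib import Path

# Folders the note above asks you to avoid.
RESERVED_DIRS = ("saved/MH-DQN", "saved/MH-TD3")


def is_reserved_save_path(path: str) -> bool:
    """Return True if `path` points at one of the reserved folders."""
    target = Path(path).resolve()
    return any(target == Path(d).resolve() for d in RESERVED_DIRS)


print(is_reserved_save_path("saved/MH-DQN"))    # reserved folder
print(is_reserved_save_path("runs/my_agent"))   # hypothetical custom folder
```

Resolving both sides to absolute paths means that spellings such as `./saved/MH-DQN` or `saved/MH-DQN/` are also caught.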
You can train, test, and visualize an agent by following the instructions below.
For the MH-DQN agent with default settings, saved to your save folder path, run:
python solve_platform.py --path path
To switch to the MH-TD3 agent, run:
python solve_platform.py --agent_name MH_TD3 --path path
You can choose which task to execute (train, test, visu or all) with the --task argument.
For example, to visualize an agent (agent) saved in path:
python solve_platform.py --agent_name agent --task visu --path path
For more options, see the help:
python solve_platform.py -h
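To make the flags used above concrete, here is a minimal sketch of how a parser for them could be written with `argparse`. It is not the parser in solve_platform.py: the defaults (`MH_DQN`, `all`, `None`), the help strings, and the example folder `my_run` are assumptions chosen for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(prog="solve_platform.py")
    # Assumed default: the README runs MH-DQN when --agent_name is omitted.
    parser.add_argument("--agent_name", default="MH_DQN",
                        help="agent to use, e.g. MH_DQN or MH_TD3")
    # Assumed default "all"; the four choices come from the README.
    parser.add_argument("--task", default="all",
                        choices=["train", "test", "visu", "all"],
                        help="which task to execute")
    # Assumed default None; the real default path is not stated here.
    parser.add_argument("--path", default=None,
                        help="folder where the agent is saved or loaded")
    return parser


args = build_parser().parse_args(
    ["--agent_name", "MH_TD3", "--task", "visu", "--path", "my_run"])
print(args.agent_name, args.task, args.path)
```

Restricting `--task` with `choices` makes a mistyped task fail immediately with a usage message rather than partway through a run.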
To evaluate (task=test) and/or watch (task=visu) my optimal agent agent (the default --path contains my saved models):
python solve_platform.py --agent_name agent --task task
To train the optimal MH_TD3 (mean reward: 0.9975) or MH_DQN (mean reward: 0.9998) agent:
python solve_platform.py --agent_name agent --task train --path path