Skip to content

Deep Reinforcement Learning for mobile robot navigation in IR-SIM simulation. Using DRL (SAC, TD3, PPO, DDPG) neural networks, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.

Notifications You must be signed in to change notification settings

reiniscimurs/DRL-robot-navigation-IR-SIM

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

81 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DRL Robot navigation in IR-SIM

Deep Reinforcement Learning algorithm implementation for simulated robot navigation in IR-SIM. Using 2D laser sensor data and information about the goal point a robot learns to navigate to a specified point in the environment.

Example

Installation

  • Package versioning is managed with poetry
    pip install poetry
  • Clone the repository
    git clone https://github.com/reiniscimurs/DRL-robot-navigation.git
  • Navigate to the cloned location and install using poetry
    poetry install

Training the model

  • Run the training by executing the train.py file
    poetry run python robot_nav/train.py

  • To open tensorbord, in a new terminal execute
    tensorboard --logdir runs

Sources

Package Description Source
IR-SIM Light-weight robot simulator https://github.com/hanruihua/ir-sim

Models

Model Description Model Source
TD3 Twin Delayed Deep Deterministic Policy Gradient model https://github.com/reiniscimurs/DRL-Robot-Navigation-ROS2
SAC Soft Actor-Critic model https://github.com/denisyarats/pytorch_sac
PPO Proximal Policy Optimization model https://github.com/nikhilbarhate99/PPO-PyTorch
DDPG Deep Deterministic Policy Gradient model Updated from TD3
CNNTD3 TD3 model with 1D CNN encoding of laser state -
RCPG Recurrent Convolution Policy Gradient - adding recurrence layers (lstm/gru/rnn) to CNNTD3 model -

Max Upper Bound Models

Models that support the additional loss of Q values exceeding the maximal possible Q value in the episode. Q values that exceed this upper bound are used to calculate a loss for the model. This helps to control the overestimation of Q values in off-policy actor-critic networks. To enable max upper bound loss set use_max_bound = True when initializing a model.

Model
TD3
DDPG
CNNTD3

About

Deep Reinforcement Learning for mobile robot navigation in IR-SIM simulation. Using DRL (SAC, TD3, PPO, DDPG) neural networks, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages