Implement RL model for basic SubjuGator tasks #1325

Open
danushsingla opened this issue Feb 9, 2025 · 3 comments
@danushsingla

What needs to change?

We need to implement a basic RL algorithm for SubjuGator using the ROS2 simulation so that the robot can automatically learn how to perform tasks.

How would this task be tested?

  1. Download Python dependencies
  2. Run a stable_baselines3 script that connects to the ROS2 simulation (a minimal sketch is shown below)
  3. Analyze stats about training/evaluation
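
A minimal sketch of what such a script could look like, assuming a future Gymnasium-style wrapper around the ROS2 simulation. `SubjugatorEnv` is named here purely for illustration and does not exist yet, so a built-in continuous-control environment stands in for it:

```python
# Minimal training/evaluation loop with stable_baselines3 (PPO).
import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

# Once the ROS2 wrapper exists, swap this for: env = SubjugatorEnv()
env = gym.make("Pendulum-v1")  # stand-in continuous-control task

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)

# Step 3: basic training/evaluation stats
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
print(f"mean reward: {mean_reward:.2f} +/- {std_reward:.2f}")
model.save("ppo_subjugator_test")
```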
@danushsingla
Author

I met with Mohana, Will, Daniel, and Keith about our future steps.

I have concluded that we should use Stable Baselines3 and implement PPO through it; that will serve as the way we train and evaluate the model. The next goal is to read through the research papers outlined in Notion and decide what we should feed into the model.

I have outlined the following for Keith to provide. For now, this is the information we think we should be sending to the PPO algorithm:

Possible actions for the robot

- This can be anything the robot does, such as moving forward, backward, or sideways.
- If a movement is continuous (like different thrust levels for moving forward), please state that.

States for the robot

- I need information about what the robot is measuring.
- This can be something as simple as speed or the distance from a target.
- I essentially need to build the entire ROS2 world into a matrix of numbers.
- For these states, I also need the possible range of each value. If the robot has a top speed of 100 mph, state that along with its minimum value (which is negative if it can go backwards).
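
As a concrete illustration of how those actions, states, and ranges would end up in code, here is a rough Gymnasium environment skeleton; every number and dimension in it is a made-up placeholder until Keith supplies the real values:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class SubjugatorEnv(gym.Env):
    """Placeholder Gymnasium wrapper around the ROS2 sim; all ranges are example values."""

    def __init__(self):
        # Actions: e.g. continuous thrust commands (surge, sway, heave, yaw),
        # each normalized to [-1, 1] (negative = reverse).
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(4,), dtype=np.float32)
        # States: e.g. [speed (m/s), distance to target (m), heading error (rad)],
        # each with its own min/max -- this is why the ranges matter.
        self.observation_space = spaces.Box(
            low=np.array([-3.0, 0.0, -np.pi], dtype=np.float32),
            high=np.array([3.0, 50.0, np.pi], dtype=np.float32),
        )

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        return np.zeros(3, dtype=np.float32), {}

    def step(self, action):
        # The real implementation would publish the action to the ROS2 sim,
        # read sensors back, and compute a reward.
        obs = np.zeros(3, dtype=np.float32)
        reward, terminated, truncated = 0.0, False, False
        return obs, reward, terminated, truncated, {}
```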

@danushsingla danushsingla self-assigned this Feb 9, 2025
@willzoo willzoo self-assigned this Feb 9, 2025
@mohana-pamidi mohana-pamidi self-assigned this Feb 9, 2025
@willzoo

willzoo commented Feb 10, 2025

Over this week, I met with Danush, Mohana, and others to discuss the idea of using reinforcement learning algorithms to train SubjuGator as an alternative to writing missions manually. As Danush mentioned in his comment, we decided on PPO as the training algorithm that would work best for us, as opposed to TRPO or GRPO. Although the PPO algorithm is abstracted away by Stable Baselines3, the best first step is to read through the research papers on the Notion so that we understand how it works; I started on the PPO paper this week. Also, this is just my input, but I think it might be best to start by integrating the model with something simple like the ROS2 turtlesim as a test run, before trying to integrate it with something as complex as SubjuGator.
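
To make the turtlesim idea concrete, here is a rough sketch of how a Gymnasium environment could publish velocity commands to turtlesim and read its pose back over ROS2. The goal position, reward, and reset handling are illustrative only, not a finished design:

```python
import numpy as np
import rclpy
import gymnasium as gym
from gymnasium import spaces
from geometry_msgs.msg import Twist
from turtlesim.msg import Pose

class TurtleSimEnv(gym.Env):
    """Toy env: drive the turtle toward an arbitrary goal point."""

    def __init__(self):
        rclpy.init()
        self.node = rclpy.create_node("turtlesim_rl_env")
        self.cmd_pub = self.node.create_publisher(Twist, "/turtle1/cmd_vel", 10)
        self.node.create_subscription(Pose, "/turtle1/pose", self._pose_cb, 10)
        self.pose = Pose()
        self.goal = np.array([8.0, 8.0], dtype=np.float32)  # arbitrary goal
        # Actions: [linear velocity, angular velocity]; states: [x, y] in the ~11x11 world.
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)
        self.observation_space = spaces.Box(low=0.0, high=11.0, shape=(2,), dtype=np.float32)

    def _pose_cb(self, msg):
        self.pose = msg

    def step(self, action):
        msg = Twist()
        msg.linear.x = float(action[0])
        msg.angular.z = float(action[1])
        self.cmd_pub.publish(msg)
        rclpy.spin_once(self.node, timeout_sec=0.1)  # let the pose callback fire
        obs = np.array([self.pose.x, self.pose.y], dtype=np.float32)
        dist = float(np.linalg.norm(obs - self.goal))
        reward = -dist                      # closer to the goal = higher reward
        terminated = dist < 0.5
        return obs, reward, terminated, False, {}

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        # A real version would also call turtlesim's reset/teleport service here.
        return np.array([self.pose.x, self.pose.y], dtype=np.float32), {}
```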

@mohana-pamidi

During this week, I was also able to meet with the team and was introduced to the idea of using a modified version of the PPO reinforcement learning algorithm. To gain background on the algorithm and reinforcement learning in general, I began by reading the PPO research paper and saw how each of the requirements (robot states/behaviors) that Danush mentioned fits into and affects the algorithm. I was also able to understand the concept of the adaptive KL penalty coefficient, which is something we talked about implementing with SubjuGator. I agree with Will that we can test this with turtlesim first, since it will be easier to implement, and then migrate the software to the sub. Finally, I am not sure whether we have to build the model completely from scratch or whether there is open-source software we can build from and optimize, but if we plan to optimize it later, we would benefit from starting to think about which areas we could optimize.
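
For reference, the adaptive KL penalty from the PPO paper boils down to a small update rule on the penalty coefficient beta, sketched below with the paper's constants. Note that Stable Baselines3's PPO implements the clipped-surrogate variant (with an optional `target_kl` early-stopping check) rather than this penalty, so using the adaptive KL term would mean adding it ourselves:

```python
# Adaptive KL penalty update from the PPO paper (Schulman et al., 2017, Sec. 4).
# beta scales the KL penalty term in the objective; d_targ is the desired mean KL per update.
def update_kl_penalty(beta: float, mean_kl: float, d_targ: float) -> float:
    if mean_kl < d_targ / 1.5:
        beta /= 2.0   # policy barely moved -> weaken the penalty
    elif mean_kl > d_targ * 1.5:
        beta *= 2.0   # policy moved too far -> strengthen the penalty
    return beta
```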
