A Gymnasium environment for simulating and training reinforcement learning agents on the BlueROV2 underwater vehicle. This environment provides a realistic simulation of the BlueROV2's dynamics and supports various control tasks.
- Realistic Physics: Implements a validated hydrodynamic model of the BlueROV2
- 3D Visualization: Real-time 3D rendering using Meshcat
- Custom Rewards: Configurable reward functions for different tasks
- Disturbance Modeling: Includes environmental disturbances for realistic underwater conditions
- Stable-Baselines3 Compatible: Ready to use with popular RL frameworks
- Customizable Environment: Easy to modify for different underwater tasks
- (Future release: spawn multiple AUVs)
- Python ≥3.10
- uv (recommended) or pip
Using uv (recommended):
# Clone the repository
git clone https://github.com/gokulp01/bluerov2_gym.git
cd bluerov2_gym
# Create and activate a virtual environment
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install the package
uv pip install -e .
Using pip:
# Clone the repository
git clone https://github.com/gokulp01/bluerov2_gym.git
cd bluerov2_gym
# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install the package
pip install -e .
import gymnasium as gym
import bluerov2_gym
# Create the environment
env = gym.make("BlueRov-v0", render_mode="human")
# Reset the environment
observation, info = env.reset()
# Run a simple control loop
while True:
    # Take a random action
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()
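The loop above runs until it is interrupted. For quick experiments a bounded variant can be easier to work with. The sketch below assumes only the standard Gymnasium `reset`/`step` signatures; `run_episode`, `StubEnv`, and `random_policy` are hypothetical names introduced for illustration, and the stub stands in for the real environment so that the snippet runs without the simulator installed.

```python
import random


def run_episode(env, policy, max_steps=500):
    """Run one episode and return (total_reward, steps_taken)."""
    observation, info = env.reset()
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        if terminated or truncated:
            return total_reward, step
    return total_reward, max_steps


class StubEnv:
    """Hypothetical stand-in that mimics the Gymnasium reset/step signatures."""

    def __init__(self, episode_length=10):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return {"x": 0.0}, {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.episode_length
        # Constant reward of -1 per step, purely for demonstration
        return {"x": float(self.t)}, -1.0, terminated, False, {}


def random_policy(observation):
    # Four commands, assumed here to lie in [-1, 1]
    return [random.uniform(-1.0, 1.0) for _ in range(4)]


total, steps = run_episode(StubEnv(episode_length=10), random_policy)
```

With the real environment, the same helper could be called as `run_episode(env, lambda obs: env.action_space.sample())`.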
import gymnasium as gym
import bluerov2_gym  # importing the package registers BlueRov-v0
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize
# Create and wrap the environment
env = DummyVecEnv([lambda: gym.make("BlueRov-v0")])
env = VecNormalize(env)
# Initialize the agent
model = PPO("MultiInputPolicy", env, verbose=1)
# Train the agent
model.learn(total_timesteps=1_000_000)
# Save the trained model
model.save("bluerov_ppo")
The environment uses a Dictionary observation space containing:
- x, y, z: Position coordinates
- theta: Yaw angle
- vx, vy, vz: Linear velocities
- omega: Angular velocity
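Some downstream code (logging, classical controllers, custom networks) is easier to write against a flat state vector than a dictionary. The following is a minimal sketch of such a conversion. It assumes the dictionary keys are spelled exactly as in the list above and that each entry can be converted to a scalar; `OBS_KEYS` and `flatten_observation` are hypothetical names, not part of the package.

```python
# Assumed key names and ordering, taken from the list above
OBS_KEYS = ("x", "y", "z", "theta", "vx", "vy", "vz", "omega")


def flatten_observation(obs):
    """Return the observation as a flat list of floats in a fixed key order."""
    missing = [key for key in OBS_KEYS if key not in obs]
    if missing:
        raise KeyError(f"observation is missing keys: {missing}")
    return [float(obs[key]) for key in OBS_KEYS]


# Hand-written example values, not output from the simulator
example_obs = {
    "x": 1.0, "y": -2.0, "z": -5.0, "theta": 0.3,
    "vx": 0.1, "vy": 0.0, "vz": -0.05, "omega": 0.02,
}
state_vector = flatten_observation(example_obs)
```

If the environment returns each entry as a one-element array, take the element first (for example `obs[key][0]`) before converting. Gymnasium's `FlattenObservation` wrapper is an alternative when the key order does not matter to the caller.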
Continuous action space with 4 dimensions:
- Forward/Backward thrust
- Left/Right thrust
- Up/Down thrust
- Yaw rotation
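To make the four action components easier to reason about, a small helper can build the action from named commands. This is a sketch under two assumptions that should be checked against `env.action_space`: that the components are ordered as in the list above, and that each is bounded to [-1, 1]. The name `make_action` and the `limit` parameter are hypothetical.

```python
def make_action(surge=0.0, sway=0.0, heave=0.0, yaw=0.0, limit=1.0):
    """Build a 4-element action, clipping each command to [-limit, limit].

    Assumed order: forward/backward, left/right, up/down, yaw.
    """
    def clip(value):
        return max(-limit, min(limit, float(value)))

    return [clip(surge), clip(sway), clip(heave), clip(yaw)]


forward = make_action(surge=0.5)               # gentle forward thrust only
saturated = make_action(heave=-3.0, yaw=2.0)   # out-of-range commands get clipped
```

Before passing the result to `env.step`, it would typically be converted to an array of the dtype reported by `env.action_space`.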
The default reward function considers:
- Position error from target
- Velocity penalties
- Orientation error
Custom rewards can be implemented by extending the Reward class.
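The three terms listed above can be combined into a single scalar in many ways. The sketch below shows one plausible weighted form as a standalone function; it is not the package's actual reward, and it does not use the Reward class, whose interface is not described here. The function names, the weights, and the target format are all assumptions made for illustration.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def tracking_reward(state, target, w_pos=1.0, w_vel=0.1, w_yaw=0.5):
    """Negative weighted sum of position error, squared speed, and yaw error."""
    pos_error = math.sqrt(sum((state[k] - target[k]) ** 2 for k in ("x", "y", "z")))
    speed_sq = sum(state[k] ** 2 for k in ("vx", "vy", "vz"))
    # Wrapping keeps the yaw error small when the angles straddle +/- pi
    yaw_error = abs(wrap_angle(state["theta"] - target["theta"]))
    return -(w_pos * pos_error + w_vel * speed_sq + w_yaw * yaw_error)


# Hand-written example values
target = {"x": 0.0, "y": 0.0, "z": 0.0, "theta": 0.0}
state = {"x": 3.0, "y": 4.0, "z": 0.0, "theta": 0.0, "vx": 0.0, "vy": 0.0, "vz": 0.0}
reward = tracking_reward(state, target)
```

The reward is zero only when the vehicle sits at the target, at rest, with the target heading; the weights trade off how strongly each deviation is penalized.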
The examples directory contains several scripts demonstrating different uses:
- test.py: Basic environment testing with manual control, and evaluation with a trained model
- train.py: Training script using PPO
# Test environment with manual control
python examples/test.py
# Train an agent
python examples/train.py
The environment uses Meshcat for 3D visualization. When running with render_mode="human", a web browser window will open automatically showing the simulation. The visualization includes:
- Water surface effects
- Underwater environment
- ROV model
- Ocean floor with decorative elements (I am no good at this)
bluerov2_gym/
├── bluerov2_gym/                  # Main package directory
│   ├── assets/                    # 3D models and resources
│   └── envs/                      # Environment implementation
│       ├── core/                  # Core components
│       │   ├── dynamics.py        # Physics simulation
│       │   ├── rewards.py         # Reward functions
│       │   ├── state.py           # State management
│       │   └── visualization/
│       │       └── renderer.py    # 3D visualization
│       └── bluerov_env.py         # Main environment class
├── examples/                      # Example scripts
├── tests/                         # Test cases
└── README.md
The environment can be configured through various parameters:
- Physics parameters in dynamics.py
- Reward weights in rewards.py
- Visualization settings in renderer.py
If you use this environment in your research, please cite:
@article{puthumanaillam2024tabfieldsmaximumentropyframework,
title={TAB-Fields: A Maximum Entropy Framework for Mission-Aware Adversarial Planning},
author={Gokul Puthumanaillam and Jae Hyuk Song and Nurzhan Yesmagambet and Shinkyu Park and Melkior Ornik},
year={2024},
eprint={2412.02570},
archivePrefix={arXiv},
url={https://arxiv.org/abs/2412.02570}
}
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License.
- BlueRobotics for the BlueROV2 specifications
- OpenAI/Farama Foundation for the Gymnasium framework
- Meshcat for the visualization library
Gokul Puthumanaillam - @gokulp01 - [gokulp2@illinois.edu]
Project Link: https://github.com/gokulp01/bluerov2_gym