A pedagogical Proximal Policy Optimization (PPO) project applied to the classic Snake game.
This repository demonstrates how a PPO-based agent learns to play Snake by collecting food and avoiding collisions. Over time, the agent discovers a strategy of quickly circling around the reward before taking it, which reduces self-collisions since the agent does not precisely observe its entire tail position.
- The environment is implemented in `Game/Snake.py`.
- The PPO agent logic is in `Agent/PPO.py`.
- Usage examples and the training procedure are shown in `main.ipynb`.
- Clone this repository.
- Install the dependencies: `pip install -r requirements.txt`
- You can then run or modify `main.ipynb` to train or test the PPO agent.
PPO is a policy gradient method designed to stabilize training by limiting how far each update can move the policy. In this project:

- We use clipping (controlled by `clip_epsilon`) to avoid overly large policy updates.
- We incorporate entropy regularization (controlled by `entropy_coef`) to encourage exploration.
- We apply Generalized Advantage Estimation (GAE) for more stable advantage computation (a sketch of these pieces follows below).
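The sketch below illustrates the clipped surrogate loss with an entropy bonus and the GAE computation in plain NumPy. It is a simplified, hypothetical version for exposition; the actual implementation, function names, and hyperparameter defaults live in `Agent/PPO.py`.

```python
# Illustrative sketch only -- not the repository's code; see Agent/PPO.py.
import numpy as np

def compute_gae(rewards, values, dones, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout.

    `values` has one extra entry: the bootstrap value of the final state.
    """
    advantages = np.zeros(len(rewards))
    gae = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 1.0 - dones[t]
        delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
    return advantages

def clipped_policy_loss(new_log_probs, old_log_probs, advantages,
                        entropy, clip_epsilon=0.2, entropy_coef=0.01):
    """PPO clipped surrogate objective with an entropy bonus (to be minimized)."""
    ratio = np.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_epsilon, 1.0 + clip_epsilon) * advantages
    return -np.mean(np.minimum(unclipped, clipped)) - entropy_coef * np.mean(entropy)
```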
The core training code is in the `train` function of `PPOAgent`, and the environment loop lives in `SnakeEnv`.
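At a high level, each epoch collects a batch of transitions from `SnakeEnv` and then runs the PPO update on that batch. The outline below is only a rough, hypothetical sketch of that structure (the method names `agent.act`, `env.step`, `env.reset`, and `agent.update` are assumptions, not the repository's exact API); the authoritative version is `PPOAgent.train` in `Agent/PPO.py`.

```python
# Hypothetical outline of one training epoch -- method names are assumptions;
# see PPOAgent.train in Agent/PPO.py for the actual logic.
def run_epoch(env, agent, steps_per_epoch=4096):
    batch = []                                       # (state, action, reward, done, log_prob, value)
    state = env.reset()
    for _ in range(steps_per_epoch):
        action, log_prob, value = agent.act(state)   # sample an action from the current policy
        next_state, reward, done = env.step(action)  # advance the Snake environment one step
        batch.append((state, action, reward, done, log_prob, value))
        state = env.reset() if done else next_state
    agent.update(batch)                              # GAE + clipped policy/value updates on the batch
```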
Below is a demo of the trained agent playing, occasionally circling around the reward to avoid self-collisions (it does not know its tail's exact position; see `main.ipynb` for the observation space details):
The agent’s reward curve increases as it masters collecting food. Episode length first rises as the agent survives longer, then eventually decreases as it sacrifices longevity for quicker gains:
- Train the agent by running:

  ```python
  # Inside main.ipynb
  agent.train(total_epochs=10000, steps_per_epoch=4096)
  ```

- Test the trained agent (with optional rendering):

  ```python
  agent.test_episode(render=True)
  ```
Explore `main.ipynb` for more details on the experiments, and review the in-code comments for a deeper understanding.