This repository contains the code accompanying a project paper written for a Master's module on reinforcement learning. The project explores and implements Proximal Policy Optimization (PPO) agents that learn to play Hex on a 7x7 board.
- Hex Game: Hex is a strategic board game with a large state space that requires long-term planning, which makes it a significant challenge for reinforcement learning.
- Experiments: Six experiments were conducted to evaluate different training methodologies, focusing on various reward shaping techniques and opponent configurations.
- Results: The results highlighted the agents’ ability to achieve high win rates against random opponents but revealed significant challenges in adaptability and strategic depth, particularly when switching roles or facing more complex strategies.
- Future Work: Future work should address these challenges by simplifying reward structures, extending training durations, and incorporating periodic evaluations to enhance learning efficacy and strategic proficiency.
- Torch: Used for the implementation and training of the PPO agents.
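
To make the Hex setting described above concrete, the following is a minimal sketch of how a win can be detected on a 7x7 board. It is not taken from this repository: the function name `has_won`, the board encoding (0 for empty, 1 and 2 for the players), and the convention that player 1 connects top to bottom while player 2 connects left to right are all assumptions chosen for illustration.

```python
from collections import deque

SIZE = 7  # board side length, matching the 7x7 variant

# The six neighbours of a cell when the hex grid is stored as a
# rhombus indexed by (row, col).
NEIGHBOURS = [(-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0)]


def has_won(board, player):
    """Return True if `player` has a chain of stones joining their two edges.

    Assumed convention (hypothetical, not from the repository):
    player 1 connects the top row to the bottom row,
    player 2 connects the left column to the right column,
    and 0 marks an empty cell.
    """
    n = len(board)
    if player == 1:
        starts = [(0, c) for c in range(n) if board[0][c] == player]
    else:
        starts = [(r, 0) for r in range(n) if board[r][0] == player]

    # Breadth-first search over the player's own stones.
    seen = set(starts)
    queue = deque(starts)
    while queue:
        r, c = queue.popleft()
        # Goal test: reached the opposite edge for this player.
        if (player == 1 and r == n - 1) or (player == 2 and c == n - 1):
            return True
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (
                0 <= nr < n
                and 0 <= nc < n
                and (nr, nc) not in seen
                and board[nr][nc] == player
            ):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


# Usage: player 1 fills one column, which joins the top and bottom edges.
board = [[0] * SIZE for _ in range(SIZE)]
for row in range(SIZE):
    board[row][3] = 1
print(has_won(board, 1), has_won(board, 2))
```

A terminal check of this kind is what a sparse win/loss reward would be based on; the reward-shaping variants mentioned in the experiments would add intermediate signals on top of it.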