A reinforcement learning project for centralized scheduling in communication networks. Implements classical dynamic programming (Value Iteration, Policy Iteration) and modern RL methods (Q-Learning, Deep Q-Networks) to optimize user transmission scheduling with constraints on delay, energy, and communication quality.
The goal is to learn optimal scheduling policies with methods of increasing complexity, moving from model-based dynamic programming to model-free reinforcement learning:
- Lab 1: Classical methods
  - Value Iteration
  - Policy Iteration
- Lab 2: Model-free methods
  - Q-Learning
  - Deep Q-Learning (DQN)
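
To make the contrast between the two labs concrete, here is a minimal sketch of Value Iteration and tabular Q-Learning on the same toy problem. It is not the lab environment: the two-user queue model, the buffer size `B`, the arrival probabilities `ARRIVAL`, the delay-only reward, and all function names are assumptions chosen for illustration. Value Iteration reads the transition model directly, while Q-Learning only samples from it.

```python
import itertools
import random

# Hypothetical toy model: two users, each with a queue of at most B packets.
B = 2
ARRIVAL = (0.3, 0.6)   # assumed per-slot packet arrival probability per user
GAMMA = 0.9            # discount factor
STATES = list(itertools.product(range(B + 1), repeat=2))
ACTIONS = (0, 1)       # index of the user granted the transmission slot


def transitions(state, action):
    """Return a list of (prob, next_state, reward) for one scheduling slot."""
    q = list(state)
    if q[action] > 0:
        q[action] -= 1  # the scheduled user sends one packet
    outcomes = []
    for a0, a1 in itertools.product((0, 1), repeat=2):
        p0 = ARRIVAL[0] if a0 else 1 - ARRIVAL[0]
        p1 = ARRIVAL[1] if a1 else 1 - ARRIVAL[1]
        nxt = (min(q[0] + a0, B), min(q[1] + a1, B))
        reward = -float(sum(nxt))  # delay penalty: packets still waiting
        outcomes.append((p0 * p1, nxt, reward))
    return outcomes


def q_from_v(V, s, a):
    """One-step lookahead: expected return of taking action a in state s."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a))


def value_iteration(tol=1e-8):
    """Model-based: sweep the Bellman optimality update until it converges."""
    V = {s: 0.0 for s in STATES}
    while True:
        new_V = {s: max(q_from_v(V, s, a) for a in ACTIONS) for s in STATES}
        delta = max(abs(new_V[s] - V[s]) for s in STATES)
        V = new_V
        if delta < tol:
            break
    policy = {s: max(ACTIONS, key=lambda a: q_from_v(V, s, a)) for s in STATES}
    return V, policy


def q_learning(steps=50_000, alpha=0.05, epsilon=0.1, seed=0):
    """Model-free: learn Q from sampled transitions with epsilon-greedy actions."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    s = (0, 0)
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)                      # explore
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])    # exploit
        # The model is used here only as a simulator to draw one sample.
        outcomes = transitions(s, a)
        _, s2, r = rng.choices(outcomes, weights=[p for p, _, _ in outcomes])[0]
        target = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2
    return Q


if __name__ == "__main__":
    V, policy = value_iteration()
    Q = q_learning()
    for s in STATES:
        greedy = max(ACTIONS, key=lambda b: Q[(s, b)])
        print(s, "VI serves user", policy[s], "| Q-learning serves user", greedy)
```

Running the script prints, for each queue state, the user chosen by each method. In this sketch the reward only penalizes delay; energy and communication-quality constraints would need additional state variables and reward terms.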
These labs are part of the MICAS912 course: Sequential Decision-Making Processing – Part II: Reinforcement Learning.