Evolutionary Algorithms in Reinforcement Learning - Multi-objective Optimization in Inventory Management

Project

Motivation: Strike a balance between financial gains and the environmental impact of transportation in supply chain operations
Goal: Identify the trade-off solutions (Pareto front)
Key library: pymoo
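
To make the goal concrete, here is a minimal sketch of how a Pareto front can be extracted from a set of candidate solutions. It assumes both objectives are minimized (profit is negated), and the candidate values are made up purely for illustration; they are not results from this project.

```python
def dominates(a, b):
    """Return True if a is no worse than b in every objective
    and strictly better in at least one (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (negative profit, transport emissions) pairs, illustration only.
candidates = [
    (-100.0, 30.0),
    (-80.0, 20.0),
    (-90.0, 35.0),
    (-60.0, 10.0),
    (-60.0, 25.0),
]

front = pareto_front(candidates)
print(front)
```

Each point left in `front` is a trade-off: improving one objective from there means giving up some of the other.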

Supply Chain Network in this problem

Screenshot 2023-10-02 at 11 32 29

Methodology

Screenshot 2023-10-02 at 00 55 32

Apply a reinforcement learning framework
Use multi-objective evolutionary algorithms (MOEAs) to optimize the policy net
The MOEAs are: (1) NSGA-II (classic!), (2) AGE-MOEA (state-of-the-art).
Use Bayesian optimization (BO) to tune the hyperparameters of the MOEAs

Result

Case 1: State formulation - Inventory level, backlog, unfulfilled order

Screenshot 2023-10-02 at 11 33 38

Converges within the evaluation budget
Well-defined Pareto front

Case 2 (when agent knows more): State formulation - Inventory level, backlog, unfulfilled order + Previous customer demand

Screenshot 2023-10-02 at 11 33 57

The Pareto front has better diversity when the agent has more information about the environment!
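
The difference between the two state formulations can be illustrated with a small sketch. The function name, argument names, and example values are hypothetical; the only point is that Case 2 adds one extra input (the previous customer demand) to what the policy observes.

```python
def build_state(inventory, backlog, unfulfilled, previous_demand=None):
    """Assemble the observation vector fed to the policy.

    Case 1: [inventory, backlog, unfulfilled]
    Case 2: the same three values plus the previous customer demand.
    """
    state = [float(inventory), float(backlog), float(unfulfilled)]
    if previous_demand is not None:
        state.append(float(previous_demand))
    return state


case1_state = build_state(inventory=12, backlog=3, unfulfilled=1)
case2_state = build_state(inventory=12, backlog=3, unfulfilled=1, previous_demand=7)
print(len(case1_state), len(case2_state))
```

In a setup like this, the policy's input layer would need one more unit per node in Case 2, so the number of decision variables seen by the MOEA grows accordingly.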

Investigation of NSGA-II hyperparameters:

(1) Ratio of number of offspring to population size
(2) Ratio of population size to number of generations
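
As a rough illustration of how these two ratios translate into concrete NSGA-II settings, the sketch below derives the number of offspring and generations from a population size. The helper name and the direction of each ratio (offspring / population, population / generations) are assumptions, as is the evaluation count, which assumes the initial population counts as the first generation and every later generation evaluates only the offspring.

```python
def nsga2_settings(pop_size, offspring_ratio, pop_gen_ratio):
    """Derive NSGA-II settings from the two ratios.

    offspring_ratio = n_offsprings / pop_size   (assumed direction)
    pop_gen_ratio   = pop_size / n_gen          (assumed direction)
    """
    n_offsprings = max(1, round(offspring_ratio * pop_size))
    n_gen = max(1, round(pop_size / pop_gen_ratio))
    # Initial population is evaluated once, then n_offsprings per later generation.
    n_evals = pop_size + n_offsprings * (n_gen - 1)
    return {
        "pop_size": pop_size,
        "n_offsprings": n_offsprings,
        "n_gen": n_gen,
        "n_evals": n_evals,
    }


settings = nsga2_settings(pop_size=40, offspring_ratio=0.5, pop_gen_ratio=2.0)
print(settings)
```

Under a fixed evaluation budget, raising one ratio forces a change in the other, which is why the two are worth tuning together.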

Screenshot 2023-10-02 at 11 39 58

Investigation of AGE-MOEA hyperparameter:

Ratio of population size to number of generations

Screenshot 2023-10-02 at 11 39 34

The hyperparameter ratios obtained by BO are the best (with the highest hypervolume!)
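
For readers unfamiliar with the hypervolume indicator used for this comparison, here is a minimal two-objective sketch. It assumes both objectives are minimized and that the reference point is worse than every point of interest; the front and reference point are made-up values, not results from this project.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by the reference point `ref`.

    Both objectives are minimized. Points that do not improve on `ref`
    in both objectives contribute nothing.
    """
    inside = [p for p in points if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    best_f2 = ref[1]
    # Sweep in order of increasing first objective.
    for f1, f2 in sorted(inside):
        if f2 < best_f2:  # point adds a new horizontal slab
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


# Hypothetical front and reference point, illustration only.
front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (3.0, 3.0)]
hv = hypervolume_2d(front, ref=(4.0, 4.0))
print(hv)  # 6.0
```

A larger hypervolume means the front is closer to the ideal point, better spread, or both, which makes it a convenient single number for comparing hyperparameter settings.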

Summary

The novel methodology, the first to combine RL+MOO, works for this multi-objective optimization (MOO) problem of inventory management.
BO can successfully fine-tune the hyperparameters of the MOEAs.
There is still more to explore on the methodological front and in the supply chain environment setting.