Evolutionary Algorithms in Reinforcement Learning - Multi-objective Optimization in Inventory Management
- Motivation: Strike a balance between the financial gains and the transportation-related environmental impact of supply chain operations
- Goal: Identify the trade-off solutions (Pareto front)
- Key library: pymoo
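To make the notion of a Pareto front concrete, here is a minimal, library-free sketch of a non-dominated filter. It assumes both objectives are minimized (profit is negated), and the candidate values are made up for illustration; they are not results from this project.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (negative profit, emissions) pairs, for illustration only.
candidates = [(-10.0, 5.0), (-8.0, 3.0), (-8.0, 6.0), (-4.0, 1.0), (-3.0, 4.0)]
front = pareto_front(candidates)
```

Each point left in `front` is a trade-off solution: improving one objective from there means giving up some of the other.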
- Apply reinforcement learning framework
- Use multi-objective evolutionary algorithms (MOEAs) to optimize the policy net
- The MOEAs are: (1) NSGA-II (classic!), (2) AGE-MOEA (state-of-the-art).
- Use Bayesian optimization (BO) to tune the hyperparameters of the MOEAs
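The idea of optimizing a policy net with a MOEA can be sketched as follows: the flat weight vector of the network is the decision vector, and one simulated episode returns the two objective values. Everything in this sketch is an assumption made for illustration: the network size, the two-feature state, the Poisson demand, and all cost and emission coefficients are hypothetical and are not the environment used in this project.

```python
import numpy as np

N_STATE, N_HIDDEN = 2, 4                      # hypothetical network sizes
N_PARAMS = N_STATE * N_HIDDEN + 2 * N_HIDDEN + 1  # W1, b1, W2, b2


def policy(params, state):
    """Tiny MLP mapping a state to a non-negative order quantity."""
    i = N_STATE * N_HIDDEN
    W1 = params[:i].reshape(N_STATE, N_HIDDEN)
    b1 = params[i:i + N_HIDDEN]
    W2 = params[i + N_HIDDEN:i + 2 * N_HIDDEN]
    b2 = params[i + 2 * N_HIDDEN]
    hidden = np.tanh(state @ W1 + b1)
    return max(0.0, float(hidden @ W2 + b2))


def evaluate(params, horizon=30, seed=0):
    """Roll out one episode in a toy environment and return the two
    objectives to minimize: (negative profit, transport emissions)."""
    rng = np.random.default_rng(seed)
    inventory, backlog = 10.0, 0.0
    profit, emissions = 0.0, 0.0
    for _ in range(horizon):
        order = policy(params, np.array([inventory, backlog]))
        inventory += order
        wanted = backlog + float(rng.poisson(5.0))   # made-up demand model
        sold = min(inventory, wanted)
        inventory -= sold
        backlog = wanted - sold
        # Made-up price, purchase, holding, and backlog coefficients.
        profit += 4.0 * sold - 2.0 * order - 0.1 * inventory - 0.5 * backlog
        emissions += 0.5 * order                     # made-up emission factor
    return np.array([-profit, emissions])


objectives = evaluate(np.zeros(N_PARAMS))
```

A MOEA such as NSGA-II or AGE-MOEA would call a function like `evaluate` once per individual and evolve the population of weight vectors toward the Pareto front.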
- Converge within evaluation budget
- Well-defined Pareto front
Case 2 (when the agent knows more): State formulation - Inventory level, backlog, unfulfilled orders + previous customer demand
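The difference between the two state formulations can be sketched as below. This assumes the smaller state consists of the first three quantities only; the function name and argument order are hypothetical.

```python
import numpy as np


def build_state(inventory, backlog, unfulfilled, prev_demand=None):
    """Return the observation vector fed to the policy net.

    Without `prev_demand` the state has three features; passing it
    appends the previous customer demand as a fourth feature (Case 2).
    """
    features = [inventory, backlog, unfulfilled]
    if prev_demand is not None:
        features.append(prev_demand)
    return np.asarray(features, dtype=float)


basic_state = build_state(12, 3, 1)
richer_state = build_state(12, 3, 1, prev_demand=7)
```

The policy net's input layer has to grow by one unit to accept the richer state, so the MOEA searches over a slightly larger weight vector in Case 2.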
- Pareto front with better diversity if the agent has more info about the environment!
- (1) Ratio of the number of offspring to the population size
- (2) Ratio of the population size to the number of generations
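One way to turn these two ratios into concrete MOEA settings under a fixed evaluation budget is sketched below. It assumes the budget is counted as `pop_size + n_offsprings * (n_gen - 1)`, i.e. the initial population plus one batch of offspring per later generation; that accounting, the function name, and the rounding rules are assumptions, not taken from the project code.

```python
import math


def moea_settings(offspring_ratio, pop_gen_ratio, budget):
    """Derive (pop_size, n_offsprings, n_gen, evaluations_used).

    offspring_ratio = n_offsprings / pop_size
    pop_gen_ratio   = pop_size / n_gen
    Assumed budget model: pop_size + n_offsprings * (n_gen - 1).
    """
    # Substituting pop_size = r2 * g and n_offsprings = r1 * r2 * g gives
    # r1*r2*g^2 + r2*(1 - r1)*g - budget = 0; take the positive root.
    a = offspring_ratio * pop_gen_ratio
    b = pop_gen_ratio * (1.0 - offspring_ratio)
    n_gen = max(2, int((-b + math.sqrt(b * b + 4.0 * a * budget)) / (2.0 * a)))
    pop_size = max(2, round(pop_gen_ratio * n_gen))
    n_offsprings = max(1, round(offspring_ratio * pop_size))
    used = pop_size + n_offsprings * (n_gen - 1)
    return pop_size, n_offsprings, n_gen, used


settings = moea_settings(offspring_ratio=1.0, pop_gen_ratio=1.0, budget=10_000)
```

The resulting values could then be passed to pymoo as the `pop_size` and `n_offsprings` arguments of the algorithm and as an `n_gen` termination criterion. Because of rounding, the budget is matched only approximately, so `used` is returned for checking.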
- Ratio of the population size to the number of generations
- The hyperparameter ratios obtained by BO are the best (with the highest hypervolume)!
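For readers unfamiliar with the hypervolume indicator, here is a small standalone sketch for the two-objective case. It assumes both objectives are minimized and that a reference point worse than every solution is chosen by the user; the sample fronts are made up. pymoo ships its own hypervolume indicator, so this version is only meant to show what the number measures.

```python
def hypervolume_2d(front, ref):
    """Area dominated by `front` and bounded by the reference point `ref`.

    `front` is a list of (f1, f2) pairs, both minimized. Points that do
    not strictly improve on `ref` in both objectives are ignored.
    """
    points = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in points:              # ascending f1
        if f2 < best_f2:               # skip dominated points
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


# Two made-up fronts compared against the same reference point.
hv_a = hypervolume_2d([(1, 3), (2, 2), (3, 1)], ref=(4, 4))
hv_b = hypervolume_2d([(2, 3), (3, 2)], ref=(4, 4))
```

A larger hypervolume means the front covers more of the objective space below the reference point, which is why it can serve as the single score that BO maximizes when tuning the ratios.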
- The novel methodology works for this multi-objective optimization (MOO) problem in inventory management, and it is the first to combine RL + MOO.
- BO can successfully fine-tune the hyperparameters of the MOEAs.
- But there is more to explore, both on the methodological front and in the supply chain environment setting.