Evolutionary Algorithms in Reinforcement Learning - Multi-objective Optimization in Inventory Management
- Motivation: Strike a balance between financial gains and the environmental impact of transportation in supply chain operations
- Goal: Identify the trade-off solutions (Pareto front)
- Key library: pymoo
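
To make the goal concrete, here is a minimal sketch of what "identifying the Pareto front" means, written in plain Python rather than with pymoo. The candidate values are hypothetical (negative profit, emissions) pairs invented for illustration, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Hypothetical (negative profit, emissions) pairs, not results from this project.
candidates = [(-10.0, 5.0), (-8.0, 3.0), (-6.0, 4.0), (-4.0, 1.0), (-10.0, 6.0)]
front = pareto_front(candidates)
print(front)
```

Each point that survives the filter is a trade-off solution: improving one objective from there would worsen the other.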
- Apply reinforcement learning framework
- Use multi-objective evolutionary algorithms (MOEAs) to optimize the policy net
- The MOEAs are: (1) NSGA-II (classic!), (2) AGE-MOEA (state-of-the-art).
- Use Bayesian optimization (BO) to tune the hyperparameters of the MOEAs
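
The sketch below illustrates how a policy can be exposed to an MOEA: the policy parameters form a flat decision vector, and one rollout returns a vector of objectives. Everything here is an assumption made for illustration, not the project's environment: the toy dynamics, the cost coefficients, the linear "policy net", and the function name `evaluate_policy` are all hypothetical.

```python
import random


def evaluate_policy(weights, horizon=30, seed=0):
    """Roll out a linear ordering policy; return (negative profit, emissions)."""
    rng = random.Random(seed)  # fixed seed so every candidate sees the same demand
    inventory, backlog = 20.0, 0.0
    profit, emissions = 0.0, 0.0
    for _ in range(horizon):
        # Linear stand-in for the policy net: state (inventory, backlog) -> order.
        order = max(0.0, weights[0] + weights[1] * inventory + weights[2] * backlog)
        inventory += order
        demand = rng.randint(5, 15) + backlog  # new demand plus backlogged demand
        sold = min(inventory, demand)
        inventory -= sold
        backlog = demand - sold
        # Assumed economics: revenue minus ordering, holding and backlog costs.
        profit += 10.0 * sold - 4.0 * order - 1.0 * inventory - 2.0 * backlog
        # Assumed: transport emissions scale with the quantity shipped.
        emissions += 0.5 * order
    return (-profit, emissions)  # both objectives are minimized


f_fixed = evaluate_policy([10.0, 0.0, 0.0])  # always order 10 units
f_idle = evaluate_policy([0.0, 0.0, 0.0])    # never order
print(f_fixed, f_idle)
```

An MOEA such as NSGA-II or AGE-MOEA would then search over `weights`, treating the returned pair as the objective vector of each individual.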
- Converge within evaluation budget
- Well-defined Pareto front
Case 2 (when the agent knows more): State formulation - Inventory level, backlog, unfulfilled orders + previous customer demand
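
As a sketch of this state formulation, the snippet below builds the observation vector with and without the previous customer demand. The class name, field names, and the `include_previous_demand` flag are hypothetical, and the assumption that the smaller state simply omits previous demand is an inference from the "+" in the description above.

```python
from typing import List, NamedTuple


class InventoryState(NamedTuple):
    inventory_level: float
    backlog: float
    unfulfilled_order: float
    previous_demand: float


def to_observation(state: InventoryState, include_previous_demand: bool = True) -> List[float]:
    """Flatten the state into the vector fed to the policy net."""
    base = [state.inventory_level, state.backlog, state.unfulfilled_order]
    if include_previous_demand:
        # Case 2: the agent also observes the previous customer demand.
        return base + [state.previous_demand]
    return base


state = InventoryState(inventory_level=20.0, backlog=3.0, unfulfilled_order=1.0, previous_demand=12.0)
obs_small = to_observation(state, include_previous_demand=False)
obs_full = to_observation(state)
print(obs_small, obs_full)
```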
- Pareto front with better diversity if the agent has more info about the environment!
- (1) Ratio of number of offspring to population size
- (2) Ratio of population size to number of generations
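
One way to read these two ratios is as a compact parameterization of the MOEA settings under a fixed evaluation budget. The sketch below makes several assumptions that the source does not state: the ratios are taken as offspring / population and population / generations, the initial population costs one evaluation per individual, and each later generation evaluates only the offspring. The function name `settings_from_ratios` is hypothetical.

```python
def settings_from_ratios(offspring_ratio, pop_gen_ratio, budget):
    """Largest (pop_size, n_offspring, n_gen) matching the ratios within the budget."""
    best = None
    n_gen = 1
    while True:
        pop_size = max(1, round(pop_gen_ratio * n_gen))
        n_offspring = max(1, round(offspring_ratio * pop_size))
        # Assumed accounting: initial population, then offspring per later generation.
        evaluations = pop_size + n_offspring * (n_gen - 1)
        if evaluations > budget:
            break
        best = {
            "pop_size": pop_size,
            "n_offspring": n_offspring,
            "n_gen": n_gen,
            "evaluations": evaluations,
        }
        n_gen += 1
    return best  # None if even a single generation exceeds the budget


settings = settings_from_ratios(offspring_ratio=0.5, pop_gen_ratio=2.0, budget=1000)
print(settings)
```

With this parameterization, a tuner only has to search over two continuous ratios instead of three coupled integers.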
- Ratio of population size to number of generations
- The hyperparameter ratios obtained by BO are the best (with the highest hypervolume)!
- The novel methodology works for this multi-objective optimization (MOO) problem of inventory management, and it is the first to combine RL and MOO.
- BO can successfully fine-tune the hyperparameters of the MOEAs.
- More remains to be explored on the methodological front and in the supply chain environment settings.
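
Since hypervolume is the metric used to compare the hyperparameter ratios, here is a minimal sketch of how it can be computed for two minimized objectives. It is a plain-Python illustration, not the pymoo indicator; the front and the reference point are hypothetical values.

```python
def hypervolume_2d(front, ref):
    """Area dominated by a 2-objective minimization front, bounded by ref."""
    # Only points strictly better than the reference point contribute.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:  # ascending in the first objective
        if f2 < best_f2:  # dominated points add no new area
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


# Hypothetical front and reference point, not results from this project.
hv = hypervolume_2d([(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)], ref=(4.0, 4.0))
print(hv)
```

A larger value means the front covers more of the objective space below the reference point, which rewards both convergence and diversity.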