Source code for the paper "Multi-Agent Deep Reinforcement Learning with Emergent Communication", published at IJCNN'19. The paper describes A3C2, an algorithm for multi-agent learning with communication.
The implementation uses TensorFlow 2.
The repository contains 4 environments (Hidden Reward, Navigation, Pursuit, Traffic Intersection) and scripts that launch A3C2 to learn policies.
Install the dependencies listed in requirements.txt (for example, with pip install -r requirements.txt), then run the scripts.
Each agent is defined by 3 networks.
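
As a rough illustration of an agent built from three networks, here is a minimal sketch in plain Python (not TensorFlow, so it runs without dependencies). This README only names the communication network; the actor and critic roles, the class name Agent, the layer sizes, and the single linear layer per network are all assumptions made for illustration, not the repository's actual architecture.

```python
import random


def linear(weights, inputs):
    """Apply a bias-free linear layer: one output per row of weights."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def init_weights(n_out, n_in, rng):
    """Small random weight matrix with n_out rows and n_in columns."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


class Agent:
    """Hypothetical agent holding three separate networks (roles are assumed)."""

    def __init__(self, obs_size, msg_size, n_actions, seed=0):
        rng = random.Random(seed)
        in_size = obs_size + msg_size
        # Assumed actor: (observation, received message) -> action scores
        self.actor = init_weights(n_actions, in_size, rng)
        # Assumed critic: (observation, received message) -> scalar value
        self.critic = init_weights(1, in_size, rng)
        # Communication network: observation -> outgoing message
        self.comm = init_weights(msg_size, obs_size, rng)

    def step(self, obs, received_msg):
        """Return action scores, a value estimate, and the message to send."""
        x = list(obs) + list(received_msg)
        scores = linear(self.actor, x)
        value = linear(self.critic, x)[0]
        message = linear(self.comm, obs)
        return scores, value, message
```

For example, Agent(obs_size=4, msg_size=2, n_actions=3).step([1.0, 0.0, 0.0, 1.0], [0.0, 0.0]) returns three action scores, one value estimate, and a two-element message.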
The algorithm is distributed, and multiple workers update the networks.
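
The idea of several workers updating one set of networks can be sketched with threads and a shared parameter list. This is a toy stand-in under stated assumptions: the quadratic loss, the TARGET values, the learning rate, and the function names are hypothetical, and the real scripts may coordinate workers differently.

```python
import threading

TARGET = [1.0, -2.0]    # hypothetical optimum of a toy quadratic loss
LEARNING_RATE = 0.1     # hypothetical step size


def local_gradient(params):
    """Gradient of the toy loss sum((p - target)^2) at the given parameters."""
    return [2.0 * (p - t) for p, t in zip(params, TARGET)]


def train(n_workers=4, n_updates=100):
    """Run several workers that all update the same shared parameters."""
    shared_params = [0.0, 0.0]   # stand-in for the shared network weights
    lock = threading.Lock()

    def worker():
        for _ in range(n_updates):
            with lock:
                snapshot = list(shared_params)   # pull the latest shared weights
            grad = local_gradient(snapshot)      # compute a gradient locally
            with lock:
                for i, g in enumerate(grad):     # push the update to the shared weights
                    shared_params[i] -= LEARNING_RATE * g

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(shared_params)
```

Each worker may compute its gradient from a slightly stale snapshot, which is the usual trade-off when updates are not synchronized; with a small step size the shared parameters still settle near TARGET in this toy setting.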
Gradients are propagated backward across multiple time-steps to optimize the communication network and enforce communication.
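
To make the gradient flow across time-steps concrete, here is a scalar sketch: a message produced at one time-step is consumed by a receiver at the next, and the receiver's loss is differentiated back to the sender's communication weight with the chain rule. The scalar weights, the squared-error loss, and all function names are assumptions chosen for illustration; A3C2 itself uses neural networks and a reinforcement learning objective.

```python
def forward(w_comm, w_actor, obs, target):
    """Two-step forward pass: the sender emits a message, the receiver acts on it."""
    message = w_comm * obs         # time-step t: sender's communication weight
    action = w_actor * message     # time-step t+1: receiver consumes the message
    loss = (action - target) ** 2  # toy loss measured at the receiver
    return message, action, loss


def comm_gradient(w_comm, w_actor, obs, target):
    """Chain rule from the receiver's loss back to the sender's weight."""
    _, action, _ = forward(w_comm, w_actor, obs, target)
    dloss_daction = 2.0 * (action - target)
    daction_dmessage = w_actor     # gradient crosses from t+1 back to t here
    dmessage_dwcomm = obs
    return dloss_daction * daction_dmessage * dmessage_dwcomm


def numeric_gradient(w_comm, w_actor, obs, target, eps=1e-6):
    """Central finite difference, used only to check the analytic gradient."""
    loss_plus = forward(w_comm + eps, w_actor, obs, target)[2]
    loss_minus = forward(w_comm - eps, w_actor, obs, target)[2]
    return (loss_plus - loss_minus) / (2.0 * eps)
```

With w_comm=0.5, w_actor=2.0, obs=3.0, and target=1.0, the message is 1.5, the action is 3.0, the loss is 4.0, and the gradient with respect to w_comm is 2 * 2 * 2 * 3 = 24. The sender's weight is therefore adjusted by an error that only appears at the receiver one step later.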