This repository contains a re-implementation of the method from the paper *Learning with AMIGo: Adversarially Motivated Intrinsic GOals*.
The repository is organized in the following way:
- The training process is fully implemented in `main/main.py`, which contains the functions for acting in the environment, training the model, and testing pre-trained models
- The neural network models for the teacher and the student, together with a state-embedding network, are contained in the `models/models.py` file
- Functions used in the training loop are contained in `utils_folder/utils.py` and `utils_folder/env_utils.py`
- `torchbeast/core` contains utility functions implementing the IMPALA algorithm for fast, distributed policy-gradient training
Our project reimplements AMIGo, which is described in the following paper:
```bibtex
@article{campero2020learning,
  title={Learning with AMIGo: Adversarially Motivated Intrinsic Goals},
  author={Campero, Andres and Raileanu, Roberta and K{\"u}ttler, Heinrich and Tenenbaum, Joshua B and Rockt{\"a}schel, Tim and Grefenstette, Edward},
  journal={arXiv preprint arXiv:2006.12122},
  year={2020}
}
```
However, some of the experiments leverage the work described in:
```bibtex
@misc{zhang2020bebold,
  title={BeBold: Exploration Beyond the Boundary of Explored Regions},
  author={Tianjun Zhang and Huazhe Xu and Xiaolong Wang and Yi Wu and Kurt Keutzer and Joseph E. Gonzalez and Yuandong Tian},
  year={2020},
  eprint={2012.08621},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```
```bash
# create a new conda environment
conda create -n amigo python=3.7
conda activate amigo

# clone the repository and install dependencies
git clone https://github.com/allepalma/adversarially-motivated-intrinsic-goals.git
cd adversarially-motivated-intrinsic-goals
pip install -r requirements.txt
```
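As a quick sanity check of the installation, you can try importing the main dependencies; the snippet below assumes the pinned requirements include `torch` and `gym-minigrid` (the packages the commands below rely on).

```python
# Quick import check; assumes requirements.txt pins torch, gym and gym-minigrid.
import torch
import gym
import gym_minigrid  # registers the MiniGrid-* environments with gym

env = gym.make("MiniGrid-KeyCorridorS3R3-v0")
print(torch.__version__, env.observation_space)
```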
Three main experiments can be reproduced via the `main/main.py` script:
- The default AMIGo run
- AMIGo with adaptive t* threshold
- An AMIGo model where the intrinsic reward function is substituted with one derived from random network distillation
```bash
# Run default AMIGo on a MiniGrid environment
OMP_NUM_THREADS=1 python -m main.main --env MiniGrid-KeyCorridorS3R3-v0 \
--num_actors 40 --modify --generator_batch_size 150 --generator_entropy_cost .05 \
--generator_threshold -.1 --total_frames 100000000 --generator_reward_negative -.3 \
--savedir ./experimentMinigrid --model default
```
In our project, we mostly worked on the MiniGrid-KeyCorridorS3R3-v0 environment. However, the main.py script can be run on any other available MiniGrid environment as well. The parameters above are the default ones from the paper.
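As a reminder of what these flags control (following the paper's description, not a verbatim excerpt of `main/main.py`): the teacher proposes a goal and is rewarded only if the student reaches it after at least t* steps, while goals reached faster earn the negative reward set by `--generator_reward_negative`. A minimal sketch with hypothetical names:

```python
# Illustrative sketch of the AMIGo teacher reward; identifiers are made up
# and do not match main/main.py. Treating an unreached goal as a penalty
# is a simplification made here for brevity.
def teacher_reward(reached: bool, steps_to_goal: int, t_star: int,
                   alpha: float = 1.0, beta: float = 0.3) -> float:
    if reached and steps_to_goal >= t_star:
        return alpha   # goal was hard enough for the current student
    return -beta       # reached too quickly (or not reached at all)
```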
```bash
# Run AMIGo with the window-adaptive t* threshold
OMP_NUM_THREADS=1 python -m main.main --env MiniGrid-KeyCorridorS3R3-v0 \
--num_actors 40 --modify --generator_batch_size 150 --generator_entropy_cost .05 \
--generator_threshold -.5 --total_frames 100000000 --generator_reward_negative -.3 \
--savedir ./experimentMinigrid --model window_adaptive --window 50
```
The `--window` flag can be changed to any preferred value.
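Conceptually, the window-adaptive variant raises t* once the teacher performs well enough on average over the last `--window` episodes, using `--generator_threshold` as the bar. The sketch below only illustrates such a scheme, with invented names; see `main/main.py` for the actual update.

```python
from collections import deque

window = 50                    # cf. the --window flag
recent = deque(maxlen=window)  # teacher rewards from the most recent episodes

def update_t_star(t_star: int, teacher_reward: float,
                  threshold: float = -0.5, step: int = 1) -> int:
    # Raise the difficulty once the teacher's mean reward over the last
    # `window` episodes climbs above the bar (cf. --generator_threshold).
    recent.append(teacher_reward)
    if len(recent) == window and sum(recent) / window >= threshold:
        t_star += step
        recent.clear()
    return t_star
```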
```bash
# Run AMIGo with the RND-based novelty intrinsic reward
OMP_NUM_THREADS=1 python -m main.main --env MiniGrid-KeyCorridorS3R3-v0 \
--num_actors 10 --modify --total_frames 100000000 \
--savedir ./experimentMinigrid --model novelty_based
```
Fewer actors are employed for this model, since this yielded better learning outcomes.
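In this variant the teacher's goal-based reward is replaced by a novelty bonus in the style of random network distillation: the intrinsic reward is the error of a trained predictor against a fixed, randomly initialized target network, so rarely visited states score higher. A self-contained sketch (again illustrative, not this repository's code):

```python
import torch
import torch.nn as nn

class RNDBonus(nn.Module):
    """Toy random-network-distillation bonus over flat observations."""

    def __init__(self, obs_dim: int, feat_dim: int = 64):
        super().__init__()
        self.target = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                    nn.Linear(128, feat_dim))
        self.predictor = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                       nn.Linear(128, feat_dim))
        for p in self.target.parameters():  # the target is never trained
            p.requires_grad_(False)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Prediction error doubles as intrinsic reward and predictor loss.
        return (self.predictor(obs) - self.target(obs)).pow(2).mean(dim=-1)
```

Because the predictor is trained to minimize this same error, frequently visited states become progressively less rewarding and the bonus concentrates on novel ones.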
```bash
# Test a trained AMIGo model (environment name and paths are placeholders)
OMP_NUM_THREADS=1 python -m main.main --env trained_amigo_environment \
--mode test --weight_path path_to_saved_weights --record_video --video_path path_to_video.mp4
```
The `--model` flag, set to one of `default`, `window_adaptive`, or `novelty_based`, must be specified as well. If the `--record_video` flag is used, an mp4 video of a random rollout by the trained agent is produced at the selected path; recording requires ffmpeg to be installed (e.g. via conda-forge).
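For reference, and independently of how `main/main.py` implements it, a rollout video can be written by collecting `rgb_array` frames and passing them to ffmpeg through `imageio`; the snippet below is a generic sketch (random actions stand in for the trained policy):

```python
import gym
import gym_minigrid  # registers the MiniGrid-* environments
import imageio       # writing mp4 requires the imageio-ffmpeg backend

env = gym.make("MiniGrid-KeyCorridorS3R3-v0")
env.reset()
with imageio.get_writer("rollout.mp4", fps=10) as writer:
    done = False
    while not done:
        writer.append_data(env.render(mode="rgb_array"))
        _, _, done, _ = env.step(env.action_space.sample())
```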
The `results` folder contains the logs produced by running the AMIGo models on different MiniGrid environments (our reproduction runs).
The code in this repository is released under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).