Algorithms

This repository contains implementations of the following algorithms in a unified framework:

using standardized architectures and hyper-parameters, wherever applicable. If you want to add an algorithm, feel free to send a pull request.

Setup

We assume that you have access to a GPU with CUDA >=9.2 support. All dependencies can then be installed with the following commands:

conda env create -f setup/conda.yml
conda activate dmcgb
sh setup/install_envs.sh

You will also need to set up MuJoCo and the DeepMind Control Suite: https://www.deepmind.com/open-source/deepmind-control-suite

Datasets

Part of this repository relies on external datasets. SODA uses the Places dataset for data augmentation, which can be downloaded by running

wget http://data.csail.mit.edu/places/places365/places365standard_easyformat.tar

If Places is unavailable, COCO can be used instead, although it requires setting up an account with them.

Distracting Control Suite uses the DAVIS dataset for video backgrounds, which can be downloaded by running

wget https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-trainval-480p.zip

You should familiarize yourself with their terms before downloading. After downloading and extracting the data, add your dataset directory to the datasets list in setup/config.cfg.
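The extraction step can also be scripted. What follows is a minimal sketch using only the Python standard library. The helper name `extract_archive` and the `datasets` target directory are assumptions made for illustration and are not part of this repository; the archive names are the ones from the `wget` commands above.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_archive(archive, target_dir):
    """Extract a .zip or tar archive into target_dir and return that directory."""
    archive = Path(archive)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    if archive.suffix == ".zip":
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
    elif tarfile.is_tarfile(archive):
        with tarfile.open(archive) as tf:
            tf.extractall(target)
    else:
        raise ValueError(f"unsupported archive type: {archive}")
    return target


if __name__ == "__main__":
    # Hypothetical layout: archives in the current directory, data under ./datasets.
    for name in ("places365standard_easyformat.tar", "DAVIS-2017-trainval-480p.zip"):
        if Path(name).exists():
            print(extract_archive(name, "datasets").resolve())
```

The printed absolute path is the kind of directory you would then add to the datasets list in setup/config.cfg.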

The video_easy environment was proposed in PAD, and the video_hard environment uses a subset of the RealEstate10K dataset for background rendering. All test environments (including video files) are included in this repository, namely in the src/env/ directory.

Training & Evaluation

The scripts directory contains training and evaluation bash scripts for all the included algorithms. Alternatively, you can call the Python scripts directly. For example, to train, run

python3 src/train.py \
  --algorithm <algorithm name> \
  --seed 0

The key to pass as --algorithm for each algorithm can be found in src/algorithms/factory.py.

To run AugCL, we suggest first training an agent with the command

python3 src/train.py \
  --id <id> \
  --algorithm non_naive_rad \
  --data_aug shift \
  --train_steps 200k \
  --save_buffer True

Once that run has finished, start the strong augmentation phase with

python3 src/train.py \
  --algorithm augcl \
  --data_aug splice \
  --curriculum_step 200000 \
  --prev_id <id> \
  --prev_algorithm non_naive_rad

The second command searches the logs folder for the weights and stored replay buffer of a model that matches --prev_id and --prev_algorithm and was trained with the same seed. The seed can be set with --seed.
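Because the second phase locates the first run by id, algorithm, and seed, it can help to generate both commands from the same values so they cannot drift apart. The sketch below only assembles argument lists from the flags shown above. The function name `augcl_commands` is hypothetical, and passing `--seed` explicitly to both phases is an assumption made to keep the seeds aligned.

```python
import shlex


def augcl_commands(run_id, seed=0):
    """Return (phase1, phase2) argument lists for the two-phase AugCL run."""
    base = ["python3", "src/train.py"]
    # Phase 1: train the initial agent and save its replay buffer.
    phase1 = base + [
        "--id", str(run_id),
        "--algorithm", "non_naive_rad",
        "--data_aug", "shift",
        "--train_steps", "200k",
        "--save_buffer", "True",
        "--seed", str(seed),
    ]
    # Phase 2: strong augmentation, pointing back at the phase 1 run.
    phase2 = base + [
        "--algorithm", "augcl",
        "--data_aug", "splice",
        "--curriculum_step", "200000",
        "--prev_id", str(run_id),
        "--prev_algorithm", "non_naive_rad",
        "--seed", str(seed),
    ]
    return phase1, phase2


if __name__ == "__main__":
    # "my_run" is a placeholder id.
    for cmd in augcl_commands("my_run", seed=0):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root, in order, so that the second phase starts only after the first has completed successfully.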

To evaluate, run eval.py with the same arguments as train.py. The supported arguments are listed in src/arguments.py.