Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations

This repository contains the code for reproducing the results of the paper Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations, accepted at ICML 2022. The paper can be found here.

Usage

The paper results were collected with MuJoCo 1.50 (and mujoco-py 1.50.1.1) in OpenAI Gym 0.17.0, using the D4RL datasets. Networks were trained with PyTorch 1.4.0 on Python 3.6.
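Because results can drift across library versions, it may help to compare the local environment against the versions stated above before running anything. The following is a minimal sketch, not part of this repository: the helper names `find_mismatches` and `installed_versions` are hypothetical, and the pip distribution names `torch`, `gym`, and `mujoco-py` are assumed to match how the packages were installed.

```python
# Versions stated in this README, keyed by (assumed) pip distribution name.
EXPECTED = {
    "torch": "1.4.0",
    "gym": "0.17.0",
    "mujoco-py": "1.50.1.1",
}


def find_mismatches(installed, expected=EXPECTED):
    """Return {package: (installed_or_None, expected)} for every mismatch.

    Hypothetical helper. Local build suffixes such as "+cu101" are ignored
    when comparing, but the full installed string is reported.
    """
    mismatches = {}
    for name, want in expected.items():
        have = installed.get(name)
        base = have.split("+")[0] if have is not None else None
        if base != want:
            mismatches[name] = (have, want)
    return mismatches


def installed_versions(names):
    """Look up installed versions; packages that are not installed are omitted."""
    try:
        from importlib import metadata  # standard library on Python 3.8+
    except ImportError:
        import importlib_metadata as metadata  # backport package for Python 3.6/3.7
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    report = find_mismatches(installed_versions(EXPECTED))
    for name, (have, want) in report.items():
        print(f"{name}: installed {have}, README states {want}")
```

On Python 3.6 the lookup relies on the `importlib_metadata` backport being installed; the comparison function itself has no dependencies.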

The paper results can be reproduced by running:

./run_dwbc.sh

You can also run DWBC in the experimental setting used by DemoDICE and SMODICE with main_setting_demodice.py:

# --num_e:   number of expert trajectories in D_e
# --num_o_e: number of expert trajectories in D_o
# --num_o_o: number of non-expert trajectories in D_o
python main_setting_demodice.py \
   --algorithm="DWBC" \
   --env_e="hopper-expert-v2" \
   --env_o="hopper-random-v2" \
   --num_e=1 \
   --num_o_e=200 \
   --num_o_o=2000
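To make the three trajectory counts concrete, here is a minimal sketch of how D_e and D_o could be composed from them. It is an illustration under stated assumptions, not the code in main_setting_demodice.py: `build_datasets` is a hypothetical helper, trajectories are toy placeholders, and the expert trajectories in D_e are assumed to be disjoint from those mixed into D_o.

```python
import random


def build_datasets(expert_trajs, other_trajs, num_e, num_o_e, num_o_o, seed=0):
    """Compose D_e and D_o from two trajectory pools (hypothetical helper).

    D_e gets num_e expert trajectories.
    D_o gets num_o_e further expert trajectories plus num_o_o non-expert ones.
    Assumption: the expert trajectories in D_e and D_o do not overlap.
    """
    if num_e + num_o_e > len(expert_trajs):
        raise ValueError("not enough expert trajectories")
    if num_o_o > len(other_trajs):
        raise ValueError("not enough non-expert trajectories")

    rng = random.Random(seed)
    expert = list(expert_trajs)
    other = list(other_trajs)
    rng.shuffle(expert)
    rng.shuffle(other)

    d_e = expert[:num_e]
    d_o = expert[num_e:num_e + num_o_e] + other[:num_o_o]
    return d_e, d_o


# Toy stand-ins for trajectories loaded from hopper-expert-v2 and hopper-random-v2.
expert_trajs = [("expert", i) for i in range(300)]
other_trajs = [("random", i) for i in range(2500)]

d_e, d_o = build_datasets(expert_trajs, other_trajs, num_e=1, num_o_e=200, num_o_o=2000)
expert_in_d_o = sum(1 for kind, _ in d_o if kind == "expert")
print(len(d_e), len(d_o), expert_in_d_o)
```

With the values from the command, D_o holds 2200 trajectories, of which 200 (about 9%) come from the expert, while D_e holds a single expert trajectory.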

BibTeX

@inproceedings{xu2022discriminator,
  title     = {Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations},
  author    = {Xu, Haoran and Zhan, Xianyuan and Yin, Honglei and Qin, Huiling},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {24725--24742},
  year      = {2022},
}
