forked from uber-research/deep-neuroevolution
Felipe Such committed Apr 20, 2018
1 parent 59533d4, commit d99f92d
Showing 40 changed files with 4,554 additions and 1 deletion.
@@ -6,3 +6,4 @@ env/
*.p
mujoco/mjpro131
*.h5
gym_tensorflow.so
@@ -0,0 +1,53 @@
## AI Labs - GPU Neuroevolution
This folder contains preliminary work done to implement GPU-based deep neuroevolution.
For problems like Atari, where policy evaluation takes a considerable amount of time, it is advantageous to use GPUs to evaluate the neural networks. This code shows how to run Atari simulations in parallel while evaluating neural networks in batches on the GPU, so that the CPU and GPU operate at the same time.

This folder contains prototype-stage code that still requires many changes to improve performance, maintainability, and testing. We welcome pull requests to this repo and plan to improve it in the future. Although it can run on CPU only, it is slower than our original implementation due to overhead. Once this implementation has matured, we plan to distribute it as a package for easy installation. We included an implementation of the HardMaze, but the GA-NS implementation will be added later on.

## Installation

Clone the repo:

```
git clone https://github.com/uber-common/deep-neuroevolution.git
```

Create a Python 3 virtual environment:

```
python3 -m venv env
. env/bin/activate
```

Install tensorflow or tensorflow-gpu (version > 1.2):
```
pip install tensorflow-gpu
```
Follow the instructions under ./gym_tensorflow/README on how to compile the optimized interfaces.

To train GA on Atari, just run:
```
python ga.py ga_atari_config.json
```
Random search (a special case of GA where 0 individuals become parents):
```
python ga.py ra_atari_config.json
```

Evolution Strategies:
```
python es.py es_atari_config.json
```

Visualizing policies is possible if you install gym with `pip install gym` and run:
```
python -m neuroevolution.display
```
We currently have one example policy, but more will be added in the future.

## Breakdown

* gym_tensorflow - Folder containing TensorFlow custom ops for Reinforcement Learning (Atari, Hard Maze).
    * Moving away from Python-based environments yields significant speedups in a multithreaded environment.
* neuroevolution - Folder containing source code to evaluate many policies simultaneously.
    * concurrent_worker.py - Improved implementation where each thread can evaluate a dynamically sized batch of policies at a time. Needs custom TensorFlow ops.
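
The idea of each thread evaluating a dynamically sized batch of policies can be illustrated with a small standard-library sketch. This is not the code in concurrent_worker.py: `evaluate_batch`, `worker`, `evaluate_population`, and `MAX_BATCH` are hypothetical names, and the evaluation is a stand-in function rather than a TensorFlow op. Each thread takes however many pending policies are available (up to a cap) and scores them together.

```python
import queue
import threading

MAX_BATCH = 4  # hypothetical cap on how many policies one thread evaluates at once


def evaluate_batch(policies):
    """Stand-in for a batched forward pass: scores each policy (a list of floats)."""
    return [sum(p) for p in policies]


def worker(tasks, results, lock):
    """Repeatedly pull a variable-sized batch of (id, policy) pairs and evaluate it."""
    while True:
        batch = []
        try:
            # Take whatever is available right now, up to MAX_BATCH items.
            while len(batch) < MAX_BATCH:
                batch.append(tasks.get_nowait())
        except queue.Empty:
            pass
        if not batch:
            return  # queue drained, nothing left for this thread
        scores = evaluate_batch([policy for _, policy in batch])
        with lock:
            for (policy_id, _), score in zip(batch, scores):
                results[policy_id] = score


def evaluate_population(population, num_threads=3):
    """Evaluate all policies using several threads with dynamic batch sizes."""
    tasks = queue.Queue()
    for policy_id, policy in enumerate(population):
        tasks.put((policy_id, policy))
    results, lock = {}, threading.Lock()
    threads = [threading.Thread(target=worker, args=(tasks, results, lock))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [results[i] for i in range(len(population))]


population = [[float(i), 1.0] for i in range(10)]
fitness = evaluate_population(population)
```

Because all tasks are queued before the threads start, the batch a thread receives depends only on what is left in the queue at that moment, which is what makes the batch size dynamic.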
@@ -0,0 +1,18 @@
{
    "game": "frostbite",
    "model": "ModelVirtualBN",
    "num_validation_episodes": 30,
    "num_test_episodes": 200,
    "population_size": 5000,
    "timesteps": 250e6,
    "episode_cutoff_mode": 5000,
    "return_proc_mode": "centered_rank",
    "l2coeff": 0.005,
    "mutation_power": 0.02,
    "optimizer": {
        "args": {
            "stepsize": 0.01
        },
        "type": "adam"
    }
}
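
The `"return_proc_mode": "centered_rank"` setting above most likely refers to the rank-based return shaping commonly used in Evolution Strategies implementations. The sketch below assumes that interpretation; `centered_ranks` is a hypothetical helper, not a function taken from this repository. It also shows that a value such as `250e6` is a valid JSON number and is parsed as a float.

```python
import json

# A subset of the configuration above, parsed with the standard library.
config = json.loads('{"return_proc_mode": "centered_rank", "timesteps": 250e6}')


def centered_ranks(returns):
    """Map raw episode returns to evenly spaced values in [-0.5, 0.5] by rank.

    Assumption: this mirrors the usual ES rank shaping, which makes the
    update insensitive to the scale of the returns and to outliers.
    """
    n = len(returns)
    order = sorted(range(n), key=lambda i: returns[i])
    ranks = [0] * n
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return [r / (n - 1) - 0.5 for r in ranks]


shaped = centered_ranks([10.0, -3.0, 250.0, 40.0, 0.0])
print(shaped)  # [0.0, -0.5, 0.5, 0.25, -0.25]
```

Note that the outlier return of 250.0 receives the same shaped value it would have received if it were 41.0, since only the ordering matters.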
@@ -0,0 +1,12 @@
{
    "game": "frostbite",
    "model": "Model",
    "num_validation_episodes": 30,
    "num_test_episodes": 200,
    "population_size": 1000,
    "episode_cutoff_mode": 5000,
    "timesteps": 1.5e9,
    "validation_threshold": 10,
    "mutation_power": 0.002,
    "selection_threshold": 20
}
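
To make the roles of `selection_threshold` and `mutation_power` concrete, here is a minimal sketch of one generation of a truncation-selection GA. It assumes that `selection_threshold` is the number of top individuals allowed to become parents (consistent with the README's description of random search as the case where 0 individuals become parents) and that `mutation_power` scales Gaussian noise added to a parent's parameters. The function names are hypothetical, and elitism and `validation_threshold` are not modeled.

```python
import random


def random_individual(dim, rng):
    """Fresh parameter vector drawn from a unit Gaussian (an assumed initializer)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def mutate(parent, mutation_power, rng):
    """Return a copy of the parent with Gaussian noise of scale mutation_power."""
    return [w + mutation_power * rng.gauss(0.0, 1.0) for w in parent]


def next_generation(population, fitnesses, selection_threshold, mutation_power, rng):
    """Build the next population from the top `selection_threshold` individuals."""
    ranked = sorted(range(len(population)), key=lambda i: fitnesses[i], reverse=True)
    parents = [population[i] for i in ranked[:selection_threshold]]
    dim = len(population[0])
    children = []
    for _ in range(len(population)):
        if parents:
            children.append(mutate(rng.choice(parents), mutation_power, rng))
        else:
            # With zero parents nothing is inherited: each child is a fresh sample.
            children.append(random_individual(dim, rng))
    return children


rng = random.Random(0)
population = [random_individual(3, rng) for _ in range(8)]
fitnesses = [-sum(w * w for w in ind) for ind in population]  # toy fitness
children = next_generation(population, fitnesses,
                           selection_threshold=2, mutation_power=0.002, rng=rng)
```

The toy run uses 8 individuals and 2 parents for brevity; the configuration above uses a population of 1000 with 20 parents.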
@@ -0,0 +1,12 @@
{
    "game": "frostbite",
    "model": "Model",
    "num_validation_episodes": 30,
    "num_test_episodes": 200,
    "population_size": 1000,
    "episode_cutoff_mode": 5000,
    "timesteps": 1.5e9,
    "validation_threshold": 10,
    "mutation_power": 0.002,
    "selection_threshold": 0
}
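
The two 12-line configurations above appear to be the GA and random-search configurations mentioned in the README (the file names are not shown here, so that mapping is an assumption). A quick way to confirm how they differ is to parse both and compare keys; `config_diff` is a hypothetical helper written for this illustration.

```python
import json

# Copies of the two configurations shown above.
GA_CONFIG = json.loads("""{
    "game": "frostbite", "model": "Model",
    "num_validation_episodes": 30, "num_test_episodes": 200,
    "population_size": 1000, "episode_cutoff_mode": 5000,
    "timesteps": 1.5e9, "validation_threshold": 10,
    "mutation_power": 0.002, "selection_threshold": 20
}""")
RS_CONFIG = json.loads("""{
    "game": "frostbite", "model": "Model",
    "num_validation_episodes": 30, "num_test_episodes": 200,
    "population_size": 1000, "episode_cutoff_mode": 5000,
    "timesteps": 1.5e9, "validation_threshold": 10,
    "mutation_power": 0.002, "selection_threshold": 0
}""")


def config_diff(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


diff = config_diff(GA_CONFIG, RS_CONFIG)
print(diff)  # {'selection_threshold': (20, 0)}
```

Only `selection_threshold` changes, which matches the README's description of random search as a GA in which 0 individuals become parents.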