Deep learning (DL) training is nondeterministic: both the DL algorithm and the DL software implementation introduce nondeterminism to improve training efficiency and model accuracy. Prior work shows that both types of nondeterminism cause significant variance in model accuracy (up to 10.8%) and training time between identical training runs. Such variance may affect the validity of new DL techniques proposed in the research community and the validity of their comparison against baselines. To ensure such validity, DL researchers and practitioners must replicate their experiments multiple times with identical settings to quantify the variance of the proposed approaches and baselines. Replicating and measuring DL variance reliably and efficiently is challenging and understudied. We propose a ready-to-deploy framework, DEVIATE, that (1) measures the DL training variance of a DL model with minimal manual effort, and (2) provides statistical tests of both accuracy and variance. Specifically, DEVIATE automatically analyzes the DL training code and extracts important metrics to be monitored (such as accuracy and loss). In addition, DEVIATE performs popular statistical tests and provides users with a report of statistical p-values and effect sizes, along with various confidence levels, when comparing against selected baselines.
We provide a demo video of the tool that shows how to use DEVIATE on the sample project.
DEVIATE requires the following:
- System with Linux OS, preferably Ubuntu 16.04 or later
- Python 3.6 or a conda environment as specified by deviate.yml (see the sketch after this list)
- Docker with GPU support
- Experiment source code written in Python
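If you use conda, the environment specified by deviate.yml can be created and activated as follows. This is a minimal sketch; the environment name `deviate` is an assumption and may differ from the name declared in deviate.yml.

```sh
# Create the conda environment from the provided specification
conda env create -f deviate.yml

# Activate it (the environment name "deviate" is an assumption; check deviate.yml)
conda activate deviate
```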
The sample project is the adversarial training project described in the paper. Its source code is under `sample_project/fast_cifar10`. To create the Docker image needed to run this experiment:
- Download the three parts of the tar file that contains the Docker image from these links: part1, part2, and part3
- `cd` to the download folder
- Merge the parts using `cat cifar10_challenge.tar.parta* > cifar10_challenge.tar`
- Load the image using `docker load --input cifar10_challenge.tar`
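After loading, you can confirm that the image is available; the image name `cifar10_challenge` is an assumption based on the tar file name.

```sh
# List the loaded image to confirm it is available
# (the image name cifar10_challenge is assumed from the tar file name)
docker images cifar10_challenge
```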
Steps to start using DEVIATE
- Activate the deviate conda environment if applicable (the one created using deviate.yml)
- Start DEVIATE by running `python deviate.py` from the DEVIATE folder; a combined launch example appears after this list
- On first start, DEVIATE will ask for a working directory in which to save all of DEVIATE's data, including configuration files, replicated runs' data, and any analysis results
- DEVIATE will display any existing experiments and give the user the following options:
  - a: Add a new experiment
  - v: View an existing experiment
  - c: Perform a comparison if there is more than one experiment
  - n: Exit DEVIATE
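If you installed the conda environment from deviate.yml, the launch steps above can be combined as in the following sketch; the environment name `deviate` and the path to the DEVIATE folder are assumptions.

```sh
# Activate the conda environment created from deviate.yml
# (the environment name "deviate" is an assumption; check deviate.yml)
conda activate deviate

# Run DEVIATE from the folder that contains deviate.py
cd /path/to/DEVIATE
python deviate.py
```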
To add a new experiment, DEVIATE requires the following information. We give examples based on the sample adversarial training project described earlier; $DEVIATE_HOME denotes the location of the DEVIATE source folder. A filled-in example is sketched after the list.
- id: A unique ID of the experiment that will be used in later analysis. The user should choose an id descriptive enough to differentiate the experiments.
- description: A short description of the experiment
- no_tries: The number of replications of the experiment
- source_dir: The full absolute path to the source directory that contains the experiment code (e.g., `$DEVIATE_HOME/sample_project/fast_cifar10`)
- train_file: The training source code file that DEVIATE can analyze to extract the metrics to be monitored (e.g., `train.py`)
- train_command: The command to execute the training (e.g., `python train.py RN32 FGSM` to train Resnet32 with an FGSM attack)
- eval_file: An optional evaluation source code file that can evaluate the trained models
- eval_command: A list of optional evaluation commands to evaluate the trained models
- docker_env: The Docker image that can be used to execute the training and evaluation (e.g., `cifar10_challenge`)
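For the sample project, these fields might be filled in roughly as follows. This is a hypothetical sketch: the id, description, no_tries, and empty evaluation entries are made up for illustration; only the path, commands, and image name come from the examples above.

```
id: fast_cifar10_fgsm                                  # hypothetical id
description: Fast adversarial training on CIFAR-10 with FGSM   # hypothetical
no_tries: 10                                           # hypothetical number of replications
source_dir: $DEVIATE_HOME/sample_project/fast_cifar10
train_file: train.py
train_command: python train.py RN32 FGSM
eval_file:                                             # optional, left empty here
eval_command:                                          # optional, left empty here
docker_env: cifar10_challenge
```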
The added experiment will appear in the list of existing experiments.
- Select it to view the details of the new experiment
- DEVIATE gives the user the following options:
  - v: View the progress of the runs if they are not completed yet
  - e: Edit the experiment (e.g., add more replication runs, change/add experiment commands)
  - x: Extract the metrics and modify the source code
  - s: Schedule the experimental runs
  - a: Analyze the variance of the experiment
  - n: Exit to main menu
- For a newly created experiment, the user should select x to extract the metrics and modify the source code. Follow the prompts to inspect the extracted metrics and make appropriate modifications if necessary.
- Once the source code files are modified, the user should select s to schedule the runs.
- The user can monitor the status of the runs using the v option.
DEVIATE can perform variance analysis of a single experiment.
- Once the runs are completed, the user can perform variance analysis using the a option in the experiment view. DEVIATE will inform the user of the location of the analysis results once this is done.
DEVIATE also performs comparisons between different experiments and provides statistical tests to confirm whether the differences are significant.
- To do this, select c in the main menu to perform a comparison
- To add a new comparison, DEVIATE requires the following information (a hypothetical example is sketched after this list):
  - id: A unique ID of the comparison
  - description: A short description of the comparison
  - experiment_1: The id of the first experiment
  - experiment_2: The id of the second experiment
- To perform the comparison, select c and then the index of the comparison; DEVIATE will inform the user of the location of the comparison results
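For instance, a comparison between two previously added experiments might be specified as follows. This is a hypothetical sketch; both experiment ids are made-up names and must match the ids of experiments that have already been added.

```
id: fast_cifar10_variance_comparison           # hypothetical comparison id
description: Compare training variance of two sample experiments   # hypothetical
experiment_1: fast_cifar10_fgsm                # hypothetical id of the first experiment
experiment_2: fast_cifar10_fgsm_v2             # hypothetical id of the second experiment
```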