This repository contains code for running simulations to make design decisions for the Oralytics RL algorithm. Full details can be found in the Oralytics Algorithm Report or in Appendix A of the Oralytics protocol paper.
If you use our data or code in any way, please cite us:
@article{nahum2024optimizing,
  title={Optimizing an adaptive digital oral health intervention for promoting oral self-care behaviors: Micro-randomized trial protocol},
  author={Nahum-Shani, Inbal and Greer, Zara M and Trella, Anna L and Zhang, Kelly W and Carpenter, Stephanie M and Ruenger, Dennis and Elashoff, David and Murphy, Susan A and Shetty, Vivek},
  journal={Contemporary Clinical Trials},
  volume={139},
  pages={107464},
  year={2024},
  publisher={Elsevier}
}
@misc{trella2024oralytics,
  title={Oralytics Reinforcement Learning Algorithm},
  author={Anna L. Trella and Kelly W. Zhang and Stephanie M. Carpenter and David Elashoff and Zara M. Greer and Inbal Nahum-Shani and Dennis Ruenger and Vivek Shetty and Susan A. Murphy},
  year={2024},
  eprint={2406.13127},
  archivePrefix={arXiv},
  primaryClass={cs.AI}
}
To make the final design decisions for the Oralytics algorithm used in the main study, we ran simulations in the Oralytics V2 simulation environment test bed. V2 was built on ROBAS 3 data and includes simulated app-opening behavior. Although not used in the final design phase, we also include code for V1, a previous (now deprecated) version, and V3, a version built on Oralytics pilot data that was not needed for our simulations but may still be useful to other research teams.
Code for building and evaluating the V2 simulation test bed:
- Fitting parameters for the simulation environment base model: `v1v2_fitting_environment_models.py`
- Selecting the best environment model for each user: `v1v2_env_base_model_selection.py`
- Calculating realistic effect sizes to impute: `calculate_effect_sizes.py`
- Computing statistics and checking how reasonable the imputed effect sizes are: `effect_size_check.py`
- Calculating a population-level app opening probability using Oralytics pilot data: `app_opening_prob_calculation.py`
The procedure for creating the V2 simulation environment test bed is as follows:

1. Running `python3 dev_scripts/sim_env_v1v2/v1v2_fitting_environment_models.py` will fit each ROBAS 3 user to a base model class (a hurdle model with a square root transform or a zero-inflated Poisson model) and save the parameters associated with each model to CSV files.
2. Running `python3 dev_scripts/sim_env_v1v2/v1v2_env_base_model_selection.py` will take the CSV files generated in step 1, compute the RMSE between each model and the observed ROBAS 3 data, and choose the base model class that yields the lower RMSE for that user. Outputs: `stat_user_models.csv` and `non_stat_user_models.csv`.
3. Running `python3 dev_scripts/sim_env_v1v2/calculate_effect_sizes.py` will take in `stat_user_models.csv` and `non_stat_user_models.csv` and calculate effect sizes to impute based on each environment's fitted parameters. Outputs are in the `sim_env_data` folder with the suffix pattern `effect_sizes.p`.
4. Running `python3 dev_scripts/sim_env_v1v2/effect_size_check.py` will take in `stat_user_models.csv`, `non_stat_user_models.csv`, and the raw ROBAS 3 data and calculate standard effect sizes to compare with the imputed effect sizes created in step 3.
5. Running `python3 dev_scripts/sim_env_v3/app_opening_prob_calculation.py` will take in Oralytics pilot data and calculate app opening probabilities for each participant in the pilot study. This value was used to determine the population-level app opening probability we imputed in the V2 test bed.
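The per-user model selection in step 2 boils down to an RMSE comparison between the two fitted base model classes. Below is a minimal sketch of that idea; the function and model-class names are hypothetical illustrations, not the repository's actual code.

```python
import numpy as np

def rmse(predicted, observed):
    """Root mean squared error between model predictions and observed data."""
    predicted, observed = np.asarray(predicted), np.asarray(observed)
    return np.sqrt(np.mean((predicted - observed) ** 2))

def select_base_model(hurdle_preds, zip_preds, observed):
    """Choose the base model class with the lower RMSE for a single user."""
    errors = {
        "hurdle_sqrt": rmse(hurdle_preds, observed),
        "zero_inflated_poisson": rmse(zip_preds, observed),
    }
    return min(errors, key=errors.get)

# Toy example with made-up predictions for one user:
observed = [0, 120, 90, 0, 60]
hurdle_preds = [10, 110, 85, 5, 70]
zip_preds = [40, 80, 40, 40, 40]
print(select_base_model(hurdle_preds, zip_preds, observed))  # hurdle_sqrt
```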
Experiments can be run sequentially (one at a time) or in parallel.
To run experiments:
- Fill in the read and write paths in `read_write_info.py`. This specifies the path to read data from and the path to write results to.
- In `run.py`, specify experiment parameters as instructed in the file (for example, the simulation environment variants and algorithm candidate properties). You must modify the `JOB_TYPE` field to specify which job to run. In addition, you must modify the `DRYRUN` field to specify whether jobs run sequentially or in parallel. `DRYRUN = True` runs jobs one after the other (good practice when testing out new code); switch to `DRYRUN = False` to run experiments in parallel.
There are 4 types of jobs:
- `simulations`: runs the main set of experiments for each algorithm candidate in each simulation environment variant
- `compute_metrics`: computes desired metrics using outputs from `simulations`
- `hyper_tuning`: runs the hyperparameter tuning for the reward parameters after finalizing the algorithm candidate
- `hyper_plots`: computes hyperparameter grid plots using outputs from `hyper_tuning`

For each job type, there are commented-out example lists called `QUEUE`. Please comment/uncomment to fit your job type, but make sure only one `QUEUE` is uncommented at a time.
- Run `python3 src/submit_batch` on the cluster to submit jobs and run them in parallel.
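To illustrate the configuration described above: the field names `JOB_TYPE`, `DRYRUN`, and `QUEUE` come from this README, but the values and list structure below are placeholders, so check the comments in the actual `run.py` for the supported settings.

```python
# Hypothetical sketch of the fields to edit in run.py -- placeholder
# values, not the repository's actual configuration.
JOB_TYPE = "simulations"  # one of: "simulations", "compute_metrics",
                          # "hyper_tuning", "hyper_plots"
DRYRUN = True             # True: run jobs one after the other (good for testing);
                          # False: run jobs in parallel

# Exactly one QUEUE list should be uncommented for the chosen JOB_TYPE.
# These entries are placeholders, not real experiment names.
QUEUE = [
    ("sim_env_variant_1", "algorithm_candidate_1"),
    ("sim_env_variant_1", "algorithm_candidate_2"),
]
# QUEUE = [...]  # e.g., a different list for hyper_tuning jobs
```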
For the Oralytics main study, we designed the prior based on Oralytics pilot data and in discussion with domain experts. `pilot_prior_formation.py` contains the code for calculating the statistics and plots that informed the design of the prior. For knowledge sharing, we have kept additional scripts in the prior formation folder, which contain code using GEE analysis as a measure for significance testing.
To run tests, you need to be in the root folder. For example, run `python3 -m unittest test.test_rl_experiments` to run the `test_rl_experiments.py` file in the `test` folder.