Oralytics Algorithm Design Decisions

This repository contains code for running simulations to make design decisions for the Oralytics RL algorithm. Full details can be found in the Oralytics Algorithm Report or in Appendix A of the Oralytics protocol paper.

Citing Our Code

If you use our data or code in any way, please cite us:

@article{nahum2024optimizing,
  title={Optimizing an adaptive digital oral health intervention for promoting oral self-care behaviors: Micro-randomized trial protocol},
  author={Nahum-Shani, Inbal and Greer, Zara M and Trella, Anna L and Zhang, Kelly W and Carpenter, Stephanie M and Ruenger, Dennis and Elashoff, David and Murphy, Susan A and Shetty, Vivek},
  journal={Contemporary Clinical Trials},
  volume={139},
  pages={107464},
  year={2024},
  publisher={Elsevier}
}
@misc{trella2024oralytics,
      title={Oralytics Reinforcement Learning Algorithm}, 
      author={Anna L. Trella and Kelly W. Zhang and Stephanie M. Carpenter and David Elashoff and Zara M. Greer and Inbal Nahum-Shani and Dennis Ruenger and Vivek Shetty and Susan A. Murphy},
      year={2024},
      eprint={2406.13127},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

Running Our Code

Simulation Environment

To make the final design decisions for the Oralytics algorithm used in the Oralytics main study, we ran simulations using the Oralytics V2 simulation environment test bed. V2 was built from ROBAS 3 data and includes simulated app-opening behavior. Although they were not used in the final design phase, we also include code for V1, a previous (now deprecated) version, and V3, a version built from Oralytics pilot data; V3 was not needed for our simulations but could still be useful for other research teams.

The procedure for building and evaluating the V2 simulation environment test bed is as follows:

  1. Running python3 dev_scripts/sim_env_v1v2/v1v2_fitting_environment_models.py fits each ROBAS 3 user to a base model class (a hurdle model with a square-root transform or a zero-inflated Poisson model) and saves the parameters associated with each model to CSV files.

  2. Running python3 dev_scripts/sim_env_v1v2/v1v2_env_base_model_selection.py takes the CSV files generated in step 1, computes the RMSE between each fitted model and the observed ROBAS 3 data, and selects, for each user, the base model class that yields the lower RMSE (see the model-selection sketch after this list). Outputs: stat_user_models.csv and non_stat_user_models.csv.

  3. Running python3 dev_scripts/sim_env_v1v2/calculate_effect_sizes.py takes in stat_user_models.csv and non_stat_user_models.csv and calculates the effect sizes to impute, based on each environment's fitted parameters. Outputs are written to the sim_env_data folder with the suffix pattern effect_sizes.p.

  4. Running python3 dev_scripts/sim_env_v1v2/calculate_effect_sizes.py takes in stat_user_models.csv, non_stat_user_models.csv, and the raw ROBAS 3 data and calculates standard effect sizes to compare with the imputed effect sizes created in step 3.

  5. Running python3 dev_scripts/sim_env_v3/app_opening_prob_calculation.py takes in Oralytics pilot data and calculates app-opening probabilities for each participant in the pilot study. These values were used to determine the population-level app-opening probability we imputed in the V2 test bed.
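Below is a minimal sketch of the per-user model selection performed in step 2, assuming each model's per-session predictions and the observed outcomes live in data frames that share user_id and brushing_quality columns. The column names and helper functions here are illustrative assumptions, not the repo's actual implementation:

```python
import numpy as np
import pandas as pd

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted outcomes."""
    return np.sqrt(np.mean((np.asarray(observed) - np.asarray(predicted)) ** 2))

def select_base_models(observed_df, hurdle_preds_df, zip_preds_df):
    """For each user, keep the base model class with the lower RMSE.

    Assumes all three data frames have 'user_id' and a per-session
    outcome column 'brushing_quality' (hypothetical names).
    """
    rows = []
    for user_id, obs in observed_df.groupby("user_id"):
        y = obs["brushing_quality"]
        hurdle_rmse = rmse(
            y, hurdle_preds_df.loc[hurdle_preds_df["user_id"] == user_id, "brushing_quality"]
        )
        zip_rmse = rmse(
            y, zip_preds_df.loc[zip_preds_df["user_id"] == user_id, "brushing_quality"]
        )
        rows.append({
            "user_id": user_id,
            "model_class": "hurdle_sqrt" if hurdle_rmse < zip_rmse else "zero_inflated_poisson",
            "hurdle_rmse": hurdle_rmse,
            "zip_rmse": zip_rmse,
        })
    return pd.DataFrame(rows)
```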

Running Experiments

Experiments can be run sequentially one at a time or in parallel.

To run experiments:

  1. Fill in the read and write paths in read_write_info.py. These specify the path to read data from and the path to write results to.
  2. In run.py, specify the experiment parameters as instructed in the file (for example, the simulation environment variants and the algorithm candidate properties). Set the JOB_TYPE field to the job you want to run, and set the DRYRUN field to choose between sequential and parallel execution: DRYRUN = True runs jobs one after the other (a good practice when testing new code), while DRYRUN = False runs experiments in parallel.

There are 4 types of jobs:

  • simulations: runs the main set of experiments for each algorithm candidate in each simulation environment variant
  • compute_metrics: computes desired metrics using outputs from simulations
  • hyper_tuning: runs the hyperparameter tuning for the reward parameters after finalizing the algorithm candidate
  • hyper_plots: computes hyperparameter grid plots using outputs from hyper_tuning

For each job type, there are commented-out example lists called QUEUE. Comment/uncomment these to fit your job type, but make sure only one QUEUE is uncommented at a time (see the run.py sketch after step 3).

  3. Run python3 src/submit_batch on the cluster to submit the jobs and run them in parallel.
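The following is an illustrative run.py configuration. The JOB_TYPE, DRYRUN, and QUEUE fields come from the instructions above, but the specific values and variant names are assumptions rather than the repo's defaults:

```python
# Illustrative run.py settings (field names from the instructions above;
# the values and variant names below are hypothetical).

# One of: "simulations", "compute_metrics", "hyper_tuning", "hyper_plots"
JOB_TYPE = "simulations"

# True: run jobs one after the other (good practice when testing new code).
# False: run experiments in parallel.
DRYRUN = True

# Exactly one QUEUE list should be uncommented at a time. Each entry pairs
# a simulation environment variant with an algorithm candidate.
QUEUE = [
    ("sim_env_v2_stat", "algorithm_candidate_1"),
    ("sim_env_v2_non_stat", "algorithm_candidate_1"),
]
# QUEUE = [("sim_env_v2_stat", "reward_param_grid")]  # hyper_tuning example
```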

Fitting the Prior

For the Oralytics main study, we designed the prior based on Oralytics pilot data and in discussion with domain experts. pilot_prior_formation.py contains the code for calculating the statistics and producing the plots that informed the design of the prior. For knowledge sharing, we have kept additional scripts in the prior formation folder, which contains code using GEE analysis as a measure for significance testing (a sketch of such an analysis follows below).
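As a rough illustration of GEE-based significance testing in this spirit, here is a minimal sketch using statsmodels. The outcome, treatment, and grouping column names and the model formula are assumptions for illustration, not the repo's actual analysis:

```python
import statsmodels.api as sm

def gee_significance_test(df):
    """Fit a GEE with an exchangeable working correlation over repeated
    measurements per user and report coefficients and p-values.

    Assumes df has columns 'brushing_quality' (outcome), 'action'
    (treatment indicator), and 'user_id' (cluster variable) -- all
    hypothetical names.
    """
    model = sm.GEE.from_formula(
        "brushing_quality ~ action",
        groups="user_id",
        data=df,
        family=sm.families.Gaussian(),
        cov_struct=sm.cov_struct.Exchangeable(),
    )
    result = model.fit()
    return result.summary()
```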

Running Unit Tests

To run tests, start from the root folder and run, for example, python3 -m unittest test.test_rl_experiments to execute the test_rl_experiments.py file in the test folder.
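For reference, a test module under test/ follows the standard unittest pattern; this is a hypothetical minimal example, not one of the repo's actual tests:

```python
import unittest

class TestRLExperiments(unittest.TestCase):
    """Hypothetical minimal test case; the repo's actual tests differ."""

    def test_sanity(self):
        # Replace with assertions against the experiment code under src/.
        self.assertEqual(1 + 1, 2)

if __name__ == "__main__":
    unittest.main()
```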
