PyTorch implementation of the VAE-GAM model for task-based fMRI analysis, described in: https://proceedings.mlr.press/v149/albuquerque21a
This model provides a more flexible means to analyze task-based fMRI data stemming from designed experiments in cognitive, systems
and clinical neuroscience.
To get started with this code, clone this repository and make sure you have all the dependencies listed in dependencies.txt installed.
The scripts allow users to:
1)Create required files for model training.
2)Create synthetic data for the synthetic signal simulations demonstrated in the paper.
3)Train VAE-GAM model on either synthetic or real data.
4)Generate latent space plots, plots for the Gaussian Process regressors and single-volume reconstructions,
as well as subject-level and group-level average covariate, base and
full-reconstruction maps.
The pre-processing scripts assume the user has ALREADY performed basic pre-processing on their fMRI dataset using standard neuroimaging software (e.g., FSL). For the experiments showcased in the paper, we used fmriprep for the preprocessing steps. Additional details on the specific routines/parameters used for fMRI preprocessing can be provided upon request, along with the preprocessed nifti files we used to run the checker experiments/simulations.
To generate the csv_file to be used by DataClass and Loaders:
1)Run pre_proc_vaefmri.py
python pre_proc_vaefmri.py --data_dir {dir w/ your preprocessed fMRI data} --save_dir {dir where you want your csv_file saved to} --nii_file_pattern {your_nii_filename_pattern} --mot_file_pattern {your_mot_filename_pattern}
The last 2 flags refer to filename patterns for your pre-processed fMRI data (assumes nifti format!) and for the motion files generated during preprocessing.
Users can also pass the --control flag to generate a csv_file for a control experiment simulation. In this case, the intensity of the synthetic control signal present in the data should be passed using the --control_int flag. Finally, this program also adds either a 'TRAIN' or 'TEST' tag to the generated preprocessed filename, indicating whether the csv corresponds to the train or test set. Default is 'TRAIN'; to generate the csv for the test set, run the above command with --set_tag set to 'TEST'.
2)Run get_beta_map_regularizer.py.
This script will generate a rough map (a least-squares estimate) using the preprocessed fMRI data and their corresponding design matrices (produced using standard GLM software like FSL). This rough map is used as a regularizer, to encourage our model to produce maps that are not too far from the main effects expected under the GLM approach. This regularizer was not needed when running the control experiments, only for actual biological signals like the V1 signal/experiments showcased in the paper.
python get_beta_map_regularizer.py --root_dir {dir w/ pre-processed fMRI data} --output_dir {dir where you want the lsqrs map saved to.} --data_dims {x, y, z, time - these are the dimensions for your 4D fMRI data.}
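The least-squares estimate described in this step can be sketched as follows. This is a minimal illustration, not the code in get_beta_map_regularizer.py: it assumes numpy, a 4D data array of shape (x, y, z, time) and a design matrix of shape (time, n_regressors). The function name and both argument names are hypothetical.

```python
import numpy as np


def least_squares_beta_map(data_4d, design):
    """Voxel-wise ordinary least-squares fit (illustrative sketch only).

    data_4d: array of shape (x, y, z, time)
    design:  array of shape (time, n_regressors)
    returns: array of shape (x, y, z, n_regressors)
    """
    data_4d = np.asarray(data_4d, dtype=float)
    design = np.asarray(design, dtype=float)
    x, y, z, t = data_4d.shape
    if design.shape[0] != t:
        raise ValueError("design matrix rows must match the number of time points")
    # Flatten space so that each column holds one voxel's time series: (time, n_voxels)
    series = data_4d.reshape(-1, t).T
    # Solve min ||design @ betas - series||^2 for all voxels at once
    betas, *_ = np.linalg.lstsq(design, series, rcond=None)
    # betas has shape (n_regressors, n_voxels); restore the spatial dimensions
    return betas.T.reshape(x, y, z, design.shape[1])
```

Each slice along the last axis of the result is one rough map (one per column of the design matrix).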
3)Train model
python multsubj_reg_run_GP.py --train_csv {csv file for train set} --test_csv {csv file for test set} --save_dir {dir where model checkpoints, latent space plots, GP plots
and reconstructions/maps will be saved to} --glm_maps {path to lstsqrs map generated in step 2 above.}
The model can also be trained starting from a pre-existing checkpoint file. If this behavior is desired, simply add the flag --from_ckpt to the command above and give the
location of the checkpoint file to be used with --ckpt_path {my_ckpt} .
Users can also choose to simply generate latent space plots, GP plots and map reconstructions from a pre-existing checkpoint file (without training the model
further). If this is desired, add the --recons_only flag to the command above, along with the --from_ckpt flag and the location of the
checkpoint file to be used when creating model outputs: --ckpt_path {my_ckpt} .
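The flag combinations above can be assembled programmatically, for example when launching several runs from a driver script. The sketch below uses only the flags named in this README and assumes that --from_ckpt and --recons_only are boolean switches that take no value. The helper name build_train_command is hypothetical and is not part of this repository.

```python
def build_train_command(train_csv, test_csv, save_dir, glm_maps,
                        ckpt_path=None, recons_only=False):
    """Build the argument list for multsubj_reg_run_GP.py (illustrative helper)."""
    if recons_only and ckpt_path is None:
        # Reconstruction-only runs need a checkpoint to load the model from
        raise ValueError("recons_only requires ckpt_path")
    cmd = [
        "python", "multsubj_reg_run_GP.py",
        "--train_csv", train_csv,
        "--test_csv", test_csv,
        "--save_dir", save_dir,
        "--glm_maps", glm_maps,
    ]
    if ckpt_path is not None:
        # Assumed: --from_ckpt is a switch, --ckpt_path carries the file location
        cmd += ["--from_ckpt", "--ckpt_path", ckpt_path]
    if recons_only:
        cmd.append("--recons_only")
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root.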
You may wish to change other model parameters, such as the weight for the GP KL regularization, the weight for the lstsqrs map regularizer or the number of inducing points for the GP regressors. You may do so
by changing the values of --gp_kl_scale , --glm_reg_scale and --num_inducing_pts respectively. However, we do not advise doing so unless you fully understand the rest of the code.
Finally, this script also takes a --neural_covariates flag, which indicates whether the covariates passed are real/biological or synthetic. Default is 'True', meaning the code will treat the covariates passed as real, biologically relevant signals, which will be convolved with the HRF. Note that the last 6 covariates passed in the csv file are assumed to be motion-related nuisance covariates. These are NEVER convolved with the HRF (regardless of the choice for this flag).
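The covariate handling described above can be sketched as follows. This is a minimal illustration, not the code in multsubj_reg_run_GP.py. It assumes numpy, a covariate matrix of shape (time, n_covariates) with the 6 motion columns last, and a hypothetical double-gamma HRF whose shape parameters are chosen for illustration only. The function names and the tr argument (repetition time in seconds) are assumptions. It also assumes that synthetic covariates (neural_covariates set to False) are left unconvolved, which the description implies but does not state.

```python
import math

import numpy as np


def double_gamma_hrf(tr, duration=32.0):
    """Sample an illustrative double-gamma HRF every `tr` seconds."""
    t = np.arange(0.0, duration, tr)
    peak = t ** 5 * np.exp(-t) / math.gamma(6)          # main response, mode at 5 s
    undershoot = t ** 15 * np.exp(-t) / math.gamma(16)  # late undershoot, mode at 15 s
    hrf = peak - undershoot / 6.0
    return hrf / hrf.sum()  # normalize to unit sum


def convolve_task_covariates(covariates, tr, neural_covariates=True, n_motion=6):
    """Convolve task columns with the HRF; never touch the trailing motion columns."""
    covariates = np.asarray(covariates, dtype=float)
    out = covariates.copy()
    if not neural_covariates:
        return out  # assumed behavior: synthetic covariates are used as-is
    n_time, n_cov = covariates.shape
    hrf = double_gamma_hrf(tr)
    for j in range(n_cov - n_motion):
        # Full convolution, truncated back to the original number of time points
        out[:, j] = np.convolve(covariates[:, j], hrf)[:n_time]
    return out
```

For a matrix with two task columns followed by six motion columns, only the first two columns of the result differ from the input.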
4)Adding synthetic signals to existing data.
To construct data sets with the synthetic signals shown in the paper, run the following command:
python add_control_signal.py --root_dir {dir with preprocessed fMRI data we wish to add synthetic signal to} --intensity {intensity of added signal} --shape {'Large3'} --nii_file_pattern {filename pattern for nifti files under root_dir to be used.}
Of note, this script WILL NOT overwrite the data under --root_dir .
Instead, it will write the data with the synthetic signal to --root_dir, using the same name as the original plus the suffix 'ALTERED_Large3_intensity_simple_ts_date_stamp.nii.gz'
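The general idea of adding a synthetic signal without modifying the original data can be sketched as follows. This is a hypothetical illustration and does not reproduce add_control_signal.py: the function name, the boolean mask standing in for the signal shape and the time_series argument are all assumptions, and the actual 'Large3' shape and time course are defined by the script itself. The sketch assumes numpy and a 4D array of shape (x, y, z, time).

```python
import numpy as np


def add_synthetic_signal(data_4d, mask_3d, time_series, intensity):
    """Return a copy of data_4d with intensity * time_series added inside mask_3d."""
    data_4d = np.asarray(data_4d, dtype=float)
    mask_3d = np.asarray(mask_3d, dtype=bool)
    time_series = np.asarray(time_series, dtype=float)
    if mask_3d.shape != data_4d.shape[:3]:
        raise ValueError("mask must match the spatial dimensions of the data")
    if time_series.shape != (data_4d.shape[3],):
        raise ValueError("time series must have one value per volume")
    altered = data_4d.copy()  # the input array is left untouched
    # Boolean indexing selects (n_masked_voxels, time); the time series broadcasts over voxels
    altered[mask_3d] += intensity * time_series
    return altered
```

The altered array would then be saved under a new filename, so the original file is never overwritten.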
For any questions regarding this repository, the paper, replicating our simulations or extending this work please contact Daniela de Albuquerque -- dfd4@duke.edu.