manifold-GA

Copyright (c) 2019 Russell Fung & Abbas Ourmazd. All Rights Reserved.

manifold-GA files & directory structure

  1. Download the following MATLAB® m-files, and place them in a directory named manifold-ga.
    1. dmat.m
    2. parse_string.m
    3. plotRF_integer_bar_chart.m
    4. update_preference.m
    5. manifold_GA.m
    6. manifold_GA_2_visits.m
    7. histcounts.m (for GNU Octave users)
    8. manifold_GA_batch_mode.m
  2. Inside the manifold-ga directory, create a subdirectory named trained_model.
  3. Inside the trained_model directory, place the MATLAB® mat-files for each trained model together in an appropriately named subdirectory. These mat-files can be downloaded here. The models have been trained on different subsets of the INTERGROWTH-21st data.
  4. The trained model 180403d_8145 is used by default, and must be present if no other trained model is explicitly specified.
  5. Each trained model consists of
    1. an appropriately named subdirectory.
    2. one NLSA-reconstructed time-series of (AC,FL,HC) in squeezed_reconstructed_data.mat.
    3. for each length-scale of interest (σ),
      • the diffusion map embedding of the time-series & normalization factors in training_info_nS*_nN*_sigma*_nEigs*.mat, and
      • the expansion coefficients in c_coeff_info_nS*_nN*_sigma*_nEigs*.mat
  6. For example, for the trained model 180403d_8145, we have
    1. a subdirectory named 180403d_8145, within which are
    2. one file named squeezed_reconstructed_data.mat, and
    3. twenty-two training_info_nS1713_nN300_sigma*_nEigs100.mat & c_coeff_info_nS1713_nN300_sigma*_nEigs100.mat files, for the 22 length-scales (σ) the model has been extended to cover.
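
The layout described above can be checked before running anything. The following is a minimal sketch, not part of manifold-GA: it assumes it is run from inside the manifold-ga directory, and the variable names are illustrative. It only reports which of the expected files are present for one trained model.

```matlab
% Hypothetical check script (not shipped with manifold-GA).
% Assumption: the current directory is manifold-ga.
model_name = '180403d_8145';                       % default model named in this README
model_dir  = fullfile('trained_model', model_name);

% One NLSA-reconstructed time-series file is expected per model.
has_series = exist(fullfile(model_dir, 'squeezed_reconstructed_data.mat'), 'file') == 2;

% One training_info file and one c_coeff_info file are expected per length-scale.
training = dir(fullfile(model_dir, 'training_info_nS*_nN*_sigma*_nEigs*.mat'));
coeffs   = dir(fullfile(model_dir, 'c_coeff_info_nS*_nN*_sigma*_nEigs*.mat'));

fprintf('model directory found:           %d\n', exist(model_dir, 'dir') == 7);
fprintf('reconstructed time-series found: %d\n', has_series);
fprintf('training_info files:             %d\n', numel(training));
fprintf('c_coeff_info files:              %d\n', numel(coeffs));

if numel(training) ~= numel(coeffs)
    warning('The numbers of training_info and c_coeff_info files differ.');
end
```

For the default model, the description above suggests both file counts should be the same, one pair per length-scale.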

running manifold-GA

  1. Synapse used to insert '(*)' into some of the model filenames. This could be fixed, each time new model files were downloaded from Synapse, by running
    >> fix_synapse_filename_bug
    Synapse has since fixed this bug, so this step is no longer necessary.

  2. One prenatal visit, with a GUI:
    >> manifold_GA
  3. One prenatal visit, without a GUI:
    >> T=[16.17 3.61 18.45]; manifold_GA
    Here the Abdominal Circumference (AC), Femur Length (FL), and Head Circumference (HC), in this order, are specified in cm in the MATLAB® variable T.
  4. A different trained model can be used, with or without a GUI:
    >> system_of_interest='180403i_9565'; manifold_GA
    >> system_of_interest='180402h_1732'; T=[16.17 3.61 18.45]; manifold_GA
    In these cases, the specified trained models must be present (see above).
  5. In each of these cases, a histogram of candidate predictions (with different values of σ and numbers of eigenfunctions) is shown, and a text message will appear in the MATLAB® window.
  6. If a definitive prediction is possible, the corresponding histogram bar will be highlighted in red.
  7. Two prenatal visits of the same subject, with a GUI:
    >> manifold_GA_2_visits
  8. Two prenatal visits of the same subject, without a GUI:
    >> T=[22.24 4.34 23.06; 29.21 5.53 28.70]; dt=35; manifold_GA_2_visits
    Here AC, FL, and HC for the first visit, followed by AC, FL, and HC for the second visit, are specified in cm in the MATLAB® variable T. Note the semicolon, which makes T a 2 x 3 matrix with one row per visit. The time elapsed between the two visits is specified in days in the MATLAB® variable dt.
  9. A different trained model can be used, with or without a GUI:
    >> system_of_interest='180403i_9565'; manifold_GA_2_visits
    >> system_of_interest='180402h_1732'; T=[22.24 4.34 23.06; 29.21 5.53 28.70]; dt=35; manifold_GA_2_visits
    In these cases, the specified trained models must be present (see above).
  10. In each of these cases, a histogram of candidate predictions (with different values of σ and numbers of eigenfunctions, and interval matched to within 1 day of the best match) is shown, and a text message will appear in the MATLAB® window.
  11. The histogram bar corresponding to the prediction with the best interval match is highlighted.
  12. In both the one-visit and the two-visit cases, the predicted GA is returned in the MATLAB® variable predicted_GA.
  13. Batch processing. Prepare a csv file with at least five data columns (extra data columns are okay). The first row is taken as column headings and is ignored. In manifold_GA_batch_mode.m, enter the name of the csv file in line 9 (data = csvread ('...');), and in lines 15-19 enter the column numbers corresponding, respectively, to subject IDs, relative GA (in days), AC, FL, and HC (the last three in cm). Run:
    >> manifold_GA_batch_mode
    GA is predicted for the latest visit of each subject, whether or not the measurements from that visit are used in the analysis. Only measurements from visits within a goldilocks range of GA (by default 17-33 weeks, as determined by Eq. (2) of Papageorghiou et al. (2016)) are used in the analysis. If only one goldilocks visit is available, manifold_GA.m is called; otherwise, manifold_GA_2_visits.m is called with the first two goldilocks visits. The predicted GA (for the latest visit of each subject) is saved in the csv file manifold_predicted_ga.csv, where the first column contains the subject IDs and the second column contains the predicted GAs.
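
As an illustration of the batch input format, the sketch below writes a small csv file with one heading row and five data columns. The file name, the heading text, and the column order are assumptions made for this example, as is the choice to count relative GA from each subject's first visit; the measurements are the ones used in the examples above.

```matlab
% Hypothetical input file for manifold_GA_batch_mode.m.
% Assumed columns: subject ID, relative GA (days), AC, FL, HC (cm).
rows = [ 1   0  22.24  4.34  23.06;    % subject 1, first visit
         1  35  29.21  5.53  28.70;    % subject 1, second visit, 35 days later
         2   0  16.17  3.61  18.45 ];  % subject 2, single visit

fid = fopen('example_visits.csv', 'w');
% Heading row: taken as column headings and ignored by the batch script.
fprintf(fid, 'subject_id,relative_ga_days,ac_cm,fl_cm,hc_cm\n');
% fprintf walks the matrix column by column, so transpose to emit one visit per line.
fprintf(fid, '%d,%d,%.2f,%.2f,%.2f\n', rows');
fclose(fid);
```

With this layout, the column numbers to enter in manifold_GA_batch_mode.m would be 1 through 5, in the order listed above.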
