Laboratory for Zero Knowledge Discovery, The University of Chicago |
---|
Email: ishanu@uchicago.edu , Web: https://zed.uchicago.edu |
Run the INSTALL.sh script, which does the following:
- make sure output directory is writable
chmod +x ./bin/zcor
- A Linux environment is required to run the script.
- The user needs to specify:
-data: the path to the .dat file with patient data, formatted per the instructions in Part 2.
-outfile: the path of the output .csv file in which the script stores its predictions.
Example Command:
./bin/zcor -data INPUT.dat -outfile OUTPUT.csv
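For scripted runs, the command above can be wrapped in a short Python helper. This is only a minimal sketch: `build_zcor_command` and `run_zcor` are hypothetical names and not part of the distribution. Only the `-data` and `-outfile` flags and the `./bin/zcor` path come from the usage shown above. The pre-flight checks (input file exists, output directory is writable) are assumptions about what is useful to verify before launching.

```python
import os
import subprocess


def build_zcor_command(data_path, out_path, binary="./bin/zcor"):
    """Return the argument list for the documented zcor invocation."""
    return [binary, "-data", data_path, "-outfile", out_path]


def run_zcor(data_path, out_path, binary="./bin/zcor"):
    """Check basic preconditions, then run zcor (hypothetical wrapper)."""
    # The input .dat file must exist before the tool is launched.
    if not os.path.isfile(data_path):
        raise FileNotFoundError(f"input file not found: {data_path}")
    # The directory that will hold the output .csv must be writable.
    out_dir = os.path.dirname(os.path.abspath(out_path))
    if not os.access(out_dir, os.W_OK):
        raise PermissionError(f"output directory is not writable: {out_dir}")
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(build_zcor_command(data_path, out_path, binary), check=True)
```

Calling `run_zcor("INPUT.dat", "OUTPUT.csv")` would then be equivalent to the example command, with the two checks performed first.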
- The input data is a text file (.dat extension).
- Each line represents a unique patient.
- For each patient, the user needs to provide:
  - patient_id (any string, as long as it is unique)
  - gender (M or F)
  - birth date
  - timestamped ICD codes from the patient history
- All dates must be in "YYYY-MM-DD" format, for example 2005-01-29.
Record format:
PATIENT_ID,GENDER,BIRTH_DATE,DATE,CODE,CODE,CODE,CODE,DATE,CODE,CODE
Example of a patient record:
P05011029,M,1978-02-22,2008-08-11,611.8,788.89,2008-08-04,611.72,2008-11-19,739.1,Z83.6,2010-01-19,282.11,2011-04-14,R05
This means that male patient P05011029 was born on 1978-02-22 and received the diagnostic codes 611.8 and 788.89 on 2008-08-11, and so on.
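To check a .dat file before running the tool, a record can be parsed as sketched below. This illustrates the format described above and is not the parser zcor itself uses. `parse_record` is a hypothetical helper, and it assumes that any field matching YYYY-MM-DD after the birth date starts a new visit, with every following field up to the next date being a code for that visit.

```python
import re
from datetime import date

# A field of the form YYYY-MM-DD is treated as a visit date (assumption).
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def parse_record(line):
    """Split one .dat line into (patient_id, gender, birth_date, visits).

    visits is a list of (date, [codes]) pairs in the order they appear.
    """
    fields = [f.strip() for f in line.strip().split(",")]
    if len(fields) < 3:
        raise ValueError("expected at least PATIENT_ID,GENDER,BIRTH_DATE")
    patient_id, gender, birth = fields[:3]
    if gender not in ("M", "F"):
        raise ValueError(f"gender must be M or F, got {gender!r}")
    birth_date = date.fromisoformat(birth)  # raises ValueError on a bad date

    visits = []
    for token in fields[3:]:
        if DATE_RE.match(token):
            visits.append((date.fromisoformat(token), []))
        elif not visits:
            raise ValueError(f"code {token!r} appears before any date")
        else:
            visits[-1][1].append(token)
    return patient_id, gender, birth_date, visits


record = ("P05011029,M,1978-02-22,2008-08-11,611.8,788.89,2008-08-04,611.72,"
          "2008-11-19,739.1,Z83.6,2010-01-19,282.11,2011-04-14,R05")
patient_id, gender, birth_date, visits = parse_record(record)
```

Visits are kept in file order rather than sorted, since the example record itself lists 2008-08-11 before 2008-08-04.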
- Patients without a target ICD code must have at least 4 years of data (interval between the earliest and the latest code >= 208 weeks).
- Patients with a target ICD code must have at least 3 years of data (interval between the earliest ICD code and the earliest target code >= 156 weeks).
The script applies the eligibility criteria itself and notifies the user of any patients who do not have enough data for proper risk inference.
Example output:
WARNING: 1 out of 5 patients were excluded due to insufficient inference window
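To anticipate which patients will be excluded, the two eligibility rules can be sketched as follows. This is an illustration of the stated thresholds, not the script's actual implementation. `is_eligible` and `target_codes` are hypothetical names. The set of target ICD codes is not listed here, so the caller must supply it, and codes are assumed to match as exact strings.

```python
from datetime import date, timedelta

MIN_WEEKS_WITHOUT_TARGET = 208  # about 4 years, earliest to latest code
MIN_WEEKS_WITH_TARGET = 156     # about 3 years, earliest code to earliest target code


def is_eligible(visits, target_codes):
    """Apply the documented inference-window rules to one patient.

    visits: list of (date, [codes]) pairs; target_codes: set of code strings
    supplied by the caller (hypothetical, not defined in this document).
    """
    if not visits:
        return False
    dates = [d for d, _ in visits]
    earliest = min(dates)
    # Dates on which at least one target code was recorded.
    target_dates = [d for d, codes in visits
                    if any(c in target_codes for c in codes)]
    if target_dates:
        window = min(target_dates) - earliest
        return window >= timedelta(weeks=MIN_WEEKS_WITH_TARGET)
    window = max(dates) - earliest
    return window >= timedelta(weeks=MIN_WEEKS_WITHOUT_TARGET)
```

For the example record above, the codes span 2008-08-04 to 2011-04-14, which is under 156 weeks, so this sketch would mark that patient as ineligible under either rule.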