Contains the analyses for the hERG classification paper.
A colab notebook is at
Latest notebook is at (as of Feb 1st, 2022)
feature_sets/ # contains feature data sets # for conditional probabilities # for ML strategies
python -run # generates plots for all data
python -nodisp -run prob # generates data from probability classifier, but without images (faster)
python -bootstrap -nodisp -run prob # bootstrapping
- The performance of the probability classifier (F1 score, etc.) depends strongly on the cutoff used (defined in probUtil). Review the prod.png file to make sure a good value is selected.
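To illustrate why the cutoff matters, here is a minimal sketch that recomputes the F1 score at several cutoffs. The function names and the toy data are hypothetical and are not taken from probUtil; the sketch assumes a sample is called positive when its probability is at or above the cutoff.

```python
def f1_at_cutoff(probs, labels, cutoff):
    """F1 score when samples with probability >= cutoff are called positive."""
    preds = [p >= cutoff for p in probs]
    tp = sum(1 for pred, y in zip(preds, labels) if pred and y == 1)
    fp = sum(1 for pred, y in zip(preds, labels) if pred and y == 0)
    fn = sum(1 for pred, y in zip(preds, labels) if not pred and y == 1)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives at all
    return 2 * tp / denom if denom else 0.0


def sweep_cutoffs(probs, labels, cutoffs):
    """Return (cutoff, F1) pairs so the dependence on the cutoff can be inspected."""
    return [(c, f1_at_cutoff(probs, labels, c)) for c in cutoffs]


# Toy data, invented for illustration only (not project results).
probs = [0.10, 0.35, 0.40, 0.65, 0.80, 0.90]
labels = [0, 0, 1, 1, 0, 1]

for cutoff, f1 in sweep_cutoffs(probs, labels, [0.3, 0.5, 0.7]):
    print(f"cutoff={cutoff:.1f}  F1={f1:.2f}")
```

Even on six toy samples the F1 score moves from 0.75 to 0.40 as the cutoff goes from 0.3 to 0.7, which is why the cutoff should be checked before trusting the reported scores.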
MD analysis is done with either cpptraj or Tcl scripts; the Tcl scripts depend on VMD and its associated packages.
All MD analyses for this project are run on the local GPU cluster (faust) because of the amount of data.
To use cpptraj:
Two files are needed:
- input file: loads all the trajectories and specifies the calculation to carry out. Below is an example input file used to calculate the dynamic cross-correlations.
trajin 3atp-1.dcd 1 -1 50
rms 3atp-1.pdb
matrix correl @CA out 3atp-3mg.dat byres
Note: Output is written to 3atp-3mg.dat
- shell script: this is used to execute the input file. Below is an example.
cpptraj -p 3atp.prmtop -i
Note: Make sure cpptraj is installed (a GPU-enabled build of cpptraj speeds up the calculations).
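The two files above can also be driven from Python. The following is a minimal sketch, not the project's own script: the helper names and the input file name `dccm.in` are hypothetical, while the input commands and the `-p`/`-i` flags are the ones from the example above.

```python
import shutil
import subprocess
from pathlib import Path

# Input commands copied verbatim from the example input file above.
CPPTRAJ_INPUT = """\
trajin 3atp-1.dcd 1 -1 50
rms 3atp-1.pdb
matrix correl @CA out 3atp-3mg.dat byres
"""


def cpptraj_command(topology, input_file):
    """Build the cpptraj call: -p gives the topology, -i the input file."""
    return ["cpptraj", "-p", str(topology), "-i", str(input_file)]


def run_cpptraj(topology, input_file, input_text):
    """Write the input file, then run cpptraj on it; raises if cpptraj fails."""
    if shutil.which("cpptraj") is None:
        raise RuntimeError("cpptraj not found on PATH")
    Path(input_file).write_text(input_text)
    subprocess.run(cpptraj_command(topology, input_file), check=True)


# Example call ("dccm.in" is a hypothetical file name):
# run_cpptraj("3atp.prmtop", "dccm.in", CPPTRAJ_INPUT)
```

Keeping the command construction in its own function makes it easy to check the exact call before submitting it on the cluster.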
To use Tcl:
Two files are needed:
- Tcl script with all the information necessary to load the trajectories, parameter and topology files, and the variables to calculate
- bash script to execute the Tcl file created above
Note: VMD has to be installed first to run the analysis with Tcl scripts (use a CUDA-enabled version for speed).
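As a rough sketch of the execution step, the command for running a Tcl script in VMD without a graphics window can be built as below. The helper name and the script name `analysis.tcl` are hypothetical; `-dispdev text` and `-e` are standard VMD command-line options.

```python
import subprocess


def vmd_command(tcl_script, vmd_exe="vmd"):
    """Build a VMD call that runs a Tcl script in text mode (no display)."""
    # -dispdev text: disable the graphics window; -e: execute the given script
    return [vmd_exe, "-dispdev", "text", "-e", str(tcl_script)]


def run_vmd(tcl_script):
    """Run the Tcl script in VMD; raises CalledProcessError on a non-zero exit."""
    subprocess.run(vmd_command(tcl_script), check=True)


# Example call ("analysis.tcl" is a hypothetical file name):
# run_vmd("analysis.tcl")
```

The Tcl script should end with `quit` or `exit`; otherwise VMD stays at its prompt after the script finishes and the job does not return.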