Computational Health Laboratory project for the academic year 2021/2022.
Starting from one or more genes, extract from interaction databases the genes they interact with. Using the expanded gene set, perform pathway analysis and obtain all disease pathways in which the genes appear. Merge the pathways to obtain a larger graph. Perform further network analysis to extract central biomarkers and communities beyond pathways. Compute a distance between the initial gene set and the various pathways (diseases).
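The last two steps (finding central genes and measuring how far the initial gene set is from a pathway) can be sketched on a toy graph. This is only an illustration under stated assumptions, not the project's implementation: it uses the Python standard library rather than a graph library, the gene labels `G1`..`G5` are made up, and degree centrality and mean shortest-path length are assumed stand-ins for whichever centrality and distance measures the notebooks actually use.

```python
from collections import deque


def build_graph(interactions):
    """Build an undirected adjacency map from (gene_a, gene_b) pairs."""
    graph = {}
    for a, b in interactions:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def bfs_distances(graph, source):
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def mean_distance(graph, gene_set, pathway):
    """Mean shortest-path length over all reachable (gene, pathway gene) pairs.

    Assumed definition for illustration; unreachable pairs are skipped.
    """
    lengths = []
    for gene in gene_set:
        if gene not in graph:
            continue
        dist = bfs_distances(graph, gene)
        lengths.extend(dist[p] for p in pathway if p in dist)
    return sum(lengths) / len(lengths) if lengths else float("inf")


# Toy interaction list with hypothetical gene labels.
toy_interactions = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G2", "G5")]
graph = build_graph(toy_interactions)

centrality = degree_centrality(graph)
most_central = max(centrality, key=centrality.get)
toy_distance = mean_distance(graph, {"G1"}, {"G4", "G5"})
```

Here `most_central` is the candidate biomarker of the toy graph, and `toy_distance` is the average number of interaction steps separating the starting gene from the genes of a hypothetical pathway.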
📂ComputationalHealthLaboratory
├── 💼0_Pathway_Enrichment.ipynb # Pathway gene dataset expansion and pathway enrichment
├── 💼1_Network_Analysis.ipynb # Network building and analysis
├── 💼2_Community_Analysis.ipynb # Community detection and analysis
├── 💼3_Plots.ipynb # Methods to plot the protein, disease and community graphs
├── 💼4_Project_CHL.ipynb # Entire project, the previous four notebooks combined
├── 📄config_example.yml # Replace this with your customized configuration file
├── 📄config.py # Method to retrieve data from BioGRID
├── 📂datasets # Datasets used by the project
│ ├── 🗃️BIOGRID.tab3.txt # The starting gene interactions used for our analysis
│ ├── 🗃️BIOGRID_updated.tab3.txt # The updated starting gene interactions
│ ├── 🗃️biomarkers.csv # Central nodes
│ ├── 🗃️communities.csv # Communities of the protein-to-protein graph
│ ├── 🗃️communities_metrics.csv
│ ├── 🗃️community_gene_metrics.csv
│ ├── 🗃️diseases_pathways.csv # Disease pathways retrieved from DisGeNET
│ ├── 🗃️diseases_scores.csv # Disease pathways with their metrics
│ ├── 🗃️genes.csv # Expanded gene dataset
│ ├── 🗃️geneset.csv # Starting gene interactions, retrieved from BioGRID
│ ├── 🗃️interactions.csv # Expanded gene interactions dataset
│ ├── 🗃️mean_distances.csv
│ └── 🗃️protein_graph.gpickle # Protein-to-protein graph
├── 📂presentation # Project final presentation
│ ├── 📄DallaNoceRistoriZuppolini_presentation.pdf
│ └── 📄DallaNoceRistoriZuppolini_presentation.pptx
├── 📄README.md
├── 📂report # Project report files
│ ├── 📄DallaNoceRistoriZuppolini_report.pdf # Project report
│ └── 📄... # Other LaTeX files for the report
├── 📄requirements.txt
└── 📂src # Project methods
  ├── 📄communities.py
  ├── 📄disease.py
  ├── 📄plot_graphs.py
  ├── 📄protein_to_protein_graph.py
  └── 📄utilities.py
First, clone the repository:
git clone https://github.com/nikodallanoce/ComputationalHealthLaboratory
Then install all the required packages:
pip install -r requirements.txt
You can then work with the notebooks and our package. For a deeper understanding of our work, use 4_Project_CHL.ipynb to run the entire project. We strongly advise replacing the protein name with one of your choice, although you can also keep the one we worked with.