This repository contains utility scripts for comparative genomics analysis.
- Obtaining the core and pangenome from genome assemblies with
-The pipeline uses external dependencies:
Prokka: used for local gene prediction and annotation.
get_homologues: used to identify and compare orthologs, producing the core-genome and pangenome matrices.
Make sure that these programs are installed and are in your system's executable search path. To test, in a terminal type:
prokka -h -h
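As a complement to the manual check above, the following minimal sketch reports which programs are reachable through the executable search path. The function name `check_deps` is hypothetical, and the executable names you pass to it (for example `prokka` and `get_homologues.pl`) are assumptions that may differ on your installation.

```shell
# check_deps: hypothetical helper, not part of this repository.
# Prints whether each program given as an argument is on the PATH and
# returns a non-zero status if at least one of them is missing.
check_deps() {
    missing=0
    for prog in "$@"; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "found: $prog"
        else
            echo "MISSING: $prog" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_deps prokka get_homologues.pl` prints one line per program, with missing programs reported on standard error.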
Download Comparative_genomics
git clone
Make all scripts executable:
cd Comparative_genomics/scripts
chmod +x *.*
Obtaining core and pangenome matrix:
usage: ./ extension_file num_cpus path_to_comparative_genomics_scripts
This produces a folder containing all core-genome genes from the analyzed genomes.
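To illustrate what the annotation stage of such a pipeline might look like, here is a minimal sketch that prints one Prokka command per assembly. It is not the script shipped in this repository: the function name `annotate_cmds`, the `_prokka` output directory suffix, and the assumption that file names contain no spaces are all hypothetical. Only the Prokka options `--outdir`, `--prefix` and `--cpus` are used.

```shell
# annotate_cmds: hypothetical helper, not part of this repository.
# Prints one Prokka command for every file in the current directory that
# ends in the given extension. Nothing is executed; pipe the output to
# "sh" to actually run the commands.
annotate_cmds() {
    ext=$1
    cpus=$2
    for fasta in *."$ext"; do
        [ -f "$fasta" ] || continue      # skip when no file matches
        name=${fasta%."$ext"}            # file name without extension
        echo "prokka --outdir ${name}_prokka --prefix $name --cpus $cpus $fasta"
    done
}
```

Running `annotate_cmds fna 4` in a directory of `.fna` assemblies lists the commands so they can be reviewed before execution.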
- Average Amino Acid Identity (AAI) matrix using single-copy orthologous genes.
-External dependencies:
enveomics/aai.rb: used for pairwise AAI comparison.
-To obtain single-copy orthologs, you can use the following bash script:
usage: ./ tmp number_of_genomes_used_in_get_homologues output_dir_for_single_copy_genes
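The idea behind this step can be sketched as a filter on cluster size: a cluster with exactly as many sequences as there are genomes is a candidate single-copy ortholog. The sketch below is an assumption about the approach, not the repository's script. The function name `select_by_count` is hypothetical, it assumes one FASTA file per cluster, and it checks only the number of sequences, not that each genome contributes exactly one.

```shell
# select_by_count: hypothetical helper, not part of this repository.
# Copies every FASTA file from in_dir to out_dir when its number of
# sequences (lines starting with ">") equals n_genomes.
select_by_count() {
    in_dir=$1
    n_genomes=$2
    out_dir=$3
    mkdir -p "$out_dir"
    for f in "$in_dir"/*; do
        [ -f "$f" ] || continue
        # grep -c exits non-zero when the count is 0, so ignore its status.
        n=$(grep -c '^>' "$f" || true)
        if [ "$n" -eq "$n_genomes" ]; then
            cp "$f" "$out_dir"/
        fi
    done
}
```

For example, `select_by_count clusters 10 single_copy` would keep the clusters that contain 10 sequences, where `clusters` and `single_copy` are placeholder directory names.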
-To obtain the single-copy ortholog genes for each genome, use:
usage: /home/avera/bin/Comparative_genomics/scripts/ list
-AAI comparison:
usage: ./ core_gen_extension_file cpu path_to_enveomics_scripts
-Matrix generation:
usage: ./ list
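As an illustration of what matrix generation involves, the sketch below turns a list of pairwise values into a symmetric comma-separated matrix with a header row, which is the layout that `read.table(..., header=TRUE, sep=",")` expects. The input format (three tab-separated columns: genome A, genome B, AAI), the function name `pairs_to_matrix`, the `genome` corner label, and the value 100 on the diagonal are all assumptions made for this example, not the behavior of the repository's script.

```shell
# pairs_to_matrix: hypothetical helper, not part of this repository.
# Reads "genomeA<TAB>genomeB<TAB>AAI" lines on standard input and prints
# a symmetric comma-separated matrix. Pairs that were never compared are
# printed as NA; the diagonal is set to 100 (an assumption).
pairs_to_matrix() {
    awk -F'\t' '
        {
            # Record each genome name in order of first appearance.
            if (!($1 in seen)) { seen[$1] = 1; names[++n] = $1 }
            if (!($2 in seen)) { seen[$2] = 1; names[++n] = $2 }
            # Treat AAI as symmetric and store both directions.
            aai[$1, $2] = $3
            aai[$2, $1] = $3
        }
        END {
            # Header row: corner label followed by the genome names.
            printf "genome"
            for (j = 1; j <= n; j++) printf ",%s", names[j]
            printf "\n"
            for (i = 1; i <= n; i++) {
                printf "%s", names[i]
                for (j = 1; j <= n; j++) {
                    if (i == j) v = 100
                    else if ((names[i], names[j]) in aai) v = aai[names[i], names[j]]
                    else v = "NA"
                    printf ",%s", v
                }
                printf "\n"
            }
        }'
}
```

A call such as `pairs_to_matrix < pairs.tsv > matrix_aai_core.csv` would write the matrix to a file, where `pairs.tsv` is a placeholder name for the pairwise results.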
For a heatmap, the following simple R code can be used:
library(gplots)  # provides heatmap.2
# Read the AAI matrix; the first column holds the genome names.
aai <- read.table(file="matrix_aai_core.csv", header=TRUE, sep=",")
mat_dat <- data.matrix(aai[,2:ncol(aai)])
rownames(mat_dat) <- aai[,1]
# density.info="none" (assumed here) hides the density plot in the color key.
heatmap.2(mat_dat, cellnote=round(mat_dat,2), main="AAI core genome.", notecol="black", density.info="none", trace="none", dendrogram="row", margins=c(25,30), lhei=c(1,5))