JAMIRA: a reproducible and scalable workflow for prokaryotic genomic data analysis, designed for the genus Enterococcus.
JAMIRA is a bioinformatics workflow for the integrative exploration of genomic features of bacteria, including:
- Virulence factors (ABRICATE);
- Resistome profile (RGI);
- Plasmid prediction (ABRICATE);
- Prophage prediction (PhiSpy);
- Genomic islands prediction (IslandPath-DIMOB).
To run JAMIRA you need to install Conda (prerequisites).
The JAMIRA workflow is intended to be executed in a Conda environment to ensure reproducibility and modularity across the different genomic tools used in this pipeline. Each tool therefore has its own isolated Conda environment, which encapsulates all the software dependencies it needs to run.
Python version 3.7 is recommended.
Note: this tutorial was written using the Linux operating system. We believe that the same steps can be reproduced on macOS.
After the Conda installation is complete, add the necessary files from this GitHub repository to your conda folder. Follow these steps:
- Add the bioconda channel with the following commands:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
- Create a conda environment for JAMIRA with the following command:
conda env create -f envs/config.yaml -n jamira
- Activate your JAMIRA environment:
conda activate jamira
- Install Snakemake:
mamba install -c conda-forge -c bioconda snakemake
- Install rename, if you do not have it yet:
sudo apt install rename
Congratulations! The JAMIRA workflow is ready to be used!
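To double-check the installation, it can help to confirm that the executables named in the steps above are actually reachable from the active environment. The following is a minimal sketch, not part of JAMIRA itself; the helper name `missing_tools` and the `REQUIRED_TOOLS` list are illustrative assumptions.

```python
import shutil

# Executables named in the installation steps above (assumed list; adjust as needed).
REQUIRED_TOOLS = ["conda", "snakemake", "rename"]


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

Run the script inside the activated `jamira` environment so that the lookup reflects what Snakemake will see.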
The complete JAMIRA workflow can be executed with a single command:
snakemake --use-conda
- Additionally, the user can run the complete workflow specifying the number of cores (e.g., 4 cores):
snakemake -j 4 --use-conda
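If the workflow is launched from a script rather than typed by hand, the same two invocations can be assembled programmatically. This is a minimal sketch that only uses the flags shown above (`-j` and `--use-conda`); the function names `build_snakemake_command` and `run_workflow` are hypothetical helpers, not part of JAMIRA or Snakemake.

```python
import subprocess


def build_snakemake_command(cores=None):
    """Build the argument list for the Snakemake calls shown above."""
    cmd = ["snakemake"]
    if cores is not None:
        if cores < 1:
            raise ValueError("cores must be a positive integer")
        cmd += ["-j", str(cores)]
    cmd.append("--use-conda")
    return cmd


def run_workflow(cores=None, workdir="."):
    """Run the workflow in `workdir`; raises CalledProcessError on failure."""
    return subprocess.run(build_snakemake_command(cores), cwd=workdir, check=True)
```

For example, `run_workflow(cores=4)` called from the repository root would be equivalent to the 4-core command above.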
After completing the workflow execution, the pipeline provides an option to generate a summary web report in HTML format. The interactive report can be generated with the following command call:
snakemake -n --report myresults.html
You can create a Directed Acyclic Graph (DAG) representation to visualize all the steps executed by the JAMIRA workflow.
- Display your DAG representation:
snakemake -j 4 --use-conda -n --dag | dot -Tsvg | display
- Save your DAG representation in SVG format:
snakemake -j 4 --use-conda -n --dag | dot -Tsvg > dag.svg
JAMIRA incorporates a collection of modules for specific data analysis tasks commonly applied in comparative genomic studies: (i) virulence gene identification; (ii) antimicrobial resistance gene identification; (iii) plasmid sequence prediction; (iv) genomic island prediction; and (v) prophage prediction.
The genomic island prediction module searches for large segments of exogenous DNA inserted into bacterial genomes, known as genomic islands (GIs), which are frequently associated with microbial adaptations of medical, agricultural, or environmental importance.
- In this workflow we use IslandPath-DIMOB:
IslandPath-DIMOB is a standalone software to predict genomic islands in bacterial and archaeal genomes based on the presence of dinucleotide biases and mobility genes.
Bertelli and Brinkman, 2018
Hsiao et al., 2005
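To give an intuition for what a "dinucleotide bias" means, the sketch below compares the dinucleotide composition of a sequence window with that of the whole genome. It is a simplified illustration only and is not the IslandPath-DIMOB algorithm; the function names and the mean-absolute-difference score are assumptions chosen for clarity.

```python
from collections import Counter
from itertools import product

# The 16 possible dinucleotides over the DNA alphabet.
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def dinucleotide_frequencies(sequence):
    """Relative frequency of each dinucleotide, counted with overlap."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Pairs containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[d] for d in DINUCLEOTIDES)
    if total == 0:
        return {d: 0.0 for d in DINUCLEOTIDES}
    return {d: counts[d] / total for d in DINUCLEOTIDES}


def dinucleotide_bias(window, genome):
    """Mean absolute difference between window and genome frequencies."""
    w = dinucleotide_frequencies(window)
    g = dinucleotide_frequencies(genome)
    return sum(abs(w[d] - g[d]) for d in DINUCLEOTIDES) / len(DINUCLEOTIDES)
```

A window whose composition matches the genome scores 0, while a window of atypical composition, as might be expected for horizontally acquired DNA, scores higher.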
The prophage prediction module searches for prophages, mobile elements responsible for carrying and disseminating virulence factors and antimicrobial resistance genes between bacteria.
- In this workflow we use PhiSpy to identify the most likely prophage regions in bacterial genomes.
The antimicrobial resistance identification module enables the prediction of complete resistome profiles from genomic data.
- In this workflow we use RGI to predict resistomes based on homology and SNP models.
The plasmid prediction module searches for well-known replicon sequences to detect related plasmids that are often associated with antimicrobial resistance in clinically relevant bacterial pathogens.
- In this workflow we use ABRICATE to perform a BLAST search against a curated database of plasmid replicon sequences, the PlasmidFinder database (Carattoli et al., 2014).
Seemann, 2018 Carattoli et al., 2014
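ABRICATE writes its hits as a tab-separated table with a header row that includes the `GENE`, `%COVERAGE` and `%IDENTITY` columns. The sketch below shows one way such a table could be filtered after the run. It is an illustration under assumptions: the function name `filter_hits`, the thresholds, and the sample rows (including the gene names) are invented for the example, and where JAMIRA stores these tables is not stated here.

```python
import csv
import io


def filter_hits(tsv_text, min_identity=90.0, min_coverage=80.0):
    """Return GENE names of hits meeting both thresholds (illustrative values)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        identity = float(row["%IDENTITY"])
        coverage = float(row["%COVERAGE"])
        if identity >= min_identity and coverage >= min_coverage:
            kept.append(row["GENE"])
    return kept


# Made-up table with a subset of ABRICATE's columns, for demonstration only.
SAMPLE = (
    "#FILE\tSEQUENCE\tGENE\t%COVERAGE\t%IDENTITY\n"
    "sample.fasta\tcontig_1\trepA_example\t100.00\t99.50\n"
    "sample.fasta\tcontig_2\trepB_example\t45.00\t85.00\n"
)

if __name__ == "__main__":
    print(filter_hits(SAMPLE))
```

The same approach applies to any ABRICATE report, since only the database searched changes between runs.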
The virulence gene identification module searches for known virulence factors of bacterial pathogens.
- In this workflow we use ABRICATE to perform a BLAST search against a curated database of virulence factors related to bacterial pathogens, the VFDB (Chen et al., 2016).
Seemann, 2018 Chen et al., 2016
- Ícaro Castro - Workflow development - Github
- Rafaella Bueno - Web server development - Github
- Robson Ruiz - Web server development - Github
- Learn more about our projects: Enteromar Group