WORKFLOW: LSARP Genomics
Rauf Salamzade edited this page Dec 22, 2020 · 1 revision
The LSARP ResistanceDB Genomics workflow provides a range of genomic processing steps and basic analyses. Its parameters can be configured for different pathogen species.
A Google Sheet describes the final results of the workflow, which are found in the LSARP_Results/ subdirectory for each sample: https://docs.google.com/spreadsheets/d/15wZwNq5UKMRTBj7sm6UsUt-KA9y-3PFQk_jiTBdr6QI/
Parameter Identifier | Parameter Value Type / Default | Parameter Description |
---|---|---|
run_adaptertrim | Boolean. True | Whether to run adapter trimming with TrimGalore |
trimgalore_options | String. | Options for TrimGalore for adapter trimming of FASTQs. |
run_qualitytrim | Boolean. False | Whether to run quality trimming with Trimmomatic. |
trimmomatic_options | String. LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 | Options for running Trimmomatic for quality trimming of FASTQs. |
run_store_input | Boolean. True | Whether to store processed FASTQ files after quality-based and adapter trimming. |
run_centrifuge | Boolean. True | Whether to run Centrifuge. |
centrifuge_index | String. | Path to the Centrifuge database index. |
run_mlst_ariba | Boolean. False | Whether to run ARIBA for MLST analysis. |
mlst_ariba_db | String. | Path to the ARIBA MLST reference database. |
amr_ariba_card_db | String. | Path to the ARIBA CARD reference database. |
other_ariba_db_paths | String. | Path(s) to other ARIBA reference database(s). Multiple paths should be separated by spaces. |
other_ariba_db_names | String. | Name(s) of other ARIBA reference database(s). Multiple names should be separated by spaces and listed in the same order as the paths in other_ariba_db_paths. |
run_straingst | Boolean. True | Whether to run StrainGST analysis to find the closest reference strain within the sample's genus. |
straingst_db | String. | Path to the StrainGST *.hdf5 reference database of k-mer profiles for representative strains. |
run_pilon | Boolean. False | Whether to run Pilon variant calling against a reference. |
reference_fasta | String. | Path to the reference FASTA. Requires bwa index to have been run on the reference in the same directory. |
run_subsample_for_assembly | Boolean. False | Whether to run read subsampling for assembly. |
read_subsampling | Integer. 1000000 | The number of reads to subsample for assembly. Should correspond to around 100X coverage. |
run_assembly | Boolean. True | Whether to construct assembly. |
unicycler_flag | Boolean. True | Whether to use the Unicycler wrapper for Illumina-only assembly (True) or to run the SPAdes assembler directly (False). |
spades_read_length | Integer. 150 | The length of FASTQ reads to inform the SPAdes assembler. Only used if SPAdes assembler is used directly. |
assembly_threads | Integer. 4 | The number of cores/threads to provide for Illumina assembly. |
assembly_memory | Integer. 16 | The memory (in GB per core/thread) to provide for Illumina assembly. |
assembly_timelimit | String. 48:00:00 | The time limit for running Illumina assembly. |
gaemr_formatter_options | String. -g 1 -c 100 -r | Options for running GAEMR formatting/preparation for QC analysis. |
gaemr_qc_options | String. --force --analyze_rna | Options for running GAEMR assembly QC analysis. |
run_cleanup | Boolean. True | Whether to delete intermediate files. |
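
Some constraints in the table can be checked before a run starts: the pairing of other_ariba_db_names with other_ariba_db_paths, the bwa index that reference_fasta requires, and the read count behind read_subsampling. The Python sketch below illustrates these checks. It is not part of the workflow: the helper names, the example database names and paths, and the genome size are hypothetical, and the list of index suffixes assumes the standard output files of `bwa index`.

```python
import math
from pathlib import Path
from typing import Dict, List

# Files that `bwa index` writes next to the reference FASTA (assumed standard set).
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def pair_ariba_databases(db_names: str, db_paths: str) -> Dict[str, str]:
    """Pair space-separated other_ariba_db_names with other_ariba_db_paths, in order."""
    names = db_names.split()
    paths = db_paths.split()
    if len(names) != len(paths):
        raise ValueError(
            f"{len(names)} database name(s) given for {len(paths)} path(s)"
        )
    return dict(zip(names, paths))


def missing_bwa_index_files(reference_fasta: str) -> List[str]:
    """Return the expected bwa index files that are absent for reference_fasta."""
    expected = [reference_fasta + suffix for suffix in BWA_INDEX_SUFFIXES]
    return [path for path in expected if not Path(path).exists()]


def reads_for_coverage(
    genome_size_bp: int, read_length_bp: int, target_coverage: int = 100
) -> int:
    """Number of reads so that reads * read length is about coverage * genome size."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    # Hypothetical database names and paths, for illustration only.
    print(pair_ariba_databases("dbA dbB", "/path/to/dbA /path/to/dbB"))
    # Hypothetical 1.5 Mbp genome with 150 bp reads at 100X coverage.
    print(reads_for_coverage(1_500_000, 150))
```

Under these assumptions, the default read_subsampling value of 1000000 with 150 bp reads gives about 100X coverage for a genome of roughly 1.5 Mbp when each read is counted individually. The table does not say whether the value counts single reads or read pairs, so the figure should be adjusted for the species being processed.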