This pipeline performs de novo genome assembly using HiCanu and integrates with nf-core/pairgenomealign for downstream analysis.
Requirements:
- Nextflow
- Slurm environment
- Miniforge
- nf-core
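Before launching anything, it can help to confirm that the required tools are on PATH. The sketch below is one way to do that. The helper name missing_tools is hypothetical, and the command names in the example (nextflow, sbatch for Slurm, conda for Miniforge) are assumptions about a typical installation, not something this repository prescribes.

```shell
# Print the name of every given command that cannot be found on PATH.
# Prints nothing when all of them are available.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Example (assumed command names, adjust to your cluster):
#   missing_tools nextflow sbatch conda
```

Any name it prints needs to be installed or loaded (for example through your cluster's module system) before running the pipeline.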
Clone the repository:
git clone https://github.com/connor122721/obtusa
cd obtusa
To run the de novo assembly pipeline, use the following command:
nextflow run main.nf -profile slurm
This command executes the main.nf script using the Slurm profile defined in nextflow.config. The pipeline will:
- Read HiFi reads from the specified FASTQ file.
- Perform genome assembly using HiCanu.
- Output the assembled genome to the obtusa_hifi/genome directory.
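Once the run finishes, a quick sanity check is to count the contigs in the assembly. This is a minimal sketch, assuming the assembly is a standard FASTA file in which every sequence header starts with ">". count_contigs is a hypothetical helper, not part of the pipeline.

```shell
# Count the sequences in a FASTA file by counting header lines.
# $1: path to the FASTA file.
count_contigs() {
  # grep -c prints the number of lines that start with '>'.
  grep -c '^>' "$1"
}

# Example, using the contigs path that this README passes to --target:
#   count_contigs obtusa_hifi/genome/obtusa_hifi/obtusa.contigs.fasta
```

A missing file or a count of zero suggests the assembly step did not complete.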
To download genomes using the download_NCBI module, execute the modules/download.nf script:
nextflow run modules/download.nf -profile slurm
This downloads the genomes named in params.species_list (set in nextflow.config) and writes them to the obtusa_hifi/ncbi directory.
After downloading the necessary genomes, you can run the nf-core/pairgenomealign pipeline for comparative genomics analysis.
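nf-core/pairgenomealign takes a CSV samplesheet as input. Create a file named samplesheet.csv with a header row and one row per genome. The FASTA files may use either the .fasta or the .fna extension:
sample,fasta
sample1,/path/to/reference.fasta
sample2,/path/to/reference.fna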
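Writing the samplesheet by hand gets tedious with many genomes. The sketch below generates one from a directory of FASTA files. It assumes the genomes sit uncompressed in a single directory (such as the obtusa_hifi/ncbi output of the download step), end in .fasta or .fna, and that the file name without its extension is an acceptable sample name. make_samplesheet is a hypothetical helper, not part of this repository or of nf-core/pairgenomealign.

```shell
# Write a "sample,fasta" CSV to stdout for every .fasta/.fna file in a directory.
# $1: directory containing the genome FASTA files.
make_samplesheet() {
  local dir f name
  dir="$(cd "$1" && pwd)" || return 1   # resolve to an absolute path
  printf 'sample,fasta\n'
  for f in "$dir"/*.fasta "$dir"/*.fna; do
    [ -e "$f" ] || continue             # skip a glob pattern that matched nothing
    name="$(basename "$f")"
    name="${name%.*}"                   # drop the final extension
    printf '%s,%s\n' "$name" "$f"
  done
}

# Example:
#   make_samplesheet obtusa_hifi/ncbi > samplesheet.csv
```

Inspect the resulting file before use. Sample names are taken verbatim from the file names, so you may want to shorten them.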
Run the nf-core/pairgenomealign pipeline using the following command:
nextflow run nf-core/pairgenomealign \
-r 1.0.0 \
-profile slurm,apptainer \
--input samplesheet.csv \
--target obtusa_hifi/genome/obtusa_hifi/obtusa.contigs.fasta \
--outdir obtusa_hifi/ \
-c nextflow.config
After scaffolding the assembly with LongStitch, rerun the alignment against the scaffolds:
nextflow run nf-core/pairgenomealign \
-r 1.0.0 \
-profile slurm,apptainer \
--input samplesheet.csv \
--target obtusa_hifi/longstitch/obtusa_draft.k32.w100.tigmint-ntLink.longstitch-scaffolds.fa \
--outdir obtusa_hifi_scaffold/ \
-c nextflow.config
The nextflow.config file defines the pipeline parameters (for example, params.species_list) and the slurm profile. Edit it to match your data and cluster.