Quick start
Ryan Wick edited this page Mar 16, 2023
In brief, the steps involved in getting a Trycycler consensus assembly are:
- Prepare the input files:
  - Save your long reads as reads.fastq (gzipped reads also work).
  - Run Trycycler subsample to create multiple read subsets:
    trycycler subsample --reads reads.fastq --out_dir read_subsets
  - Assemble each of those subsets (ideally using a few different assemblers) to produce the input assemblies for Trycycler. These should all be very similar (because they are from the same genome) but not quite identical (because they are from different read subsets). Save them as assemblies/*.fasta.
  - Optionally, manually curate your input assemblies. E.g. look at them in Bandage to see which appear nice and complete (and are thus suitable for use in Trycycler) and which are fragmented (and should be thrown out).
- Run Trycycler cluster to group similar contigs together:
trycycler cluster --assemblies assemblies/*.fasta --reads reads.fastq --out_dir trycycler
- Manually inspect the clusters to decide which are valid:
  - For this example, we'll assume cluster_001, cluster_002 and cluster_003 are the good clusters which represent replicons for which we want a consensus.
  - Delete or rename all other cluster directories (so you can glob for the good clusters with trycycler/cluster_*).
- Run Trycycler reconcile on each of the clusters:
trycycler reconcile --reads reads.fastq --cluster_dir trycycler/cluster_001
trycycler reconcile --reads reads.fastq --cluster_dir trycycler/cluster_002
trycycler reconcile --reads reads.fastq --cluster_dir trycycler/cluster_003
  - For these commands to complete, it may be necessary to delete or repair some of the cluster sequences.
  - If any clusters are not reconciling well, you can use trycycler dotplot to visualise how the sequences relate to each other, which can inform any interventions you need to take.
- Run Trycycler MSA on each of the clusters:
trycycler msa --cluster_dir trycycler/cluster_001
trycycler msa --cluster_dir trycycler/cluster_002
trycycler msa --cluster_dir trycycler/cluster_003
- Run Trycycler partition to divide up the reads:
trycycler partition --reads reads.fastq --cluster_dirs trycycler/cluster_*
- Run Trycycler consensus to make a consensus sequence for each contig cluster:
trycycler consensus --cluster_dir trycycler/cluster_001
trycycler consensus --cluster_dir trycycler/cluster_002
trycycler consensus --cluster_dir trycycler/cluster_003
- Combine all consensus sequences into a single FASTA:
cat trycycler/cluster_*/7_final_consensus.fasta > trycycler/consensus.fasta
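The cluster clean-up and the repeated per-cluster commands in the steps above can be wrapped in a couple of small shell functions. What follows is only a minimal sketch, not part of Trycycler: the function names, the `TRYCYCLER` variable and the `bad_` prefix for rejected clusters are all assumptions made for illustration. It assumes the same layout as the example (cluster directories under `trycycler/`), and it keeps going when one cluster fails so you can see which ones need manual attention.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the "bad_" prefix are illustrative, not part of Trycycler.

# Command used to invoke Trycycler (assumed variable; override it to try the
# helpers without actually running Trycycler).
TRYCYCLER="${TRYCYCLER:-trycycler}"

# Rename every cluster directory that is not listed as good, so that the glob
# <out_dir>/cluster_* afterwards matches only the good clusters.
set_aside_bad_clusters() {
    local out_dir="$1"; shift
    local dir name good keep
    for dir in "$out_dir"/cluster_*; do
        [ -d "$dir" ] || continue
        name="$(basename "$dir")"
        keep=0
        for good in "$@"; do
            [ "$name" = "$good" ] && keep=1
        done
        if [ "$keep" -eq 0 ]; then
            mv "$dir" "$out_dir/bad_$name"
        fi
    done
}

# Run one per-cluster subcommand (reconcile, msa or consensus) on every
# remaining cluster. Extra arguments are passed through before --cluster_dir.
# Failures do not stop the loop; failed clusters are listed at the end.
run_on_clusters() {
    local step="$1" out_dir="$2"; shift 2
    local dir failed=()
    for dir in "$out_dir"/cluster_*; do
        [ -d "$dir" ] || continue
        if ! "$TRYCYCLER" "$step" "$@" --cluster_dir "$dir"; then
            failed+=("$dir")
        fi
    done
    if [ "${#failed[@]}" -gt 0 ]; then
        printf '%s failed for: %s\n' "$step" "${failed[*]}" >&2
        return 1
    fi
}
```

With these defined, the example above might become `set_aside_bad_clusters trycycler cluster_001 cluster_002 cluster_003`, then `run_on_clusters reconcile trycycler --reads reads.fastq`, then `run_on_clusters msa trycycler`, and (after the partition step) `run_on_clusters consensus trycycler`. Reconcile in particular often needs manual fixes, so treat a failure there as a prompt to inspect that cluster rather than something to retry blindly.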
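Because the final `cat` uses a glob, a cluster that never reached the consensus step is silently left out of the combined file. One way to guard against that is sketched below. `combine_consensus` is a hypothetical helper, not a Trycycler command; it assumes the same directory layout and the `7_final_consensus.fasta` file name used in the example above.

```shell
#!/usr/bin/env bash
# Sketch only: combine_consensus is a hypothetical helper, not a Trycycler command.

# Concatenate the final consensus of every cluster into <out_dir>/consensus.fasta,
# but stop with an error if any cluster directory lacks a non-empty
# 7_final_consensus.fasta.
combine_consensus() {
    local out_dir="$1"
    local dir fasta inputs=()
    for dir in "$out_dir"/cluster_*; do
        [ -d "$dir" ] || continue
        fasta="$dir/7_final_consensus.fasta"
        if [ ! -s "$fasta" ]; then
            echo "missing or empty: $fasta" >&2
            return 1
        fi
        inputs+=("$fasta")
    done
    if [ "${#inputs[@]}" -eq 0 ]; then
        echo "no cluster directories found in $out_dir" >&2
        return 1
    fi
    cat "${inputs[@]}" > "$out_dir/consensus.fasta"
    # Report how many FASTA records ended up in the combined file.
    echo "$(grep -c '^>' "$out_dir/consensus.fasta") sequences written to $out_dir/consensus.fasta"
}
```

Calling `combine_consensus trycycler` should then either write `trycycler/consensus.fasta` and print the number of sequences it contains, or name the first cluster that still needs a consensus.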
For more information, please look at the wiki pages for each of the steps involved.
If you're new to Trycycler, I'd recommend trying it out on the Demo datasets to get some practice.