Profile long reads
Since version 3.0.3, mOTUs can profile long reads.
Given a long-read FASTA file (say, long_reads.fasta), first run:
motus prep_long -i long_reads.fasta -o converted_long_reads.fasta.gz
This splits the long reads into shorter reads, which can then be profiled with a standard mOTUs profile call:
motus profile -s converted_long_reads.fasta.gz
Note that prep_long accepts both FASTA and FASTQ files, including gzip-compressed (.gz) ones.
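To make the idea of splitting long reads into shorter ones concrete, here is a minimal sketch that cuts each FASTA sequence into fixed-size pieces. It is an illustration only and not the prep_long implementation: the function name split_fasta, the fragment length passed as an argument, and the "_partN" naming are all assumptions made for this example, and it expects one sequence line per record.

```shell
# Illustration only, NOT how motus prep_long is implemented.
# split_fasta LEN: read FASTA on stdin (one sequence line per record) and
# write consecutive fragments of at most LEN bases, named <read>_part<N>.
split_fasta() {
    awk -v len="$1" '
        /^>/ { name = substr($1, 2); next }   # remember the read name
        {
            n = 0
            for (i = 1; i <= length($0); i += len) {
                n++
                printf(">%s_part%d\n%s\n", name, n, substr($0, i, len))
            }
        }'
}

# A 10-base read cut into pieces of 4 bases gives fragments of 4, 4 and 2 bases.
printf '>read1\nACGTACGTAC\n' | split_fasta 4
```

For real data, use prep_long itself; this sketch only shows the kind of transformation involved.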
The latest version of mOTUs (3.0.3) cannot be installed via conda yet (we are working on it!). It can be installed via pip, provided the general mOTUs dependencies are already installed:
python -m pip install motu-profiler
Note that you have to install the following dependencies manually:
You can try mOTUs on a mock community. To download the long reads, use the following command:
wget https://sunagawalab.ethz.ch/share/MOTUS_DATA/motus_3.0.3/motus_long_reads/HiFi-ATCC-MSA-1003.250k.fastq.gz
Note: This dataset is a subsample of the larger dataset SRR9328980, reduced to 10% of the original number of reads.
First, prepare the long reads for mOTUs with:
motus prep_long -i HiFi-ATCC-MSA-1003.250k.fastq.gz -o HiFi-ATCC-MSA-1003.250k.short.fastq -no_gz
gzip HiFi-ATCC-MSA-1003.250k.short.fastq # or "pigz -p 32 HiFi-ATCC-MSA-1003.250k.short.fastq" if pigz is installed
# Or download the prepared result produced by the command:
wget https://sunagawalab.ethz.ch/share/MOTUS_DATA/motus_3.0.3/motus_long_reads/HiFi-ATCC-MSA-1003.250k.short.fastq.gz
Note: We compress the file manually because of performance issues with the Python gzip module.
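After compressing by hand, it can be useful to confirm that the file decompresses cleanly and to see how many short reads were produced. The sketch below counts reads in a gzipped FASTQ file. The helper name count_fastq_reads is hypothetical, and the count assumes the usual four lines per FASTQ record (no wrapped sequences).

```shell
# count_fastq_reads FILE: number of reads in a gzipped FASTQ file.
# Assumes four lines per record, so reads = lines / 4.
count_fastq_reads() {
    gzip -dc < "$1" | awk 'END { print NR / 4 }'
}

# Small self-contained demonstration on a temporary file with two reads.
tmp=$(mktemp)
printf '@r1\nACGT\n+\nIIII\n@r2\nGG\n+\nII\n' | gzip -c > "$tmp"
count_fastq_reads "$tmp"    # prints 2
rm -f "$tmp"
```

On the tutorial data this would be called as count_fastq_reads HiFi-ATCC-MSA-1003.250k.short.fastq.gz, and comparing it with the count for the original file shows how many short reads each long read yielded on average.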
Run mOTUs with:
# We use -A to be consistent with the report shown below. -A doesn't change the profile, just the report type
# -t defines the number of threads
motus profile -A -s HiFi-ATCC-MSA-1003.250k.short.fastq.gz -o HiFi-ATCC-MSA-1003.motus -t 32
# Or download the prepared result produced by the command:
wget https://sunagawalab.ethz.ch/share/MOTUS_DATA/motus_3.0.3/motus_long_reads/HiFi-ATCC-MSA-1003.motus
Explore the result:
# Get abundances for genus level
grep "g__" HiFi-ATCC-MSA-1003.motus | grep -v "s__"
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae|g__Pseudomonas 0.0256625687
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Moraxellaceae|g__Acinetobacter 0.0041047739
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacterales|f__Enterobacteriaceae|g__Escherichia 0.1690642823
k__Bacteria|p__Proteobacteria|c__Alphaproteobacteria|o__Rhodobacterales|f__Rhodobacteraceae|g__Rhodobacter 0.2692651120
k__Bacteria|p__Proteobacteria|c__Betaproteobacteria|o__Neisseriales|f__Neisseriaceae|g__Neisseria 0.0016740789
k__Bacteria|p__Proteobacteria|c__Epsilonproteobacteria|o__Campylobacterales|f__Helicobacteraceae|g__Helicobacter 0.0019215690
k__Bacteria|p__Firmicutes|c__Clostridia|o__Clostridiales|f__Clostridiaceae|g__Clostridium 0.0065334779
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Bacillaceae|g__Bacillus 0.0208843590
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Staphylococcaceae|g__Staphylococcus 0.0725363474
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Streptococcaceae|g__Streptococcus 0.1647341426
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Lactobacillaceae|g__Lactobacillus 0.0009095220
k__Bacteria|p__Deinococcus-Thermus|c__Deinococci|o__Deinococcales|f__Deinococcaceae|g__Deinococcus 0.0013104470
k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Propionibacteriales|f__Propionibacteriaceae|g__Cutibacterium 0.0035726154
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Porphyromonadaceae|g__Porphyromonas 0.1891746967
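The genus-level rows are easier to read as a short table sorted by abundance. The sketch below is one way to do that. The function name genus_table is hypothetical, and it assumes the report layout shown above: ranks joined by "|" and the relative abundance as the last whitespace-separated field.

```shell
# genus_table: read a motus -A report on stdin and print
# "genus<TAB>abundance" for genus-level rows, most abundant first.
genus_table() {
    grep 'g__' | grep -v 's__' |
        awk '{
            n = split($0, tax, "[|]")            # ranks are separated by "|"
            name = tax[n]                        # last rank plus abundance
            sub(/[ \t]+[^ \t]+$/, "", name)      # drop the trailing abundance
            printf("%s\t%s\n", substr(name, 4), $NF)   # strip the "g__" prefix
        }' |
        LC_ALL=C sort -t "$(printf '\t')" -k2,2nr
}

# Demonstration on three genus rows and one species row from the report above.
genus_table <<'EOF'
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae|g__Pseudomonas 0.0256625687
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacterales|f__Enterobacteriaceae|g__Escherichia 0.1690642823
k__Bacteria|p__Proteobacteria|c__Alphaproteobacteria|o__Rhodobacterales|f__Rhodobacteraceae|g__Rhodobacter 0.2692651120
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacterales|f__Enterobacteriaceae|g__Escherichia|s__Escherichia coli [ref_mOTU_v3_00095] 0.1690642823
EOF
```

The demonstration lists Rhodobacter, Escherichia and Pseudomonas in that order. On the full report, the call would be genus_table < HiFi-ATCC-MSA-1003.motus.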
# Get abundances at the mOTU/species level
grep "s__" HiFi-ATCC-MSA-1003.motus
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacterales|f__Enterobacteriaceae|g__Escherichia|s__Escherichia coli [ref_mOTU_v3_00095] 0.1690642823
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae|g__Pseudomonas|s__Pseudomonas aeruginosa [ref_mOTU_v3_00201] 0.0256625687
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Moraxellaceae|g__Acinetobacter|s__Acinetobacter baumannii [ref_mOTU_v3_00259] 0.0041047739
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Bacillaceae|g__Bacillus|s__Bacillus sp. [ref_mOTU_v3_00329] 0.0208843590
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Staphylococcaceae|g__Staphylococcus|s__Staphylococcus aureus [ref_mOTU_v3_00340] 0.0053912499
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Staphylococcaceae|g__Staphylococcus|s__Staphylococcus epidermidis [ref_mOTU_v3_00346] 0.0671450975
k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Propionibacteriales|f__Propionibacteriaceae|g__Cutibacterium|s__Cutibacterium acnes [ref_mOTU_v3_00800] 0.0035726154
k__Bacteria|p__Proteobacteria|c__Epsilonproteobacteria|o__Campylobacterales|f__Helicobacteraceae|g__Helicobacter|s__Helicobacter pylori [ref_mOTU_v3_00897] 0.0019215690
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Porphyromonadaceae|g__Porphyromonas|s__Porphyromonas gingivalis [ref_mOTU_v3_00985] 0.1891746967
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Lactobacillaceae|g__Lactobacillus|s__Lactobacillus gasseri [ref_mOTU_v3_01039] 0.0009095220
k__Bacteria|p__Proteobacteria|c__Alphaproteobacteria|o__Rhodobacterales|f__Rhodobacteraceae|g__Rhodobacter|s__Rhodobacter sphaeroides/johrii [ref_mOTU_v3_01513] 0.2692651120
k__Bacteria|p__Proteobacteria|c__Betaproteobacteria|o__Neisseriales|f__Neisseriaceae|g__Neisseria|s__Neisseria meningitidis [ref_mOTU_v3_01539] 0.0016740789
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Streptococcaceae|g__Streptococcus|s__Streptococcus mutans [ref_mOTU_v3_01605] 0.1567639053
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Streptococcaceae|g__Streptococcus|s__Streptococcus agalactiae [ref_mOTU_v3_01860] 0.0079702373
k__Bacteria|p__Deinococcus-Thermus|c__Deinococci|o__Deinococcales|f__Deinococcaceae|g__Deinococcus|s__Deinococcus radiodurans [ref_mOTU_v3_02207] 0.0013104470
k__Bacteria|p__Firmicutes|c__Clostridia|o__Clostridiales|f__Clostridiaceae|g__Clostridium|s__Clostridium beijerinckii [ref_mOTU_v3_03007] 0.0065334779
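In this report, each genus-level abundance equals the sum of its species-level rows. For example, Staphylococcus aureus (0.0053912499) plus Staphylococcus epidermidis (0.0671450975) gives the 0.0725363474 reported for g__Staphylococcus. The sketch below recomputes these per-genus sums from the species rows. The function name species_sum_by_genus is hypothetical, and it again assumes that the abundance is the last whitespace-separated field.

```shell
# species_sum_by_genus: read a motus -A report on stdin and print, for every
# genus, the sum of the abundances of its species-level rows.
species_sum_by_genus() {
    awk '
        /s__/ {
            n = split($0, tax, "[|]")
            for (i = 1; i <= n; i++)
                if (tax[i] ~ /^g__/)
                    total[substr(tax[i], 4)] += $NF   # key is the genus name
        }
        END { for (g in total) printf("%s\t%.10f\n", g, total[g]) }' |
        LC_ALL=C sort
}

# Demonstration on four species rows taken from the report above.
species_sum_by_genus <<'EOF'
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Staphylococcaceae|g__Staphylococcus|s__Staphylococcus aureus [ref_mOTU_v3_00340] 0.0053912499
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Staphylococcaceae|g__Staphylococcus|s__Staphylococcus epidermidis [ref_mOTU_v3_00346] 0.0671450975
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Streptococcaceae|g__Streptococcus|s__Streptococcus mutans [ref_mOTU_v3_01605] 0.1567639053
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Streptococcaceae|g__Streptococcus|s__Streptococcus agalactiae [ref_mOTU_v3_01860] 0.0079702373
EOF
```

The two sums printed, 0.0725363474 for Staphylococcus and 0.1647341426 for Streptococcus, match the genus-level rows of the report.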