3CAC is a three-class classifier designed to classify contigs in mixed metagenome assemblies as phages, plasmids, chromosomes, or uncertain.
3CAC builds its initial classification from existing classifiers: viralVerify, PPR-Meta, PlasClass, and DeepVirFinder. These tools must therefore be installed before running 3CAC. Either viralVerify or PPR-Meta can be installed, as preferred; 3CAC does not require both. PlasClass and DeepVirFinder are both required.
- viralVerify (https://github.com/ablab/viralVerify)
- PPR-Meta (https://github.com/zhenchengfang/PPR-Meta)
- PlasClass (https://github.com/Shamir-Lab/PlasClass)
- DeepVirFinder (https://github.com/jessieren/DeepVirFinder)
To run 3CAC, please download the 3CAC folder. 3CAC is written in Java and requires a Java Runtime Environment.
3CAC requires the following input files:
(1) Contig file in "fasta" format: a set of contigs to be classified.
(2) Assembly graph file in "gfa" format: the assembly graph generated by metaSPAdes or metaFlye when assembling reads to generate the input contigs.
(3) Path file: a file with the path information for each contig, such as scaffolds.path in a metaSPAdes assembly or assembly_info.txt in a metaFlye assembly.
For contigs assembled from short reads by metaSPAdes, the files scaffolds.fasta, assembly_graph_with_scaffolds.gfa, and scaffolds.path can be used as input.
For contigs assembled from long reads by metaFlye, the files assembly.fasta, assembly_graph.gfa, and assembly_info.txt can be used as input.
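Before starting the pipeline, it can help to confirm that all three input files are present. The following is a minimal sketch of such a pre-flight check; the function name check_inputs is hypothetical and is not part of 3CAC.

```shell
# Hypothetical helper (not part of 3CAC): verify that the contig file,
# the assembly graph, and the path file all exist and are non-empty.
check_inputs() {
    # $1 = contig FASTA, $2 = assembly graph (GFA), $3 = path file
    for f in "$1" "$2" "$3"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            return 1
        fi
    done
    return 0
}
```

For a metaFlye assembly, this could be called as check_inputs assembly.fasta assembly_graph.gfa assembly_info.txt; a non-zero exit status means at least one file is missing or empty.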
(1) Run either viralVerify or PPR-Meta on the contig file to classify each input contig as phage, plasmid, chromosome, or uncertain.
(2) Generate the files phageContigs.fasta and plasmidContigs.fasta, which contain the contigs classified as phages and plasmids, respectively, in step (1):
java PhageAndPlasmidContigs --output output_directory --contig contig_file.fasta --PPRMeta(or --viralVerify) output_file_of_PPRMeta_or_viralVerify.csv
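As a quick sanity check after step (2), the number of sequences in each generated FASTA file can be counted by counting header lines. This is only a sketch: count_contigs is a hypothetical helper, and the path to each generated file is passed explicitly because the exact output file names may differ from run to run.

```shell
# Hypothetical helper (not part of 3CAC): print the number of sequences
# in a FASTA file by counting lines that start with '>'.
count_contigs() {
    # $1 = FASTA file
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 1
    fi
    # grep -c prints 0 but exits non-zero when nothing matches;
    # '|| true' keeps an empty file from being reported as an error.
    grep -c '^>' "$1" || true
}
```

A count of 0 for either file means that no contig received that label in step (1), which is worth knowing before running PlasClass or DeepVirFinder on it.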
(3) Run PlasClass on plasmidContigs.fasta and run DeepVirFinder on phageContigs.fasta.
Generate the classification result of 3CAC:
java Classify3CAC --assembler Flye/SPAdes --output output_directory --graph assembly_graph_file.gfa --path scaffolds.path/assembly_info.txt --PPRMeta(or --viralVerify) output_file_of_PPRMeta_or_viralVerify.csv --PlasClass output_file_of_PlasClass.probs.out --deepVirFinder output_file_of_deepVirFinder.txt
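The command template above contains two choices: the assembler (Flye or SPAdes) and the initial classifier flag (--PPRMeta or --viralVerify). The sketch below shows one way to fill in the template for a concrete run. The function build_classify_cmd is hypothetical and only prints the command; it assumes that Flye and SPAdes are the only accepted assembler values, as the template suggests.

```shell
# Hypothetical helper (not part of 3CAC): print a concrete Classify3CAC
# command line from the template, using only the flags shown above.
build_classify_cmd() {
    # $1 = assembler (Flye or SPAdes)
    # $2 = initial classifier (PPRMeta or viralVerify)
    # $3 = output directory      $4 = assembly graph (.gfa)
    # $5 = path file             $6 = PPR-Meta or viralVerify output (.csv)
    # $7 = PlasClass output      $8 = DeepVirFinder output
    case "$1" in
        Flye|SPAdes) ;;
        *) echo "assembler must be Flye or SPAdes" >&2; return 1 ;;
    esac
    case "$2" in
        PPRMeta|viralVerify) ;;
        *) echo "classifier must be PPRMeta or viralVerify" >&2; return 1 ;;
    esac
    # Note: paths are printed unquoted, so this sketch assumes they contain no spaces.
    printf '%s\n' "java Classify3CAC --assembler $1 --output $3 --graph $4 --path $5 --$2 $6 --PlasClass $7 --deepVirFinder $8"
}
```

Printing the command rather than executing it makes it easy to inspect the result before running it.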
A small test dataset can be found under the test folder.
(1) To generate the 3CAC classification result based on the viralVerify solution:
java Classify3CAC --assembler Flye --output ./test/ --graph ./test/assembly_graph.gfa --path ./test/assembly_info.txt --viralVerify ./test/assembly_viralVerify.csv --PlasClass ./test/viralVerify_plasmidContigs_plasClass.fasta.probs.out --deepVirFinder ./test/viralVerify_phageContigs_deepVirFinder.txt
(2) To generate the 3CAC classification result based on the PPR-Meta solution:
java Classify3CAC --assembler Flye --output ./test/ --graph ./test/assembly_graph.gfa --path ./test/assembly_info.txt --PPRMeta ./test/assembly_PPRMeta.csv --PlasClass ./test/PPRMeta_plasmidContigs_plasClass.fasta.probs.out --deepVirFinder ./test/PPRMeta_phageContigs_deepVirFinder.txt
If you have any questions or suggestions, please feel free to contact lianrong.pu@gmail.com.