From 81bc12be514e29c38e7c67e98e55e63fa292f751 Mon Sep 17 00:00:00 2001
From: Alessio Milanese
Date: Tue, 27 Sep 2022 10:15:50 +0200
Subject: [PATCH] improve organisation of README

---
 README.md | 51 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 28 insertions(+), 23 deletions(-)

diff --git a/README.md b/README.md
index 800ebcf..fbb9137 100644
--- a/README.md
+++ b/README.md
@@ -1,42 +1,42 @@
-Download genomes used for mOTUs 3
+Download mOTUs 3 genomes
 ========
 
-This tool downloads the 700,000 genomes used for the mOTUs 3 database.
+This tool allows to download all or any of the 700,000 genomes used for the mOTUs 3 database. The user can download a specific genome (type), all genomes associated with a specific mOTU, or the complete database.
+
+## Installation
+To run the script, clone this repository
 
-First, clone this repository
 ```
 git clone https://github.com/motu-tool/motus_v3_genomes
 cd motus_v3_genomes
 ```
 
+## Downloading genomes
+
+The data will be downloaded into the same folder where the script is located, under a folder with the name `motus_v3_genomes`.
+
+The structure is as follows:
+- Within `motus_v3_genomes` there is a folder for each mOTU
+- Within each mOTU folder are the genomes associated with this mOTU. In addition, there is a file `1.list_files.txt`. This file lists the paths to each of the downloaded genome files.
+- If you run `motus_genomes_download -m all`, an additional file `1.list_all_files.txt` will be created within `motus_v3_genomes`.
+
+Note that all files are first downloaded to `motus_v3_genomes/temp_dir` and moved to the final destination once the download and unzip are complete.
+
+The script automatically checks the md5 sum of each of the downloaded files.
+
+
 To download one genome type (for example the MAG `LIAN20-1_SAMN11649416_METAG_000035`):
+
 ```
 python motus_genomes_download -m LIAN20-1_SAMN11649416_METAG_000035
 ```
 
-To download all genomes from one motu (for example the ref mOTU `ref_mOTU_v3_00006`):
+To download all genomes from one mOTU (for example the ref mOTU `ref_mOTU_v3_00006`):
 ```
 python motus_genomes_download -m ref_mOTU_v3_00006
 ```
 
-To download all genomes:
-```
-python motus_genomes_download -m all
-```
-
-
-
-The script automatically checks the md5 sum of the downloaded file.
-
-The data will be downloaded in the same folder where the script is located, under a folder with the name `motus_v3_genomes`.
-The structure is as follows:
-- Within `motus_v3_genomes` there is a folder per mOTU
-- Within each mOTU folder there is a file `1.list_files.txt` with the path to the files, and the files of the genomes are within the mOTU directory
-- If you run `motus_genomes_download -m all` an additional file `1.list_all_files.txt` will be created within `motus_v3_genomes`.
-
-Note that all files are first downloaded to `motus_v3_genomes/temp_dir` and moved to the final destination once the download and unzip is complete.
-
-If you run the two commands listed above to download a genome and mOTU, you will end up with the following structure:
+If you run the two commands above to download a genome and mOTU, you will end up with the following structure:
 ```
 .
 |-- README.md
@@ -54,7 +54,12 @@ If you run the two commands listed above to download a genome and mOTU, you will
 `-- temp_dir
 ```
 
+To download all genomes:
+```
+python motus_genomes_download -m all
+```
+
-Finally, you can list all motus and the number of associated genomes (available for download) with:
+Finally, you can list all mOTUs and the number of associated genomes (available for download) with:
 ```
 python motus_genomes_download -l
 ```