RaptorQuant

A pipeline for downstream analysis of bulk RNA-seq data. It is designed primarily for users without extensive IT skills and requires only a few basic edits on the user’s part.

Based on your preferences, the pipeline can:

- Analyze the quality of your data with FastQC and MultiQC.
- Trim your data with Trim Galore!
- Quantify the transcripts present in your data with Salmon.
- Combine the Salmon results for every sample into one table and search it for transcripts of interest.

How to do it

First, clone the repository with git:

```
git clone https://github.com/KowalskiBio/RaptorQuant
```

1) Get your data

If you want to download the data: Add ENA URLs to the prepared file ./configs/links.txt (it already contains two examples). Make sure every link points to a .gz file. From the RaptorQuant directory, run the following commands:
```
chmod +x Download.sh
./Download.sh
```
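Before starting the download, it can help to confirm that every entry in links.txt points to a gzip-compressed file. The following is a minimal sketch and not part of the repository: the function name `check_links` is hypothetical, and it assumes one URL per line, with blank lines and lines starting with `#` ignored. Neither assumption is confirmed by the pipeline itself.

```shell
# Hypothetical helper (not part of RaptorQuant): report entries in a links
# file that do not end in .gz. Assumes one URL per line.
check_links() {
    local file="$1" bad=0 line
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip empty lines and comment lines (an assumed convention).
        case "$line" in
            ''|'#'*) continue ;;
        esac
        # Accept links ending in .gz; report everything else on stderr.
        case "$line" in
            *.gz) ;;
            *) printf 'Not a .gz link: %s\n' "$line" >&2; bad=1 ;;
        esac
    done < "$file"
    # Returns 0 when every link ends in .gz, 1 otherwise.
    return "$bad"
}
```

A possible use is `check_links ./configs/links.txt && ./Download.sh`, so the download only starts when all links look correct.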
If you provide your own data: Your files must be in .fastq.gz or .fq.gz format for downstream analysis. If trimming is required, place your data in ./data/raw_fastq. If no trimming is required, place your data in ./data/trimmed_fastq; when trimming is skipped, the pipeline reads its input from that directory.
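To check your own files against the naming requirement above, something like the following sketch could be used. The function name `find_bad_fastq` is hypothetical and the check only looks at file names, not at file contents.

```shell
# Hypothetical helper (not part of RaptorQuant): print every regular file in
# a directory whose name does not end in .fastq.gz or .fq.gz.
# Hidden files (names starting with a dot) are not matched by the glob.
find_bad_fastq() {
    local dir="$1" bad=0 f
    for f in "$dir"/*; do
        # Ignore subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        case "$f" in
            *.fastq.gz|*.fq.gz) ;;
            *) printf '%s\n' "$f"; bad=1 ;;
        esac
    done
    # Returns 0 when all files match, 1 when at least one does not.
    return "$bad"
}
```

For example, `find_bad_fastq ./data/raw_fastq` prints nothing when every file has an accepted extension.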

2) Run the pipeline

To run the pipeline, enter the following commands, assuming you are in the RaptorQuant directory:

```
chmod +x ./scripts/*.sh
chmod +x Run.sh
./Run.sh
```

Options in the workflow

You will be asked several questions during the process. The workflow changes depending on your answers. Below are the defined options:

- Quality analysis with FastQC/MultiQC: Check the quality of your data. Results are stored in the ./output directory. FastQC reports on each file individually, while MultiQC combines those reports into one. If trimming is done, a second round of quality checks is run so that the trimmed data is also assessed.
- Trimming: Your raw data can be trimmed with Trim Galore! The pipeline does this automatically if you choose this option. The trimmed data is then used for all further analysis.
- Salmon: You will be asked to choose between the Ensembl and RefSeq annotations, as the two differ. The index for both annotations is built with k=23. Information about the annotations used is in ./data/salmon/About.txt. Auxiliary information about each Salmon run is included in the individual result files.
- Editing results: Salmon outputs results for each sample individually. To analyze your results and search for the transcript(s) of your choice, enter the exact transcript ID. Note:
  - If you selected Ensembl as the annotation, enter the Ensembl transcript ID.
  - If you selected RefSeq, enter the RefSeq transcript ID.

Additional information about each annotation can be found here:
- https://www.ncbi.nlm.nih.gov/books/NBK50679/ for RefSeq
- https://www.ensembl.org/info/genome/genebuild/index.html for Ensembl.
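Because the search relies on exact transcript IDs, it is easy to enter an ID from the wrong annotation. The sketch below guesses the annotation from the ID prefix: Ensembl transcript IDs start with `ENS`, an optional species code, and `T` followed by digits, while RefSeq transcript accessions start with `NM_`, `NR_`, `XM_`, or `XR_`. The function name `id_type` is hypothetical and not part of the pipeline. Whether the IDs in the index carry a version suffix (such as `.8`) depends on the annotation files, so check ./data/salmon/About.txt or an existing result file.

```shell
# Hypothetical helper (not part of RaptorQuant): guess which annotation a
# transcript ID belongs to from its prefix.
# Prints "ensembl", "refseq", or "unknown".
id_type() {
    case "$1" in
        # ENST..., ENSMUST..., etc.: "ENS", optional species code, "T", digits.
        ENS*T[0-9]*) echo ensembl ;;
        # RefSeq transcript accessions (curated and predicted mRNA/ncRNA).
        NM_*|NR_*|XM_*|XR_*) echo refseq ;;
        *) echo unknown ;;
    esac
}
```

For example, `id_type NM_000059.4` prints `refseq`, and an Ensembl gene ID such as `ENSG00000139618` prints `unknown` because it is not a transcript ID.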

Results

Generally, everything produced by this pipeline (except trimmed data) can be found in the ./output directory. Results from Salmon specifically are located in ./output/salmon_results. The results for each sample are stored individually. However, if the user chooses to analyze the data, which is the final step of the pipeline, and provides a transcript ID, two additional files are created during the workflow:

combined_results.tsv – A merged results file containing data from every sample processed in the run. The file is organized as a table with several columns, similar to the standard quant.sf file from Salmon. These columns include the name of the sample, transcript IDs, TPM (transcripts per million), and NumReads (number of reads).
filtered_results.tsv – A filtered results file displaying only the IDs provided by the user for clarity. This file has the same columns as combined_results.tsv.

An example of both files can be found in the ./output/salmon_results directory.
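To illustrate how tables like these relate to Salmon's per-sample output, here is a minimal sketch using awk. It is not the pipeline's own code. It assumes a layout of `<results_dir>/<sample>/quant.sf` and a column order of sample, transcript ID, TPM, NumReads, both of which may differ from the actual files; the function names `combine_quant` and `filter_ids` are hypothetical. The only fixed fact it relies on is the standard quant.sf column order (Name, Length, EffectiveLength, TPM, NumReads).

```shell
# Hypothetical sketch: merge per-sample quant.sf files into one table.
# Assumes <results_dir>/<sample>/quant.sf, which may not match the pipeline.
combine_quant() {
    local results_dir="$1" q sample
    printf 'Sample\tName\tTPM\tNumReads\n'
    for q in "$results_dir"/*/quant.sf; do
        [ -f "$q" ] || continue
        # Use the name of the containing directory as the sample name.
        sample=$(basename "$(dirname "$q")")
        # quant.sf columns: Name, Length, EffectiveLength, TPM, NumReads.
        # Skip the header row and keep Name, TPM, and NumReads.
        awk -F'\t' -v OFS='\t' -v s="$sample" 'NR > 1 { print s, $1, $4, $5 }' "$q"
    done
}

# Keep the header plus the rows whose transcript ID (column 2) exactly
# matches one of the IDs passed after the table path.
filter_ids() {
    local table="$1"
    shift
    awk -F'\t' -v ids="$*" '
        BEGIN { n = split(ids, a, " "); for (i = 1; i <= n; i++) want[a[i]] = 1 }
        NR == 1 || ($2 in want)
    ' "$table"
}
```

With these definitions, `combine_quant ./output/salmon_results > combined.tsv` followed by `filter_ids combined.tsv <transcript ID>` would produce tables of the same general shape as the two files described above. The match is exact, so an ID with a version suffix will not match the same ID without one.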

Happy quanting!
