

About this pipeline

This is a collection of shell, Perl, and R scripts that compute differential expression tables, starting from miRNA FASTQ files from multiple samples.
The file paths need to be adapted to the download location, and the paths in all scripts need to be checked and adjusted before continuing.


Before running the scripts

  1. Install the software listed in TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MODULES and make sure the executables are on your PATH.

  2. The pipeline needs a miRNA gene expression table (computed by this pipeline) and mRNA DE results (obtained externally, listed below).

  3. The following R libraries need to be installed (see PACKAGES): biomaRt, ggplot2, DESeq2, multiMiR.

  4. VERY IMPORTANT: Please verify and change ALL input and output file and directory paths as necessary inside the scripts.
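
A quick way to confirm step 1 before starting is to check that each executable can be found on PATH. The following is a minimal sketch, not part of the repository; the function name check_tools is hypothetical, and the tool names in the example call are assumptions. Take the real list from the MODULES file.

```shell
# Report which of the given executables cannot be found on PATH.
# Prints one "missing: NAME" line per absent tool and returns
# non-zero if at least one tool is missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (tool names are assumptions; use the list in MODULES):
# check_tools fastqc trim_galore mapper.pl miRDeep2.pl quantifier.pl Rscript
```

If the function prints nothing and returns 0, every listed tool was found.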


Data files needed for analysis
The compressed files need to be uncompressed before use.
The raw input fastq files are not included in the repository.
The script list3.sh is used to link the raw files to be used in the analysis.

  1. .project : Has some environment variables needed inside scripts
  2. ALL_FILE_LIST : contains the list of all sample files
  3. getRGs.txt : contains the read groups for all samples (generated by getRGs.sh)
  4. gencode.vM14.primary_assembly.annotation.gtf : reference gtf file (uncompress if this file is compressed)
  5. TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MODULES : file containing the modules to be loaded before running the scripts
  6. TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/PACKAGES : contains the names of R packages to be loaded
  7. TRIM/MIRDEEP_MAPPER/mapping_file3.txt : Mapping file for mapping sample names in mirdeep2 result file to actual sample names
  8. TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/Samples.txt : File containing sample information
    The following data files are listed by name only and are not included in the repository
  9. TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/quantify_all_mature_novel/MATURE/miRNAs_expressed_all_samples_20180122084334.csv : the final miRNA expression file generated by mirDeep2 program
  10. mRNA_Results_from_Jim/significant_Fat_DMEM.csv : mRNA DE results for Fat vs DMEM (result from external analysis)
  11. mRNA_Results_from_Jim/significant_Fat_Lean.csv : mRNA DE results for Fat vs Lean (result from external analysis)
  12. mRNA_Results_from_Jim/significant_Lean_DMEM.csv : mRNA DE results for Lean vs DMEM (result from external analysis)
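
Before running anything, it can help to confirm that the files listed above are in place and uncompressed. The sketch below is one possible check, not part of the repository. The function name check_data_files is hypothetical, and it assumes that compressed files carry a .gz suffix, which the list above does not state.

```shell
# Check that each listed path exists under the base directory.
# Prints "compressed: ..." when only a .gz copy is present
# (assumption: gzip compression) and "missing: ..." when neither exists.
# Returns non-zero if any file is compressed or missing.
check_data_files() {
    local base="$1" path status=0
    shift
    for path in "$@"; do
        if [ -e "$base/$path" ]; then
            continue
        elif [ -e "$base/$path.gz" ]; then
            echo "compressed: $path.gz (run gunzip first)"
            status=1
        else
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}

# Example call from the base directory:
# check_data_files . .project ALL_FILE_LIST getRGs.txt \
#     gencode.vM14.primary_assembly.annotation.gtf
```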

The following script files are to be run serially, in the order listed.
They set up the files, run quality checks, trim reads, identify miRNAs, compute miRNA profiles, and finally run the DESeq2 analysis.
The file paths are relative to the base directory. Navigate to the corresponding directory to run each script.

  1. raw_data/file_links/list3.sh : Links raw data files to systematic filenames
  2. getRGs.sh : To assign read-groups for the samples, useful for SAM files
  3. mRNA_Results_from_Jim/pipeline.sh : Runs the mRNA DE analysis
  4. QC/QC_FASTQC/runFastQC.sh : Runs FastQC quality control
  5. TRIM/runTrimGalore.sh : Script used to trim the reads using TrimGalore
  6. TRIM/MIRDEEP_MAPPER/create_mirdeep_config.sh : Creates a config file for mirdeep run
  7. TRIM/MIRDEEP_MAPPER/runMirDeepMapper.sh : Creates the mapping file required by miRDeep2
  8. TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/runMirDeep2.sh : Main script used to detect miRNAs
  9. TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/quantify_all_mature_novel/quantify_all_mature_novel.sh : Quantifies the miRNAs and writes the abundance tables
  10. TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/quantify_all_mature_novel/quantify_all_mature_novel.pl : Perl script used by the script above
  11. TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/FUNCTIONS.R : Has the functions necessary for DESeq2
  12. TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/Travis_microRNA_analysis.R : This is the main DESeq2 analysis file (to be run in an R terminal)
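
The shell steps above can be chained with a small driver that changes into each script's directory, runs it, and stops at the first failure. This is only a sketch under assumptions: the function name run_steps is hypothetical, and it assumes each script runs without arguments once its internal paths have been adjusted. The R analysis is left out because it is meant to be run in an R terminal.

```shell
# Run each script from its own directory, in the order given,
# and stop at the first one that exits with a non-zero status.
run_steps() {
    local base="$1" step dir
    shift
    for step in "$@"; do
        dir=$(dirname "$base/$step")
        echo "running: $step"
        ( cd "$dir" && bash "./$(basename "$step")" ) || {
            echo "failed: $step" >&2
            return 1
        }
    done
}

# Example call from the base directory (first steps only):
# run_steps "$PWD" \
#     raw_data/file_links/list3.sh \
#     getRGs.sh \
#     QC/QC_FASTQC/runFastQC.sh \
#     TRIM/runTrimGalore.sh
```

Running each step in a subshell keeps the caller's working directory unchanged, so the relative paths in the list stay valid from one step to the next.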

The following result files are generated by the pipeline. They are listed by name only and are not included in the repository.


(p<0.05)
TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MiRNA_FF_DM.0.05.csv
TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MiRNA_FF_LF.0.05.csv
TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MiRNA_LF_DM.0.05.csv

(p<0.1)
TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MiRNA_FF_DM.0.1.csv
TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MiRNA_FF_LF.0.1.csv
TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MiRNA_LF_DM.0.1.csv

(ALL)
TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MiRNA_FF_DM.1.csv
TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MiRNA_FF_LF.1.csv
TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/MiRNA_LF_DM.1.csv

(LOG)
TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT/02-01-2018.16-44-24.log
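
The result tables follow one naming pattern: MiRNA_, the comparison label, the cutoff (0.05, 0.1, or 1 for all results), then .csv. The sketch below enumerates the nine expected names so that a finished run can be checked for completeness. The function name list_result_files is hypothetical and not part of the repository.

```shell
# Print the nine expected result tables: one per comparison and cutoff.
list_result_files() {
    local dir="TRIM/MIRDEEP_MAPPER/MIRDEEP2_run3/DESEQINT" contrast cutoff
    for cutoff in 0.05 0.1 1; do
        for contrast in FF_DM FF_LF LF_DM; do
            echo "$dir/MiRNA_${contrast}.${cutoff}.csv"
        done
    done
}

# Example: report any table that has not been generated yet
# (run from the base directory).
# list_result_files | while read -r f; do
#     [ -f "$f" ] || echo "not yet generated: $f"
# done
```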



WV-INBRE-Bioinformatics-miRNA_differential_expression

miRNA analysis using miRDeep2 and DESeq2 in R
