RNA sequence protocol for assessing Alternative Splicing

This repository contains a protocol to analyse RNA-seq data, focusing on alternative splicing & polyadenylation, authored by Oliver Ziff.
The contents are based on multiple resources including:
- RNAseq worksheet
- Biostars handbook
- rnaseq.wiki
- RNA-seqlopedia
- RNA Seq blog
- Bioconductor Course Materials
- Data Camp
- Coursera
- Bioconductor rnaseqGene RNA-seq workflow vignette (http://127.0.0.1:13884/library/rnaseqGene/doc/rnaseqGene.html is a link to a local R help server, so it only opens on a machine where the rnaseqGene package is installed)
- and, most importantly, the experience of established experts in RNA-seq analysis within the Luscombe lab, my host laboratory.
The protocol uses a combination of Unix command-line (bash) and R scripts.
FAQs: https://journals.plos.org/ploscompbiol/article/file?type=supplementary&id=info:doi/10.1371/journal.pcbi.1004393.s009
Tools: https://journals.plos.org/ploscompbiol/article/file?type=supplementary&id=info:doi/10.1371/journal.pcbi.1004393.s004

Chapters

RNA seq workflow
Wet-lab RNA sequencing phase
Accessing sequencing data
QC of sequencing files
Alignment
Visualisation in IGV browser
QC of aligned reads
Read quantification
Differential expression analysis
Splicing analysis
Gene enrichment analysis

RNA-seq Workflow

Introduction

The aim of RNA-seq is to interrogate relative transcript abundance and diversity. Its accuracy is superior to that of microarrays and similar to that of qPCR.

Steps of RNA-Seq:

Analysis goals:

transcript discovery
genome annotation
alternative expression analysis
gene fusion detection
viral detection
detect RNA editing (CRISPR/Cas9)

Wet-lab sequencing phase:

Extract & isolate RNA
Prepare library: break RNA into small fragments, enrich for non-ribosomal RNA, convert to cDNA, construct the fragment library (add sequencing adapters, PCR amplify)
Sequence the cDNA library on a high-throughput platform: generate single-end or paired-end reads of 30-300 bp in length. Key concepts: flow cell, base calling & quality scores, replicates (technical = multiple lanes in a flow cell; biological = multiple samples from each condition)
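The quality score attached to each base call is conventionally Phred-scaled: a score Q corresponds to an error probability of 10^(-Q/10). As a minimal sketch, assuming the Phred+33 encoding (Sanger / Illumina 1.8+) in the FASTQ quality line, the scores can be decoded on the command line as follows. The helper names `phred_scores` and `phred_to_error` are hypothetical and not part of any tool used in this protocol.

```shell
# Decode a FASTQ quality string into numeric Phred scores.
# Assumption: Phred+33 encoding, i.e. score = ASCII code of the character - 33.
phred_scores() {
    printf '%s\n' "$1" | awk '
        # Build a character -> ASCII code lookup for the printable range
        BEGIN { for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i }
        {
            out = ""
            for (i = 1; i <= length($0); i++) {
                q = ord[substr($0, i, 1)] - 33
                out = out (i > 1 ? " " : "") q
            }
            print out
        }'
}

# Convert a Phred score Q into the probability that the base call is wrong.
phred_to_error() {
    awk -v q="$1" 'BEGIN { printf "%.4f\n", 10 ^ (-q / 10) }'
}

phred_scores 'II5!'   # prints: 40 40 20 0
phred_to_error 20     # prints: 0.0100 (a 1 in 100 chance of a wrong call)
```

Older Illumina pipelines used a Phred+64 offset, so the offset of 33 should be checked against the data source before relying on these numbers.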

Bioinformatic phase:

https://www.biostarhandbook.com/rnaseq/rnaseq-intro.html

Process raw reads: FASTQ files (downloaded from the SRA), Phred quality scores, paired-end vs single-end sequencing, FastQC quality control, variability, spike-ins, blocking & randomisation, filtering out low-quality reads & artifacts (e.g. adapter sequence reads).
Align (map) reads to a reference genome (FASTA, GFF, GTF): annotation file (BED), alignment program (STAR, HISAT), reference genomes (GENCODE, Ensembl), generating a genome index, creating & manipulating SAM/BAM files containing sequence alignment data.
Visualise & explore alignment data in IGV and RStudio: ggplot2, bias identification with QoRTs.
Estimate read abundance (read quantification) with gene-based read counting.
Compare abundances between conditions & replicates (differential expression): normalise by adjusting each gene's read count for the total aligned reads within each sample. Summarise the data with pairwise correlation, hierarchical clustering and PCA to look for differences between samples and to identify outliers to consider excluding.
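To illustrate what adjusting each gene's read count for the total reads within a sample means, here is a minimal counts-per-million (CPM) sketch in awk. It is a simplification under stated assumptions: the count table and its numbers are made up, the per-sample total is taken as the column sum of the table rather than the aligner's total of aligned reads, and `cpm` is a hypothetical helper name. Dedicated differential expression packages apply their own, more robust normalisation, so this is not a substitute for them.

```shell
# Toy count table (made-up numbers, for illustration only):
# first column is the gene, then one column of raw read counts per sample.
counts=$(mktemp)
printf 'gene\tctrl\ttreated\n' >  "$counts"
printf 'geneA\t100\t300\n'     >> "$counts"
printf 'geneB\t300\t300\n'     >> "$counts"
printf 'geneC\t600\t1400\n'    >> "$counts"

# Scale every count to counts per million of that sample's column total.
# The file is read twice: pass 1 sums each sample column, pass 2 rescales.
cpm() {
    awk -F'\t' -v OFS='\t' '
        NR == FNR { if (FNR > 1) for (i = 2; i <= NF; i++) total[i] += $i; next }
        FNR == 1  { print; next }                      # keep the header row
        { for (i = 2; i <= NF; i++) $i = sprintf("%.1f", $i / total[i] * 1e6); print }
    ' "$1" "$1"
}

cpm "$counts"
rm -f "$counts"
```

In the toy table geneA has three times as many raw reads in the treated sample (300 vs 100), but the treated library is also twice as deep (2000 vs 1000 reads), so after scaling the difference shrinks to 150000 vs 100000 CPM.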

Requirements

On the CAMP cluster most packages are preinstalled, but to use them you need to load them with the module load function (ml):
ml STAR
ml ncbi-vdb
ml fastq-tools
ml SAMtools
ml RSeQC
ml QoRTs
ml multiqc
ml Subread
ml Java
Use module spider to search for packages.
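After loading modules it can be useful to confirm that the executables are actually on the PATH before starting a long job. The following is a minimal sketch; `check_tools` is a hypothetical helper, and the executable names in the usage line (STAR, samtools, multiqc, featureCounts from Subread, java) are the usual ones for these packages but are an assumption about how they are installed on this cluster.

```shell
# Report which of the named command-line tools can be found on PATH.
# Runs in a subshell (parentheses) so its variables do not leak into the caller.
# Exit status is 0 only if every tool was found.
check_tools() (
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
)

# Usage: executable names are assumed, adjust them to the local installation.
check_tools STAR samtools multiqc featureCounts java \
    || echo "Some tools are missing; try loading their modules with ml."
```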

Install conda and activate bioconda

Installing packages in R:
install.packages("package_name")
Bioconductor is a free software project for genomic analyses based on the R programming language. Install Bioconductor packages with:
source("https://bioconductor.org/biocLite.R")
biocLite("package_name")
biocLite("erccdashboard") # erccdashboard (for artificial spike-in quantification)
biocLite("DESeq")
Note that from Bioconductor 3.8 onwards biocLite has been replaced by BiocManager::install("package_name").

Even though packages have been installed into R locally, they need to be loaded into the working session before you can use them:
library("erccdashboard")
library("DESeq")

octup/oz-bulk-rnaseq