gginteract (Pairend reads interaction visualization with customizable ggplot object) is an in-house tool for visualizing interactions supported by paired-end reads. In next-generation high-throughput RNA-seq methods (such as MARGI and MARIO), these paired-end reads can indirectly suggest an interaction between two molecules. The package is specifically designed to present the interaction between two molecules in two styles. The plots show the mapping details of Read1 and Read2, together with the annotation of the target genes, in a condensed way.
The whole figure is assembled from building blocks generated by the gginteract package:
- ideograms
- gene annotation tracks
- parallel pairend skeleton
- pairend horizontal skeleton
- text and marks
Existing solutions (Gviz, ggbio) for combining multi-facet genome annotation (e.g., ideograms or transcript annotation) usually use a track-stacking strategy, in which each object occupies one track and the vertically stacked tracks share the same horizontal coordinates. However, these tools do not offer a flexible way to manipulate the details of individual tracks. For example, it can be difficult to:
- Draw ideograms at arbitrary positions while controlling their width and height.
- Add other customizable information to an existing track that shares the same coordinates (e.g., a zoom-in view, or a list of gene names for a gene-dense region highlighted on an ideogram).
gginteract builds every object on top of ggplot2, so that every building block of the plot is a ggplot layer (or a list of layers). This means you can add any other layers on top of the skeleton exported from gginteract.
To let users easily obtain the exact position of a skeleton, each skeleton object is an S4 class instance with its location details recorded in slots.
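To illustrate the idea of storing ggplot layers next to their coordinates in S4 slots, here is a minimal sketch. demo_block and make_block() are hypothetical names invented for this example; they are not gginteract classes or functions, and the real skeleton classes and slot names may differ.

```r
library(ggplot2)
library(methods)

# Hypothetical class, for illustration only (not a gginteract class).
setClass("demo_block",
         representation(geom = "list",      # ggplot2 layers that draw the block
                        xmin = "numeric", xmax = "numeric",
                        ymin = "numeric", ymax = "numeric"))

# Build a rectangle layer and record where it was drawn.
make_block <- function(xmin, xmax, ymin, ymax, fill = "grey70") {
  box <- data.frame(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax)
  layer <- geom_rect(data = box,
                     aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
                     fill = fill, inherit.aes = FALSE)
  new("demo_block", geom = list(layer),
      xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax)
}

b <- make_block(0, 400, 0, 25)

# A list of layers can be added to ggplot() directly; the recorded
# coordinates are then used to place a label just above the block.
p <- ggplot() + b@geom +
  annotate("text", x = (b@xmin + b@xmax) / 2, y = b@ymax + 5,
           label = "my label")
```

Because the position lives in the slots, any extra layer can be aligned with the block without reading values off the rendered plot.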
Install the gginteract R package from GitHub.
$ R
> library(devtools)
> install_github("irenexzwen/gginteract")
To generate an interaction plot, at least four inputs are required.
The first two are gene annotation data frames; their first three columns are mandatory (chr name, exon start, exon end):
- GENE1_anno: a data frame in BED format for gene 1. Each row represents an exon with chr, start, end.
- GENE2_anno: a data frame in BED format for gene 2. Each row represents an exon with chr, start, end.
The next two hold the read information in BED format; their first four columns are mandatory (chr name, read start, read end, read name):
- Read1 (argument R1): a data frame in BED format for the Read1 (first end) BED file.
- Read2 (argument R2): a data frame in BED format for the Read2 (second end) BED file.
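As a minimal sketch of the input format described above, the snippet below builds BED-style data frames in base R. The coordinates and read names are invented toy values, the column names are only a convention chosen here, and check_bed() is a hypothetical helper that is not part of gginteract; it merely checks the mandatory leading columns.

```r
# Toy BED text (tab-separated); all values are invented for illustration.
anno_txt  <- "chr1\t1000\t1200\nchr1\t1500\t1700\nchr1\t2200\t2600"
reads_txt <- "chr1\t1010\t1110\tread_1\nchr1\t2250\t2350\tread_2"

# read.table() turns BED text (or a file path) into a data frame.
gene_anno <- read.table(text = anno_txt, sep = "\t",
                        col.names = c("chr", "start", "end"),
                        stringsAsFactors = FALSE)
reads <- read.table(text = reads_txt, sep = "\t",
                    col.names = c("chr", "start", "end", "name"),
                    stringsAsFactors = FALSE)

# Hypothetical helper (not a gginteract function): TRUE if the data frame
# has at least n_mandatory columns and numeric start <= end coordinates.
check_bed <- function(df, n_mandatory) {
  is.data.frame(df) &&
    ncol(df) >= n_mandatory &&
    is.numeric(df[[2]]) && is.numeric(df[[3]]) &&
    all(df[[3]] >= df[[2]])
}

check_bed(gene_anno, 3)  # annotation needs chr, start, end
check_bed(reads, 4)      # reads need chr, start, end, name
```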
The example data can be loaded with:
$R
> library(gginteract)
> data("R1","R2","GENE1_anno","GENE2_anno")
To generate a parallel interaction plot, a single function call is enough:
$R
> data("R1","R2","GENE1_anno","GENE2_anno")
> para <- gginteract::parallel_plot(GENE1_anno = GENE1_anno,
                                    GENE2_anno = GENE2_anno,
                                    R1 = R1,
                                    R2 = R2,
                                    genename1 = "DDX23",
                                    genename2 = "RPS7",
                                    GENE1_COLOR = "#deb210",
                                    GENE2_COLOR = "#668ed1")
> para
To generate a pairend interaction plot, a single function call is enough:
$R
> data("R1","R2","GENE1_anno","GENE2_anno")
> pair <- gginteract::pairend_plot(GENE1_anno = GENE1_anno,
                                   GENE2_anno = GENE2_anno,
                                   R1 = R1,
                                   R2 = R2,
                                   genename1 = "DDX23",
                                   genename2 = "RPS7")
> pair
😐 Notice! We suggest using the pairend style when there are fewer than 40 read pairs, to avoid pixel collision.
If you have more than 40 read pairs to show, please switch to the parallel style.
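The rule of thumb above can be written as a small helper. choose_style() is a hypothetical function, not part of gginteract. The boundary case of exactly 40 pairs is not specified above, so this sketch arbitrarily assigns it to the parallel style.

```r
# Hypothetical helper (not a gginteract function): pick a plot style from
# the number of read pairs, following the 40-pair rule of thumb.
choose_style <- function(n_pairs, max_pairend = 40) {
  stopifnot(is.numeric(n_pairs), length(n_pairs) == 1, n_pairs >= 0)
  if (n_pairs < max_pairend) "pairend" else "parallel"
}

choose_style(12)   # "pairend"
choose_style(250)  # "parallel"
```

With the example data, n_pairs could be taken as nrow(R1), assuming R1 holds one row per read pair.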
The functions above are highly integrated and easy to use. However, the following details of each building block may be helpful if you want to create your own customized plot.
In gginteract you can create an ideogram ggplot object with the following code. You need to specify the genome you are using and the chromosome you are interested in. (This function rewrites the ideogram object in ggbio.)
$R
> ?create_ideo # the function description
> ideo <- create_ideo(genome = "hg38",
                      chr = "chr4",
                      ideo.width = 400)
> ggplot() + ideo@geom_ideobody
create_ideo() also allows you to move the ideogram object away from (0,0) with the parameters ydrift and xdrift. You can also change the width-to-height ratio by setting whratio. For example, you can make up a "Bagua map" or a polarized "Bagua map" using the following code.
$R
> width_list <- c(400,150,150,400)
> height <- 25
> xdrifts <- c(0, 0, 250,0)
> ydrifts <- c(90, 60, 60, 30)
> geom <- list()
> for (i in 1:4) {
    chrn <- paste0("chr", i)
    k <- create_ideo(chr = chrn,
                     ideo.width = width_list[i],
                     ideo.height = height,
                     xdrift = xdrifts[i],
                     ydrift = ydrifts[i])
    # collect the ideogram body layers of each chromosome
    geom <- c(geom, k@geom_ideobody)
  }
> ggplot()+geom
> ggplot()+geom+coord_polar()
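The layout behind the "Bagua map" can be checked with a little arithmetic. The base-R sketch below assumes that xdrift gives the left edge of an ideogram and that ideo.width is its horizontal extent; this reading of the parameters is an assumption, not something confirmed by the package documentation.

```r
width_list <- c(400, 150, 150, 400)
xdrifts    <- c(0, 0, 250, 0)
ydrifts    <- c(90, 60, 60, 30)

# Assumed geometry: left edge = xdrift, right edge = xdrift + ideo.width.
layout <- data.frame(chr   = paste0("chr", 1:4),
                     left  = xdrifts,
                     right = xdrifts + width_list,
                     y     = ydrifts)
layout

# Gap between the two short bars that share the row at y = 60.
gap <- layout$left[3] - layout$right[2]
gap
```

Under this assumption, the two short bars in the middle row span the same total width as the full bars above and below, with a break in the center.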
gginteract provides a simple function to highlight a region of interest on an ideogram.
$R
# ideo highlight
> i <- create_ideo(chr="chr4",ideo.width = 200,ideo.height = 10)
> i@chr_end #[1] 190214555
> i <- ideo_add_highlight(i,region = c(80000000,90000000))
> ggplot() + i@geom_ideobody + i@geom_tick # use this code to show the single object.
An ideo object such as i stores its features in slots, for example the geom_ideobody, geom_tick and chr_end slots used above.
The skeleton of the parallel-style interaction plot can be generated with the parallel_inter function:
$R
> k <- parallel_inter(GENE1_anno = GENE1_anno,
                      GENE2_anno = GENE2_anno,
                      R1 = R1,
                      R2 = R2)
> ggplot()+k@geom_para
parallel_inter creates a para class instance. Its geom_para slot is typically used to show the plot. Other slots of the object k include:
- k@genetop / k@genebot: gene_anno class objects that store meta information about the two genes.
> str(k@genetop)
Formal class 'gene_anno' [package "gginteract"] with 8 slots
..@ name : chr ""
..@ chr_num : chr "2"
..@ chr : chr "chr2"
..@ chromstart: int 3575262
..@ chromend : int 3580919
..@ genelen : int 5657
..@ center : num 2828
..@ anno :'data.frame': 7 obs. of 7 variables:
.. ..$ chr : chr [1:7] "chr2" "chr2" "chr2" "chr2" ...
.. ..$ start : int [1:7] 3575262 3575591 3575816 3576486 3577709 3580109 3580804
.. ..$ end : int [1:7] 3575350 3575684 3575888 3576630 3577774 3580260 3580919
.. ..$ NA : chr [1:7] "exon_0" "exon_1" "exon_2" "exon_3" ...
.. ..$ NA.1 : chr [1:7] "+" "+" "+" "+" ...
.. ..$ yvalue: num [1:7] 0 0 0 0 0 0 0
.. ..$ height: num [1:7] 374 374 374 374 374 ...
- coordinates information
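The chromstart, chromend and genelen values in the output above are consistent with simple summaries of the anno data frame. The base-R sketch below reproduces them from the exon coordinates shown; it illustrates the relationship and is not necessarily how gginteract computes these slots.

```r
# Exon coordinates copied from the str() output above.
anno <- data.frame(
  chr   = "chr2",
  start = c(3575262L, 3575591L, 3575816L, 3576486L,
            3577709L, 3580109L, 3580804L),
  end   = c(3575350L, 3575684L, 3575888L, 3576630L,
            3577774L, 3580260L, 3580919L)
)

chromstart <- min(anno$start)        # 3575262
chromend   <- max(anno$end)          # 3580919
genelen    <- chromend - chromstart  # 5657

# Per-exon lengths: an extra summary, not a slot of the object shown above.
exon_len <- anno$end - anno$start
```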
The skeleton of the pairend-style interaction plot can be generated with the pairend_inter function:
$R
pair <- pairend_plot(GENE1_anno = GENE1_anno,
                     GENE2_anno = GENE2_anno,
                     R1 = R1,
                     R2 = R2)
pair
This pairend skeleton object follows the same logic as the para object. More details can be accessed using str(pair).