---
title: "MAF files summary"
author: "UMCCR"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  html_document:
    theme: readable
    css: summariseMAFs.css
    toc: true
    toc_float: true
    code_folding: hide
  rmdformats::material:
    highlight: kate
params:
  maf_dir: '/Users/kanwals/UMCCR/research/PAAD_atlas/maf_analysis/data/'
  maf_files: 'SBJ-somatic-PASS.maf'
  datasets: 'pdac_test'
  samples_id_cols: NULL
  genes_min: '2'
  genes_list: 'none'
  genes_keep_order: TRUE
  genes_blacklist: 'none'
  samples_show: FALSE
  samples_keep_order: TRUE
  samples_keep_order_annot: TRUE
  sort_by_annotation: FALSE
  samples_list: 'none'
  samples_blacklist: 'none'
  nonSyn_list: 'Frame_Shift_Del,Frame_Shift_Ins,Splice_Site,Translation_Start_Site,Nonsense_Mutation,Nonstop_Mutation,In_Frame_Del,In_Frame_Ins,Missense_Mutation'
  remove_duplicated_variants: TRUE
  pathways: 'none'
  purple: 'none'
  purple_hd: 0.5
  purple_loh: 1.5
  purple_amp: 6
  cnvkit: 'none'
  cnvkit_hd: 0.5
  cnvkit_loh: 1.5
  cnvkit_amp: 6
  gistic: 'none'
  draw_titv: FALSE
  clinical_info: 'none'
  clinical_features: 'none'
  clinical_enrichment_p: 0.05
  signature_enrichment_p: 0.05
  maf_comp_p: 0.05
  maf_comp_fdr: 1
  out_folder: 'MAF_summary_report'
  hide_code_btn: TRUE
  ucsc_genome_assembly: 38
---

Report summarising and visualising mutation data in [Mutation Annotation Format](https://software.broadinstitute.org/software/igv/MutationAnnotationFormat){target="_blank"} (MAF) file(s) for the following dataset(s): **`r gsub(",", ", ", params$datasets) `**

***

<span style="color:#ff0000">NOTE</span>: this report summarises **only non-synonymous variants** (see definitions below).

<details>
<summary>Variants consequence definitions</summary>
<font size="2">

**Non-synonymous variants** are defined as variants with the following consequences: *`r gsub(",", ", ", params$nonSyn_list)`*. All other variants are treated as silent.

NOTE: by default, variants considered as non-synonymous are those with high/moderate variant consequences, which include: *frame shift deletions*, *frame shift insertions*, *splice site mutations*, *translation start site mutations*, *nonsense mutations*, *nonstop mutations*, *in-frame deletions*, *in-frame insertions* and *missense mutations*.

* [*High impact variant consequence*](http://asia.ensembl.org/Help/Glossary?id=535){target="_blank"} - the variant is assumed to have a high (disruptive) impact on the protein, probably causing protein truncation, loss of function or triggering nonsense-mediated decay.

* [*Moderate impact variant consequence*](http://asia.ensembl.org/Help/Glossary?id=535){target="_blank"} -	a non-disruptive variant that might change protein effectiveness.

* [*Low impact variant consequence*](http://asia.ensembl.org/Help/Glossary?id=535){target="_blank"} -	a variant that is assumed to be mostly harmless or unlikely to change protein behaviour.

</font> 
</details>

***

<details>
<summary>Input parameters</summary>
<font size="2">

* **maf_dir**: `r params$maf_dir`
* **maf_files**: `r params$maf_files`
* **datasets**: `r params$datasets`
* **samples_id_cols**: `r params$samples_id_cols`
* **genes_min**: `r params$genes_min`
* **genes_list**: `r params$genes_list`
* **genes_keep_order**: `r params$genes_keep_order`
* **genes_blacklist**: `r params$genes_blacklist`
* **samples_show**: `r params$samples_show`
* **samples_keep_order**: `r params$samples_keep_order`
* **samples_keep_order_annot**: `r params$samples_keep_order_annot`
* **samples_list**: `r params$samples_list`
* **samples_blacklist**: `r params$samples_blacklist`
* **nonSyn_list**: `r params$nonSyn_list`
* **remove_duplicated_variants**: `r params$remove_duplicated_variants`
* **pathways**: `r params$pathways`
* **purple**: `r params$purple`
`r if ( params$purple !="none" ) { c(paste0("* **purple_hd**: ",params$purple_hd)) }`
`r if ( params$purple !="none" ) { c(paste0("* **purple_loh**: ",params$purple_loh)) }`
`r if ( params$purple !="none" ) { c(paste0("* **purple_amp**: ",params$purple_amp)) }`
* **cnvkit**: `r params$cnvkit`
`r if ( params$cnvkit !="none" ) { c(paste0("* **cnvkit_hd**: ",params$cnvkit_hd)) }`
`r if ( params$cnvkit !="none" ) { c(paste0("* **cnvkit_loh**: ",params$cnvkit_loh)) }`
`r if ( params$cnvkit !="none" ) { c(paste0("* **cnvkit_amp**: ",params$cnvkit_amp)) }`
* **gistic**: `r params$gistic`
* **draw_titv**: `r params$draw_titv`
* **clinical_info**: `r params$clinical_info`
* **clinical_features**: `r params$clinical_features`
* **clinical_enrichment_p**: `r params$clinical_enrichment_p`
* **signature_enrichment_p**: `r params$signature_enrichment_p`
* **maf_comp_p**: `r params$maf_comp_p`
* **maf_comp_fdr**: `r params$maf_comp_fdr`
* **out_folder**: `r params$out_folder`
* **hide_code_btn**: `r params$hide_code_btn`
* **ucsc_genome_assembly**: `r params$ucsc_genome_assembly`

</font> 
</details>

***

```{r code_display, echo = FALSE}
##### Include or exclude the "Code" button that allows showing/hiding code chunks in the report 
if ( params$hide_code_btn ) {
  writeLines(".btn { display: none; }", con = "summariseMAFs.css")
} else {
  writeLines(" ", con = "summariseMAFs.css")
}
```

```{r define_functions, comment=NA, message=FALSE, warning=FALSE}
##### Define functions

##### Create 'not in' operator
"%!in%" <- function(x,table) match(x,table, nomatch = 0) == 0

##### Assign colours to different datasets. These colours will be used to distinguish tabs in generated excel summary spreadsheets
getDatasetsColours <- function(datasets) {
  
  ##### Predefined selection of colours for datasets
  datasets.colours <- c("dodgerblue","firebrick","lightslategrey","darkseagreen","orange","darkcyan","bisque", "coral2", "cadetblue3","red","blue","green")
  
  f.datasets <- factor(datasets)
  vec.datasets <- datasets.colours[1:length(levels(f.datasets))]
  datasets.colour <- rep(0,length(f.datasets))
  for(i in 1:length(f.datasets))
    datasets.colour[i] <- vec.datasets[ f.datasets[i]==levels(f.datasets)]
  
  return( list(vec.datasets, datasets.colour) )
}

###### Generate dataTable for each dataset with all mutation information for selected gene(s), as provided in MAF files. User can filter variants to include only non-synonymous (default), silent or all variants
mut.details.datasets <- function(mafInfo, datasets, genes, type = "nonsynonymous") {
  
  ##### Vector with datasets with no mutations reported in selected genes
  datasets.noMut <- NULL
  
  ##### Create a list for htmlwidgets
  widges.list <- htmltools::tagList()
  
  for ( i in 1:length(datasets) ) {
    
    ##### Include all variants
    if ( type == "all" ) {
      
        mut.details.genes <- mafInfo[[datasets[i]]]@data[ mafInfo[[datasets[i]]]@data[, Hugo_Symbol] %in% genes, ]
        mut.details.genes <- rbind(mut.details.genes, mafInfo[[datasets[i]]]@maf.silent[ mafInfo[[datasets[i]]]@maf.silent[, Hugo_Symbol] %in% genes, ] )
        
    ##### Include silent variants
    } else if ( type == "silent" ) {
      
        mut.details.genes <- mafInfo[[datasets[i]]]@maf.silent[ mafInfo[[datasets[i]]]@maf.silent[, Hugo_Symbol] %in% genes, ]
    
    ##### Include only non-synonymous variants
    } else {
        mut.details.genes <- mafInfo[[datasets[i]]]@data[ mafInfo[[datasets[i]]]@data[, Hugo_Symbol] %in% genes, ]
    }
    
    if ( nrow(mut.details.genes) != 0 ) {
      
      ##### Sort table by gene symbol and then by sample ID
      mut.details.genes <- mut.details.genes[ order(mut.details.genes$Hugo_Symbol, mut.details.genes$Tumor_Sample_Barcode), ]
      
      #### Move column with Hugo_Symbol to the first place and Tumor_Sample_Barcode to the second
      col_idx <- grep("Tumor_Sample_Barcode", names(mut.details.genes))
      mut.details.genes <- as.data.frame(mut.details.genes)[, c(col_idx, (1:ncol(mut.details.genes))[-col_idx]) ]
      
      col_idx <- grep("Hugo_Symbol", names(mut.details.genes))
      mut.details.genes <- as.data.frame(mut.details.genes)[, c(col_idx, (1:ncol(mut.details.genes))[-col_idx]) ]
      
      widges.list[[i]] <- DT::datatable( data = mut.details.genes, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets[i])), filter = "top", extensions = c('Buttons','FixedColumns','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, fixedColumns = list(leftColumns = 2), deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(mut.details.genes), 'text-align' = 'center' )
      
    } else {
      datasets.noMut <- c(datasets.noMut, datasets[i])
    }
  }
  
  ##### Report datasets with no mutations reported in selected genes
  if ( length(datasets.noMut) != 0 ) {
    if ( type == "nonsynonymous" ) {
      cat(paste("<span style=\"color:#ff0000\">NOTE</span>: none of queried gene(s) have non-synonymous variants reported in the following dataset(s):", paste(datasets.noMut, collapse = ", "), "\n\n", sep=" "))
    } else if ( type == "silent" ) {
      cat(paste("NOTE, none of queried gene(s) have silent variants reported in the following dataset(s):", paste(datasets.noMut, collapse = ", "), "\n\n", sep=" "))
    }
  }
  
  ##### Print a list of htmlwidgets
  widges.list
}

###### Generate lollipop plot for each dataset for selected gene
lollipops.datasets <- function(mafInfo, datasets, gene) {
  
  ##### Create a list to store MAF info for individual datasets
  for ( dataset in datasets ) {
    
    mut = subsetMaf(maf = mafInfo[[dataset]], includeSyn = FALSE, genes = gene, mafObj = FALSE, query = "Variant_Type != 'CNV'", dropLevels = FALSE)
    
    ##### Check if the gene has any mutations in the corresponding dataset
    if ( nrow(mut) != 0 ) {
      
      ##### Drawing lollipop for the top 10 genes in each dataset
      ##### Check if the amino acid change information is available in the provided MAF files. The script expects a column called "HGVSp_Short", which is produced with vcf2maf (https://github.com/mskcc/vcf2maf) when converting VCFs to MAFs (https://github.com/cBioPortal/cbioportal/issues/2996) and describes a mutation's amino acid change. The "aa_mutation" field used for annotation in ICGC samples is also acceptable. NOTE: other possibilities are: "Protein_Change", "AAChange"
      pchange = c('HGVSp_Short', 'Protein_Change', 'AAChange')
      
      ##### Define the column with protein change info
      pchange = pchange[pchange %in% colnames(mut)]
      
      ##### Check if the protein change field is not empty
      if ( any(!is.na(as.data.frame(mut)[ , pchange  ]))  ) {
        
        cat(paste("\n\n <b>", dataset, "</b> \n\n", sep=" "))
        
        ##### Check if non-synonymous variants are detected
        if ( gene %in% mut$Hugo_Symbol ) {
        
          ##### Draw the plot to a temporary PDF file (converted to PNG below) to avoid plotting to the console
          pdf(file=paste0(mutationMapsDir, "/", paste(dataset, gene, sep="_"), ".pdf"), width = 8, height = 5)
          lollipopPlot.image <- capture.output(maftools::lollipopPlot(maf = mafInfo[[dataset]], gene = gene, AACol = pchange, printCount = FALSE, showDomainLabel = FALSE, repel = FALSE, labelPos = "all" , showMutationRate = TRUE, cBioPortal = TRUE))
          invisible(dev.off())
          
          ##### Export pdf to png
          lollipopPlot.image <- image_read_pdf(paste(mutationMapsDir, "/", paste(dataset, gene, sep="_"), ".pdf", sep = ""), pages = NULL, density = 300)
          image_write(lollipopPlot.image, path = paste(mutationMapsDir, "/", paste(dataset, gene, sep="_"), ".png", sep = ""), format = "png")
          
          ##### Read in the PNG files
          cat("![](",paste(paste0(mutationMapsDir, "/", paste(dataset, gene, sep="_")), ".png", sep = ""),")")
          cat("<br/><br/><br/>")
          
          ##### Remove redundant pdf plot
          file.remove(paste(mutationMapsDir, "/", paste(dataset, gene, sep="_"), ".pdf", sep = ""))
          while (!is.null(dev.list()))  invisible(dev.off())
        } else {
          cat(paste0("**", gene, " has no non-synonymous variants detected in the ", dataset, " dataset**.\n\n"))
          cat("\n***\n")
        }
        
      ##### ...otherwise leave a message
      } else {
        
        ##### Check if the gene has any synonymous variants
        if ( length(pchange[pchange %in% colnames(mafInfo[[dataset]]@maf.silent)]) > 0 && gene %in% mafInfo[[dataset]]@maf.silent$Hugo_Symbol && any(!is.na(as.data.frame(mafInfo[[dataset]]@maf.silent)[ , pchange  ])) ) {
        
          cat(paste("This section was skipped for dataset", dataset, "since only synonymous variants were detected in", gene, "gene.\n\n", sep=" "))
      
        } else {
          cat(paste("This section was skipped for dataset", dataset, "since the corresponding MAF does not contain a field with amino acid change details!\n\n", sep=" "))
        }
      }
        
    } else {
      cat(paste("\n\n <b>", dataset, "</b> \n\n", sep=" "))
      cat(paste("Gene <i>", gene, "</i> has no mutations reported in dataset", dataset, "\n\n", sep=" "))
    }
  }
}

##### A wrapper to saveWidget which compensates for arguable BUG in saveWidget which requires `file` to be in current working directory (see post https://github.com/ramnathv/htmlwidgets/issues/299 )
saveWidgetFix <- function ( widget, file, ...) {
  wd<-getwd()
  on.exit(setwd(wd))
  outDir<-dirname(file)
  file<-basename(file)
  setwd(outDir);
  htmlwidgets::saveWidget(widget,file=file,...)
}

##### Function for suppressing output from in-function printed message
quiet <- function(x) { 
  sink(tempfile()) 
  on.exit(sink()) 
  invisible(force(x)) 
}

##### Function to create oncomatrix that is also used for the oncoplot function
createOncoMatrix = function(m, g = NULL, chatty = TRUE, add_missing = FALSE){

  if(is.null(g)){
    stop("Please provide at least two genes!")
  }

  subMaf = subsetMaf(maf = m, genes = g, includeSyn = FALSE, mafObj = FALSE, dropLevels = FALSE)

  if(nrow(subMaf) == 0){
    if(add_missing){
      numericMatrix = matrix(data = 0, nrow = length(g), ncol = length(levels(getSampleSummary(x = m)[,Tumor_Sample_Barcode])))
      rownames(numericMatrix) = g
      colnames(numericMatrix) = levels(getSampleSummary(x = m)[,Tumor_Sample_Barcode])

      oncoMatrix = matrix(data = "", nrow = length(g), ncol = length(levels(getSampleSummary(x = m)[,Tumor_Sample_Barcode])))
      rownames(oncoMatrix) = g
      colnames(oncoMatrix) = levels(getSampleSummary(x = m)[,Tumor_Sample_Barcode])

      vc = c("")
      names(vc) = 0

      return(list(oncoMatrix = oncoMatrix, numericMatrix = numericMatrix, vc = vc))
    }else{
      return(NULL)
    }
  }

  if(add_missing){
    subMaf[, Hugo_Symbol := factor(x = Hugo_Symbol, levels = g)]
  }

  oncomat = data.table::dcast(data = subMaf[,.(Hugo_Symbol, Variant_Classification, Tumor_Sample_Barcode)], formula = Hugo_Symbol ~ Tumor_Sample_Barcode,
                              fun.aggregate = function(x){
                                x = unique(as.character(x))
                                xad = x[x %in% c('Amp', 'Del')]
                                xvc = x[!x %in% c('Amp', 'Del')]

                                if(length(xvc)>0){
                                  xvc = ifelse(test = length(xvc) > 1, yes = 'Multi_Hit', no = xvc)
                                }

                                x = ifelse(test = length(xad) > 0, yes = paste(xad, xvc, sep = ';'), no = xvc)
                                x = gsub(pattern = ';$', replacement = '', x = x)
                                x = gsub(pattern = '^;', replacement = '', x = x)
                                return(x)
                              } , value.var = 'Variant_Classification', fill = '', drop = FALSE)

  #convert to matrix
  data.table::setDF(oncomat)
  rownames(oncomat) = oncomat$Hugo_Symbol
  oncomat = as.matrix(oncomat[,-1, drop = FALSE])

  variant.classes = as.character(unique(subMaf[,Variant_Classification]))
  variant.classes = c('',variant.classes, 'Multi_Hit')
  names(variant.classes) = 0:(length(variant.classes)-1)

  #Complex variant classes will be assigned a single integer.
  vc.onc = unique(unlist(apply(oncomat, 2, unique)))
  vc.onc = vc.onc[!vc.onc %in% names(variant.classes)]
  names(vc.onc) = rep(as.character(as.numeric(names(variant.classes)[length(variant.classes)])+1), length(vc.onc))
  variant.classes2 = c(variant.classes, vc.onc)

  oncomat.copy <- oncomat
  #Make a numeric coded matrix
  for(i in 1:length(variant.classes2)){
    oncomat[oncomat == variant.classes2[i]] = names(variant.classes2)[i]
  }

  #If maf has only one gene
  if(nrow(oncomat) == 1){
    mdf  = t(matrix(as.numeric(oncomat)))
    rownames(mdf) = rownames(oncomat)
    colnames(mdf) = colnames(oncomat)
    return(list(oncoMatrix = oncomat.copy, numericMatrix = mdf, vc = variant.classes))
  }

  #convert from character to numeric
  mdf = as.matrix(apply(oncomat, 2, function(x) as.numeric(as.character(x))))
  rownames(mdf) = rownames(oncomat.copy)


  #If MAF file contains a single sample, simple sorting is enough.
  if(ncol(mdf) == 1){
    sampleId = colnames(mdf)
    mdf = as.matrix(mdf[order(mdf, decreasing = TRUE),])
    colnames(mdf) = sampleId

    oncomat.copy = as.matrix(oncomat.copy[rownames(mdf),])
    colnames(oncomat.copy) = sampleId

    return(list(oncoMatrix = oncomat.copy, numericMatrix = mdf, vc = variant.classes))
  } else{
    #Sort by rows as well as columns if >1 samples present in MAF
    #Add total variants per gene
    mdf = cbind(mdf, variants = apply(mdf, 1, function(x) {
      length(x[x != "0"])
    }))
    #Sort by total variants
    mdf = mdf[order(mdf[, ncol(mdf)], decreasing = TRUE), ]
    #colnames(mdf) = gsub(pattern = "^X", replacement = "", colnames(mdf))
    nMut = mdf[, ncol(mdf)]

    mdf = mdf[, -ncol(mdf)]

    mdf.temp.copy = mdf #temp copy of original unsorted numeric coded matrix

    mdf[mdf != 0] = 1 #replacing all non-zero integers with 1 improves sorting (& grouping)
    tmdf = t(mdf) #transposematrix
    mdf = t(tmdf[do.call(order, c(as.list(as.data.frame(tmdf)), decreasing = TRUE)), ]) #sort

    mdf.temp.copy = mdf.temp.copy[rownames(mdf),] #organise original matrix into sorted matrix
    mdf.temp.copy = mdf.temp.copy[,colnames(mdf)]
    mdf = mdf.temp.copy

    #organise original character matrix into sorted matrix
    oncomat.copy <- oncomat.copy[,colnames(mdf)]
    oncomat.copy <- oncomat.copy[rownames(mdf),]

    return(list(oncoMatrix = oncomat.copy, numericMatrix = mdf, vc = variant.classes))
  }
}
```

```{r load_libraries, warning=FALSE}
suppressMessages(library(knitr))
suppressMessages(library(maftools))
suppressMessages(library(pheatmap))
suppressMessages(library(NMF))
suppressMessages(library(openxlsx))
suppressMessages(library(ggplot2))
suppressMessages(library(DT))
suppressMessages(library(purrr))
suppressMessages(library(tidyverse))
suppressMessages(library(magick))
suppressMessages(library(htmltools))
suppressMessages(library(htmlwidgets))
suppressMessages(library(package=paste0("BSgenome.Hsapiens.UCSC.hg", params$ucsc_genome_assembly), character.only = TRUE))
```

```{r seed}
##### Set the seed
seed <- sample(0:99999999, 1, replace = TRUE)
set.seed(seed)
```

## Datasets

```{r datasets, comment = NA, message=FALSE, warning=FALSE}
##### Present patient cohorts to be summarised
##### Split the string of MAF files and put them into a vector
mafFiles <- unlist(strsplit(params$maf_files, split=',', fixed=TRUE))
mafFiles <- paste(params$maf_dir, mafFiles, sep="/")

##### Split the string of datasets names and put them into a vector
datasets.list <- unlist(strsplit(params$datasets, split=',', fixed=TRUE))

datasets.df <- as.data.frame( cbind(datasets.list, unlist(strsplit(params$maf_files, split=',', fixed=TRUE))) )
names(datasets.df) <- c("Dataset", "MAF file")

DT::datatable( data = datasets.df, filter = "none", extensions = 'Buttons', options = list(pageLength = length(mafFiles), dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy')) ) %>%
        DT::formatStyle( columns = names(datasets.df), 'text-align' = 'center' )
```

```{r load_data, comment = NA, message=FALSE, warning=FALSE, results='hide'}
##### Check if list of genes of interest is provided
if ( params$genes_list != "none" ) {
  goi_status <- TRUE
} else {
  goi_status <- FALSE
}

##### Check if list of pathways is provided
if ( params$pathways != "none" ) {
  pathways_status <- TRUE
} else {
  pathways_status <- FALSE
}

##### Read MAF files and put associated info into a list
##### Create a list to store MAF info for individual datasets
mafInfo <- vector("list", length(mafFiles))
names(mafInfo) <- datasets.list

mafInfo_tcgaCompare <- vector("list", length(mafFiles))
names(mafInfo_tcgaCompare) <- datasets.list

##### Check if file with clinical information is provided
clinicalInfo <- vector("list", length(mafFiles))
names(clinicalInfo) <- datasets.list

clinicalFeatures <- vector("list", length(mafFiles))
names(clinicalFeatures) <- datasets.list

if ( params$clinical_info != "none" ){
  clinicalFiles <- unlist(strsplit(params$clinical_info, split=',', fixed=TRUE))
  
  for ( i in 1:length(mafFiles) ) {
    clinicalInfo[[i]] <- read.table(clinicalFiles[[i]], sep="\t", as.is=TRUE, header=TRUE, row.names=NULL, quote = "")
    
    ##### Define columns to be drawn in the oncoplot(s)
    if ( params$clinical_features != "none" ){
      clinicalFeatures[[i]] <- unlist(strsplit(params$clinical_features, split=',', fixed=TRUE))
      
      ##### Keep only those that are actually present in the provided clinical information file
      clinicalFeatures[[i]] <- clinicalFeatures[[i]][ clinicalFeatures[[i]] %in% names(clinicalInfo[[i]]) ]
    } else {
      clinicalFeatures[[i]] <- ""
    }
  }
} else {
  for ( i in 1:length(mafFiles) ) {
    clinicalInfo[[i]] <- NA
  }
} 

##### NOTE: maftools by default summarises only non-synonymous variants with high/moderate variant consequences and ignores silent variants (https://github.com/PoisonAlien/maftools/issues/63), which are stored in "maf.silent" slot of the class MAF object (mafInfo[[i]]@maf.silent)
for ( i in 1:length(mafFiles) ) {
  
  ##### Add clinical information if provided
  if ( any(!is.na(clinicalInfo[[i]])) ){
    mafInfo[[i]] <- maftools::read.maf(maf = mafFiles[i], vc_nonSyn = unlist(strsplit(params$nonSyn_list, split=',', fixed=TRUE)), removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = NULL)
    
  ##### Make sure that the input MAF doesn't contain empty rows, which would throw errors in downstream analyses
  ##### List all samples and remove empty sample names
  tsb <- as.vector(unlist(unique(mafInfo[[i]]@maf.silent[, "Tumor_Sample_Barcode"])))
  tsb <- tsb[ tsb != "" ]
  
  ##### Subset the maf object to include only non-empty samples
  maf.data <- subsetMaf(maf = mafInfo[[i]], tsb = tsb, genes = NULL, fields = NULL, query = NULL, mafObj = FALSE, includeSyn = TRUE, dropLevels=TRUE)
  
  ##### Convert the MAF data into Maf object
  mafInfo[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = unlist(strsplit(params$nonSyn_list, split=',', fixed=TRUE)), removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = NULL)

  ##### Check if the clinical info is missing for any sample and add "NA" to these samples
    if ( length(unique(mafInfo[[i]]@data$Tumor_Sample_Barcode)[ unique(mafInfo[[i]]@data$Tumor_Sample_Barcode) %!in% clinicalInfo[[i]]$Tumor_Sample_Barcode]) > 0 ) {
      
      clinicalInfo.missing <- unique(mafInfo[[i]]@data$Tumor_Sample_Barcode)[ unique(mafInfo[[i]]@data$Tumor_Sample_Barcode) %!in% clinicalInfo[[i]]$Tumor_Sample_Barcode]
      
      ##### Identify samples with missing info and add "NA" to these samples
      clinicalInfo.missing <- data.frame(cbind(as.character(clinicalInfo.missing)), rep(NA, length(clinicalInfo.missing)))
      names(clinicalInfo.missing) <- names(clinicalInfo[[i]])
      clinicalInfo[[i]] <- rbind(clinicalInfo[[i]], clinicalInfo.missing)
    }
    
    #### Convert numeric values to characters and make sure the clinical features names are correct 
    for (j in 1:length(clinicalFeatures[[i]])) {
      clinicalInfo[[i]][, clinicalFeatures[[i]][j]] <- as.character(clinicalInfo[[i]][, clinicalFeatures[[i]][j]])
      #clinicalInfo[[i]][, clinicalFeatures[[i]][j]] <- make.names(clinicalInfo[[i]][, clinicalFeatures[[i]][j]])
    }
  
    mafInfo[[i]] <- maftools::read.maf(maf = mafFiles[i], vc_nonSyn = unlist(strsplit(params$nonSyn_list, split=',', fixed=TRUE)), removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = clinicalInfo[[i]])
  
  } else {
    mafInfo[[i]] <- maftools::read.maf(maf = mafFiles[i], vc_nonSyn = unlist(strsplit(params$nonSyn_list, split=',', fixed=TRUE)), removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = NULL)
    
  ##### Make sure that the input MAF doesn't contain empty rows, which would throw errors in downstream analyses
  ##### List all samples and remove empty sample names
  #tsb <- as.vector(unlist(unique(mafInfo[[i]]@maf.silent[, "Tumor_Sample_Barcode"])))
  #tsb <- tsb[ tsb != "" ]
  
  ##### Subset the maf object to include only non-empty samples
  #maf.data <- subsetMaf(maf = mafInfo[[i]], tsb = tsb, genes = NULL, fields = NULL, query = NULL, mafObj = FALSE, includeSyn = TRUE, dropLevels=TRUE)
  
  ##### Convert the MAF data into Maf object
  #mafInfo[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = unlist(strsplit(params$nonSyn_list, split=',', fixed=TRUE)), removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = NULL)
  }
}

##### Collect the number indicating the minimal percentage of patients carrying mutations in individual genes to be included in the report
##### Create a list to store MAF info for individual datasets
genes_min <- vector("list", length(mafFiles))
names(genes_min) <- datasets.list

for ( i in 1:length(mafFiles) ) {
  genes_min[[i]] <- unlist(strsplit(params$genes_min, split=',', fixed=TRUE))[[i]]
}
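
##### Illustrative sketch only (hypothetical helper, not called anywhere in this report).
##### Several parameters (e.g. genes_min) arrive as comma-separated strings with one value
##### per dataset. Assuming that a single value is meant to apply to every dataset, a helper
##### along these lines could split such a string and recycle a lone value. The name
##### "splitParamSketch" and the recycling rule are assumptions, not part of the original script.
splitParamSketch <- function(x, n) {
  ##### Split on literal commas, as done for the other parameters in this chunk
  values <- unlist(strsplit(as.character(x), split = ",", fixed = TRUE))
  ##### Accept either one value (recycled) or exactly one value per dataset
  if ( length(values) %!in% c(1, n) ) {
    stop("Expected 1 or ", n, " comma-separated values, got ", length(values))
  }
  rep_len(values, n)
}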

##### If two datasets are provided then also perform cohorts comparison with mafCompare() function
if ( length(mafFiles) == 2 ) {
  runMafCompare <- TRUE
} else {
  runMafCompare <- FALSE
}

##### Read in PURPLE output files if defined by user
# https://www.bioconductor.org/packages/devel/bioc/vignettes/maftools/inst/doc/oncoplots.html#032_custom_copy-number_table
nonSyn_list <- unlist(strsplit(params$nonSyn_list, split=',', fixed=TRUE))
cn_df <- NULL

if ( params$purple != "none" ){
  purpleFiles <- unlist(strsplit(params$purple, split=',', fixed=TRUE))
  
  for ( i in 1:length(mafFiles) ) {
    
    cn_df <- NULL
    
    ##### Deal with the newer PURPLE output format
    purple_files <- list.files(purpleFiles, pattern = "\\.purple\\.cnv\\.gene\\.tsv$", full.names = TRUE)
    
    if ( length(purple_files) > 0 ) {
      purple_dfs <- purple_files %>% map(~ mutate(read_tsv(.), fname = .))
      purple_df = purple_dfs %>% 
        bind_rows %>% 
        #filter(gene %in% c("KRAS", "TP53")) %>% 
        mutate(sample = fname %>% basename %>% str_replace('.purple.cnv.gene.tsv', '')) %>%
        #mean(x, y) would treat y as the 'trim' argument, so both values are wrapped in c()
        rowwise() %>% mutate(cn = mean(c(minCopyNumber, maxCopyNumber))) %>% 
        select(gene, sample, cn) %>% 
        mutate(
          ampdel = case_when(
            (cn < params$purple_hd) ~ "HD",
            (cn >= params$purple_hd & cn < params$purple_loh) ~ "LOH",
            (cn >= params$purple_loh & cn < params$purple_amp) ~ "NA",
            (cn >= params$purple_amp) ~ "AMP",
            TRUE ~ "NA"
          ))
      
      ##### Remove entries with NAs
      purple_df = purple_df %>% 
        filter(ampdel %!in% "NA")
      cn_df <- purple_df
    }
    
    ##### Also deal with the older PURPLE output format
    purple_files <- list.files(purpleFiles, pattern = "\\.purple\\.gene\\.cnv$", full.names = TRUE)
    
    if ( length(purple_files) > 0 ) {
      purple_dfs <- purple_files %>% map(~ mutate(read_tsv(.), fname = .))
      purple_df = purple_dfs %>% 
        bind_rows %>% 
        #filter(gene %in% c("KRAS", "TP53")) %>% 
        mutate(sample = fname %>% basename %>% str_replace(fixed('.purple.gene.cnv'), '')) %>%
        rowwise() %>% mutate(cn = mean(c(MinCopyNumber, MaxCopyNumber))) %>% 
        select(Gene, sample, cn) %>% 
        mutate(
          ampdel = case_when(
            (cn < params$purple_hd) ~ "HD",
            (cn >= params$purple_hd & cn < params$purple_loh) ~ "LOH",
            (cn >= params$purple_loh & cn < params$purple_amp) ~ "NA",
            (cn >= params$purple_amp) ~ "AMP",
            TRUE ~ "NA"
          ))
      
      ##### Remove entries with NAs
      purple_df = purple_df %>% 
        filter(ampdel %!in% "NA")
      
      colnames(purple_df) <- gsub("Gene", "gene", colnames(purple_df))
      cn_df <- rbind(cn_df, purple_df)
    }
    
    
    # Then feed the result into read.maf
    maf.data <- rbind(mafInfo[[i]]@data, mafInfo[[i]]@maf.silent)
    
    nonSyn_list <- unique(c(nonSyn_list, cn_df$ampdel))
    
    ##### Save maf object without CN results for the tcgaCompare function first
    mafInfo_tcgaCompare[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = NULL)
      
    ##### Add clinical information if provided
    if ( any(!is.na(clinicalInfo[[i]])) ) {
      mafInfo[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, cnTable = cn_df %>% select(Gene = gene, Sample_name = sample, CN = ampdel), clinicalData = clinicalInfo[[i]])
    } else {
      mafInfo[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, cnTable = cn_df %>% select(Gene = gene, Sample_name = sample, CN = ampdel), clinicalData = NULL)
    }
    
    ##### For some reason the read.maf() function adds "Tumor_Sample_barcode" column with "NA"s
    tsb <- as.vector(unlist(unique(mafInfo[[i]]@maf.silent[, "Tumor_Sample_Barcode"])))
    tsb <- tsb[ tsb != "" ]
    
    mafInfo[[i]] <- subsetMaf(maf = mafInfo[[i]], tsb = tsb, genes = NULL, fields = colnames(mafInfo[[i]]@data)[!colnames(mafInfo[[i]]@data) %in% "Tumor_Sample_barcode" ], query = NULL, mafObj = TRUE, includeSyn = TRUE, dropLevels=TRUE)
  }
}
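
##### Illustrative sketch only (hypothetical helpers, not called anywhere in this report).
##### A minimal base-R restatement of the gene-level copy-number logic used above, assuming
##### the default thresholds from the YAML header (hd = 0.5, loh = 1.5, amp = 6). The names
##### "midpointCopyNumberSketch" and "classifyCopyNumberSketch" are made up for illustration.
##### Unlike the case_when() calls above, neutral genes are returned as a real NA rather
##### than the string "NA".
midpointCopyNumberSketch <- function(minCN, maxCN) {
  ##### Element-wise midpoint; note that mean(minCN, maxCN) would NOT do this,
  ##### because the second positional argument of mean() is 'trim'
  (minCN + maxCN) / 2
}

classifyCopyNumberSketch <- function(cn, hd = 0.5, loh = 1.5, amp = 6) {
  status <- rep(NA_character_, length(cn))
  known <- !is.na(cn)
  status[known & cn < hd] <- "HD"                 # homozygous deletion
  status[known & cn >= hd & cn < loh] <- "LOH"    # loss of heterozygosity
  status[known & cn >= amp] <- "AMP"              # amplification
  status
}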

##### Read in CNVkit output files if defined by user
# https://www.bioconductor.org/packages/devel/bioc/vignettes/maftools/inst/doc/oncoplots.html#032_custom_copy-number_table
if ( params$cnvkit != "none" ){
  cnvkitFiles <- unlist(strsplit(params$cnvkit, split=',', fixed=TRUE))
  
  for ( i in 1:length(mafFiles) ) {
    
    cnvkit_files <- list.files(cnvkitFiles, pattern = "-cnvkit-call\\.cns$", full.names = TRUE)

    cnvkit_df = cnvkit_files %>% 
    map(~ mutate(read_tsv(.), fname = .)) %>% 
    bind_rows %>% 
    mutate(sample = fname %>% basename %>% str_replace("-cnvkit-call.cns", "")) %>% 
    select(gene, sample, cn) %>%
    mutate(gene = gene %>% str_split(",")) %>% 
    unnest(gene)

    cnvkit_df = cnvkit_df %>%
      mutate(
        ampdel = case_when(
          (cn < params$cnvkit_hd) ~ "HD",
          (cn >= params$cnvkit_hd & cn < params$cnvkit_loh) ~ "LOH",
          (cn >= params$cnvkit_loh & cn < params$cnvkit_amp) ~ "NA",
          (cn >= params$cnvkit_amp) ~ "AMP",
          TRUE ~ "NA"
        ))
    
    ##### Remove copy-number-neutral entries (labelled "NA")
    cnvkit_df = cnvkit_df %>% 
      filter(ampdel %!in% "NA")
    
    ##### Add PURPLE results if provided
    if ( !is.null( cn_df) ) {
      cn_df <- rbind(cn_df, cnvkit_df)
    } else {
      cn_df <- cnvkit_df
    }
    
    # Then feed the result into read.maf
    maf.data <- rbind(mafInfo[[i]]@data, mafInfo[[i]]@maf.silent)
    
    nonSyn_list <- unique(c(nonSyn_list, cn_df$ampdel))
    
    ##### Save maf object without CN results for the tcgaCompare function first
    mafInfo_tcgaCompare[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = NULL)
    
    ##### Add clinical information if provided
    if ( !is.na(clinicalInfo[[i]]) ) {
      mafInfo[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, cnTable = cn_df %>% select(Gene = gene, Sample_name = sample, CN = ampdel), clinicalData = clinicalInfo[[i]])
    } else {
      mafInfo[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, cnTable = cn_df %>% select(Gene = gene, Sample_name = sample, CN = ampdel), clinicalData = NULL)
    }
    
    ##### For some reason the read.maf() function adds "Tumor_Sample_barcode" column with "NA"s
    tsb <- as.vector(unlist(unique(mafInfo[[i]]@maf.silent[, "Tumor_Sample_Barcode"])))
    tsb <- tsb[ tsb != "" ]
    
    mafInfo[[i]] <- subsetMaf(maf = mafInfo[[i]], tsb = tsb, genes = NULL, fields = colnames(mafInfo[[i]]@data)[!colnames(mafInfo[[i]]@data) %in% "Tumor_Sample_barcode" ], query = NULL, mafObj = TRUE, includeSyn = TRUE, dropLevels=TRUE)
  }
}

##### Read in GISTIC output files if defined by user
runGistic <- FALSE

##### Create a list to store GISTIC info for individual datasets
gisticInfo <- vector("list", length(mafFiles))
names(gisticInfo) <- datasets.list
  
if ( params$gistic != "none" ){
  gisticFiles <- unlist(strsplit(params$gistic, split=',', fixed=TRUE))
  
  for ( i in 1:length(mafFiles) ) {
    
    ##### List required files in the GISTIC output directory provided for this dataset
    gisticInfo[[i]]$all.lesions <- list.files(gisticFiles[i], pattern="all_lesions.conf", all.files=FALSE,
      full.names=TRUE)
    gisticInfo[[i]]$amp.genes <- list.files(gisticFiles[i], pattern="amp_genes.conf", all.files=FALSE,
      full.names=TRUE)
    gisticInfo[[i]]$del.genes <- list.files(gisticFiles[i], pattern="del_genes.conf", all.files=FALSE,
      full.names=TRUE)
    gisticInfo[[i]]$scores.gis <- list.files(gisticFiles[i], pattern="scores.gistic", all.files=FALSE,
      full.names=TRUE)
  
    ##### Check if GISTIC output files exist
    if ( !is.na(gisticInfo[[i]]$all.lesions[1]) && !is.na(gisticInfo[[i]]$amp.genes[1]) && !is.na(gisticInfo[[i]]$del.genes[1]) && !is.na(gisticInfo[[i]]$scores.gis[1]) ) {
      
      runGistic <- TRUE
      gisticInfo[[i]]$status <- TRUE
      
      gisticInfo[[i]]$summary = readGistic(gisticAllLesionsFile = gisticInfo[[i]]$all.lesions, gisticAmpGenesFile = gisticInfo[[i]]$amp.genes, gisticDelGenesFile = gisticInfo[[i]]$del.genes, gisticScoresFile = gisticInfo[[i]]$scores.gis)
      
      nonSyn_list <- unique(c(nonSyn_list, gisticInfo[[i]]$summary@classCode[ gisticInfo[[i]]$summary@classCode != "" ]))
    
    } else {
      gisticInfo[[i]]$status <- FALSE
    }
  }
} else {
  for ( i in 1:length(mafFiles) ) {
      gisticInfo[[i]]$status <- FALSE
  }
}

##### Change the column to be used to indicate samples' IDs
if ( !is.null(params$samples_id_cols) ) {
  samples_id_cols <- make.names(unlist(strsplit(params$samples_id_cols, split=',', fixed=TRUE)))
  
  for ( i in 1:length(mafFiles) ) {
    
    ##### Check if the user-defined column name exists
    if ( samples_id_cols[i] %in% names(mafInfo[[i]]@data) && samples_id_cols[i] != "Tumor_Sample_Barcode" ) {
      
      ##### Rename the user-defined column with samples' IDs to "Tumor_Sample_Barcode". If a column with this name already exists, it is renamed to "Tumor_Sample_Barcode.orig"
      maf.data <- rbind(mafInfo[[i]]@data, mafInfo[[i]]@maf.silent)
      
      if ( samples_id_cols[i] %in% names(maf.data) ) {
        names(maf.data) <- gsub("Tumor_Sample_Barcode", "Tumor_Sample_Barcode.orig", names(maf.data))
        names(maf.data) <- gsub(samples_id_cols[i], "Tumor_Sample_Barcode", names(maf.data))
      } else {
        cat(paste0("\nColumn \"", samples_id_cols[i], "\" does not exist in MAF file ", mafFiles[i], "!\n\n"))
      }
      
      ##### Now read the data with changed column names as a maf object
      if ( gisticInfo[[i]]$status ) {
        
        ##### Add clinical information if provided
        if ( !is.na(clinicalInfo[[i]]) ){
          mafInfo[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, gisticAllLesionsFile = gisticInfo[[i]]$all.lesions, gisticAmpGenesFile = gisticInfo[[i]]$amp.genes, gisticDelGenesFile = gisticInfo[[i]]$del.genes, gisticScoresFile = gisticInfo[[i]]$scores.gis, verbose = FALSE, clinicalData = clinicalInfo[[i]])
        } else {
          mafInfo[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, gisticAllLesionsFile = gisticInfo[[i]]$all.lesions, gisticAmpGenesFile = gisticInfo[[i]]$amp.genes, gisticDelGenesFile = gisticInfo[[i]]$del.genes, gisticScoresFile = gisticInfo[[i]]$scores.gis, verbose = FALSE, clinicalData = NULL)
        }
        
        ##### Save maf object without GISTIC results as well for the tcgaCompare function
        mafInfo_tcgaCompare[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = NULL)
        
      } else {
        
        ##### Add clinical information if provided
        if ( !is.na(clinicalInfo[[i]]) ){
          mafInfo[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = clinicalInfo[[i]])
        } else {
          mafInfo[[i]] <- maftools::read.maf(maf = maf.data, vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = NULL)
        }
      }
    }
  }
} else {
  
  ##### Add GISTIC results if defined by user
  for ( i in 1:length(mafFiles) ) {
    
    if ( gisticInfo[[i]]$status ) {
      
      ##### Add clinical information if provided
      if ( !is.na(clinicalInfo[[i]]) ){
        mafInfo[[i]] <- maftools::read.maf(maf = mafFiles[i], vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, gisticAllLesionsFile = gisticInfo[[i]]$all.lesions, gisticAmpGenesFile = gisticInfo[[i]]$amp.genes, gisticDelGenesFile = gisticInfo[[i]]$del.genes, gisticScoresFile = gisticInfo[[i]]$scores.gis, verbose = FALSE, clinicalData = clinicalInfo[[i]])
      } else {
        mafInfo[[i]] <- maftools::read.maf(maf = mafFiles[i], vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, gisticAllLesionsFile = gisticInfo[[i]]$all.lesions, gisticAmpGenesFile = gisticInfo[[i]]$amp.genes, gisticDelGenesFile = gisticInfo[[i]]$del.genes, gisticScoresFile = gisticInfo[[i]]$scores.gis, verbose = FALSE)
      }
      
      ##### Save maf object without GISTIC results as well for the tcgaCompare function
      mafInfo_tcgaCompare[[i]] <- maftools::read.maf(maf = mafFiles[i], vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE, clinicalData = NULL)
    }
  }
}

##### Create directory for output files (check the full output path rather than the folder name alone)
outDir <- paste(params$maf_dir, params$out_folder, "results", sep = "/")
if ( !file.exists(outDir) ){
  dir.create(outDir, recursive=TRUE)
}

##### Read in list of genes of interest if specified
goi <- vector("list", length(mafFiles))
names(goi) <- datasets.list

if ( params$genes_list != "none" ){
  genes_lists <- unlist(strsplit(params$genes_list, split=',', fixed=TRUE))
  
  for ( i in 1:length(mafFiles) ) {
    goi[[i]] <- unique(read.table(genes_lists[i], sep="\t", as.is=TRUE, header=FALSE, row.names=NULL)[,1])
  }
}
```
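
The copy-number categories assigned when reading CNVkit (or PURPLE) calls can be illustrated with a minimal sketch. It assumes the default cut-offs from the report parameters (`cnvkit_hd` = 0.5, `cnvkit_loh` = 1.5, `cnvkit_amp` = 6). The helper `classify_cn` and the example table are hypothetical and are not part of the report code, which uses `dplyr::case_when` for the same purpose.

```r
# Hypothetical helper (illustration only): label copy-number values with the
# same categories as the report, using base R instead of dplyr::case_when.
classify_cn <- function(cn, hd = 0.5, loh = 1.5, amp = 6) {
  out <- rep("NA", length(cn))                 # the string "NA" marks segments without a call
  out[which(cn < hd)] <- "HD"                  # homozygous deletion
  out[which(cn >= hd & cn < loh)] <- "LOH"     # loss of heterozygosity
  out[which(cn >= amp)] <- "AMP"               # amplification
  out
}

# Made-up example values, not taken from any dataset
cn_example <- data.frame(gene = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
                         cn = c(0.2, 1.1, 2.0, 8.0),
                         stringsAsFactors = FALSE)
cn_example$ampdel <- classify_cn(cn_example$cn)

# Keep only entries with a call, as done before the table is passed to read.maf()
cn_calls <- cn_example[cn_example$ampdel != "NA", ]
```

Only the rows labelled `HD`, `LOH` or `AMP` are kept, so copy-neutral genes never reach the `cnTable` argument.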

```{r include_exclude_samples, comment=NA, message=FALSE, warning=FALSE, results='hide'}
##### Include user-defined sample(s) for the analysis
if ( params$samples_list != "none" ) {
  for ( i in 1:length(mafFiles) ) {
      
    ##### Read in list of samples to be included
    inclsamples.df <- read.table(params$samples_list, sep="\t", as.is=TRUE, header=TRUE, row.names=NULL)
    samples2keep <- unique(inclsamples.df[,"Tumor_Sample_Barcode"])
    
    ##### Subset the maf object to include only user-defined sample(s)
    ##### Initially don't save the subset output as maf object, as this will exclude samples with no non-synonymous mutations from the summary
    mafInfo[[i]] <- subsetMaf(maf = mafInfo[[i]], tsb = as.vector(samples2keep), genes = NULL, fields = NULL, query = NULL, mafObj = FALSE, includeSyn = TRUE, dropLevels=TRUE)
    
    ##### Now read the data subset as a maf object
    if ( gisticInfo[[i]]$status ) {
      mafInfo[[i]] <- maftools::read.maf(maf = mafInfo[[i]], vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, gisticAllLesionsFile = gisticInfo[[i]]$all.lesions, gisticAmpGenesFile = gisticInfo[[i]]$amp.genes, gisticDelGenesFile = gisticInfo[[i]]$del.genes, gisticScoresFile = gisticInfo[[i]]$scores.gis, verbose = FALSE)
      
    } else {
      mafInfo[[i]] <- maftools::read.maf(maf = mafInfo[[i]], vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE)
    }
  }
}

if ( params$samples_blacklist != "none" ) {
  for ( i in 1:length(mafFiles) ) {
      
    ##### Read in list of samples to be excluded and convert to char vector
    exclsamples.df <- read.table(params$samples_blacklist, sep="\t", as.is=TRUE, header=TRUE, row.names=NULL)
    exclsamples_char <- unique(exclsamples.df$Tumor_Sample_Barcode)
    
    tumor_sample_barcodes <- mafInfo[[i]]@data$Tumor_Sample_Barcode
    unique_tumor_barcodes <- unique(tumor_sample_barcodes)
    samples2keep <- unique_tumor_barcodes[!unique_tumor_barcodes %in% exclsamples_char]
    
    #samples2keep <- unlist(unique( mafInfo[[i]]@data[, "Tumor_Sample_Barcode"])[ unique(mafInfo[[i]]@data[, "Tumor_Sample_Barcode"]) %!in% exclsamples_char ])
  
    ##### Subset the maf object to exclude user-defined sample(s)
    ##### Initially don't save the subset output as maf object, as this will exclude samples with no non-synonymous mutations from the summary
    #mafInfo[[i]] <- filterMaf(maf = mafInfo[[i]], tsb = exclsamples_char, mafObj = FALSE)
    mafInfo[[i]] <- subsetMaf(maf = mafInfo[[i]], tsb = as.vector(samples2keep), genes = NULL, fields = NULL, query = NULL, mafObj = FALSE, includeSyn = TRUE, dropLevels=TRUE)
    
    ##### Now read the data subset as a maf object
    if ( gisticInfo[[i]]$status ) {
      
      mafInfo[[i]] <- maftools::read.maf(maf = mafInfo[[i]], vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, gisticAllLesionsFile = gisticInfo[[i]]$all.lesions, gisticAmpGenesFile = gisticInfo[[i]]$amp.genes, gisticDelGenesFile = gisticInfo[[i]]$del.genes, gisticScoresFile = gisticInfo[[i]]$scores.gis, verbose = FALSE)
      
    } else {
      mafInfo[[i]] <- maftools::read.maf(maf = mafInfo[[i]], vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE)
    }
  }
}
```
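
The sample blacklist step reduces to a set difference between the barcodes present in the MAF and the barcodes listed for exclusion. A minimal sketch with made-up barcodes (none of these names come from the report or its data):

```r
# Illustrative barcodes only
all_samples <- c("S1", "S2", "S3", "S4", "S2")   # barcodes as they appear in the MAF (with repeats)
blacklist   <- c("S2", "S4", "S9")               # S9 is not in the MAF and is simply ignored

# Same logic as the report: unique barcodes that are not blacklisted
unique_samples <- unique(all_samples)
samples2keep   <- unique_samples[!unique_samples %in% blacklist]
```

The resulting vector is what gets passed as `tsb` to `subsetMaf()`.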

```{r exclude_genes, comment=NA, message=FALSE, warning=FALSE}
##### Exclude user-defined gene(s) from the analysis
if ( params$genes_blacklist != "none" ) {
  for ( i in 1:length(mafFiles) ) {
    
    ##### Read in list of genes to be excluded
    exclgenes <- unique(read.table(params$genes_blacklist, sep="\t", as.is=TRUE, header=FALSE, row.names=NULL)[,1])
    
    genes2keep <- unique( mafInfo[[i]]@data$Hugo_Symbol)[ unique( mafInfo[[i]]@data$Hugo_Symbol) %!in% exclgenes ]
  
    ##### Subset the maf object to exclude user-defined gene(s)
    ##### Initially don't save the subset output as maf object, as this will exclude samples with no non-synonymous mutations from the summary
    mafInfo[[i]] <- subsetMaf(maf = mafInfo[[i]], tsb = NULL, genes = genes2keep, fields = NULL, query = NULL, mafObj = FALSE, includeSyn = TRUE)
    
    ##### Now read the data subset as a maf object
    if ( gisticInfo[[i]]$status ) {
      
      mafInfo[[i]] <- maftools::read.maf(maf = mafInfo[[i]], vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, gisticAllLesionsFile = gisticInfo[[i]]$all.lesions, gisticAmpGenesFile = gisticInfo[[i]]$amp.genes, gisticDelGenesFile = gisticInfo[[i]]$del.genes, gisticScoresFile = gisticInfo[[i]]$scores.gis, verbose = FALSE)
      
    } else {
      mafInfo[[i]] <- maftools::read.maf(maf = mafInfo[[i]], vc_nonSyn = nonSyn_list, removeDuplicatedVariants = params$remove_duplicated_variants, verbose = FALSE)
    }
  }
}
```

```{r silent_variants, comment = NA, message=FALSE, warning=FALSE}
##### Identify and record samples with no non-synonymous mutations
##### Prepare list to store all samples and samples with > 0 non-synonymous variants
MAF_samples <- vector("list", length(datasets.list))
names(MAF_samples) <- datasets.list
MAF_samples.silent.df <- NULL

##### Loop through MAF files
for ( i in 1:length(mafFiles) ) {
  
  ##### Identify samples with no non-synonymous variants according to corresponding MAF file
  MAF_samples[[i]]$all <- unlist(unique(mafInfo[[i]]@maf.silent[, "Tumor_Sample_Barcode"]))
  MAF_samples[[i]]$nonsyn <- unlist(maftools::getSampleSummary(mafInfo[[i]])[, "Tumor_Sample_Barcode"])
  MAF_samples[[i]]$silent <-  MAF_samples[[i]]$all[ MAF_samples[[i]]$all %!in% MAF_samples[[i]]$nonsyn  ]
  
  ##### Check if there are any samples with no non-synonymous variants. If so, add them to data frame
  if ( length(MAF_samples[[i]]$silent) > 0 ) {
    for ( sample in MAF_samples[[i]]$silent ) {
      
      MAF_samples.silent.df <- rbind( MAF_samples.silent.df, cbind( datasets.list[i], sample))
    }
    colnames(MAF_samples.silent.df) <- c("Dataset", "Sample")
  }
}

##### List silent variants classifications
silent_categories <- NULL

for ( i in 1:length(mafFiles) ) {
  silent_categories <- unique( c(silent_categories, mafInfo[[i]]@maf.silent$Variant_Classification) )
}
```
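
How samples with only silent variants are identified can be sketched on a toy variant table. The table, the barcodes and the shortened list of non-synonymous classes below are assumptions made for illustration; the report itself derives these sets from the `maftools` MAF object.

```r
# Toy variant table (illustration only)
variants <- data.frame(
  Tumor_Sample_Barcode   = c("S1", "S1", "S2", "S3"),
  Variant_Classification = c("Missense_Mutation", "Silent", "Silent", "Intron"),
  stringsAsFactors = FALSE
)

# Shortened, hypothetical list of non-synonymous classes
nonsyn_classes <- c("Missense_Mutation", "Nonsense_Mutation")

# Samples with at least one non-synonymous variant
is_nonsyn      <- variants$Variant_Classification %in% nonsyn_classes
samples_nonsyn <- unique(variants$Tumor_Sample_Barcode[is_nonsyn])

# Samples in which every variant is silent
samples_silent_only <- setdiff(unique(variants$Tumor_Sample_Barcode), samples_nonsyn)
```

Here `S2` and `S3` carry silent variants only, so they would be listed separately and left out of the summary tables and plots.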

***

## Summary

### Tables {.tabset}

#### Overall summary

Table(s) with basic information about each dataset based on data in corresponding MAF file(s).

```{r overll_summary, comment = NA, message=FALSE, warning=FALSE}
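##### Create the directory for summary files (the dataset name is used below as a file name prefix)
dir.create(paste(outDir, "summary", sep="/"), recursive = TRUE, showWarnings = FALSE)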

##### Write overall summary into a file
for ( i in 1:length(mafFiles) ) {
  write.mafSummary(maf = mafInfo[[i]], basename = paste(outDir, "summary", datasets.df[[i]], sep="/"))
}

##### Assign different colour to individual datasets
datasets.colour <- getDatasetsColours(datasets.list)

##### Create a new workbook
wb <- createWorkbook("MAF_summary.xlsx")

##### Add worksheets, one for each dataset
for ( i in 1:length(mafFiles) ) {
    addWorksheet(wb, substring(datasets.list[i], 0, 31), tabColour = datasets.colour[[1]][i])
    writeData(wb, sheet = i, mafInfo[[i]]@summary)
}

saveWorkbook(wb, paste(outDir, "MAF_summary.xlsx", sep="/"), overwrite = TRUE)

##### Present a MAF file summary table in the html report
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable( data = mafInfo[[i]]@summary, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "none", extensions = 'Buttons', options = list(pageLength = nrow(mafInfo[[i]]@summary), dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis')), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(mafInfo[[i]]@summary), 'text-align' = 'center' ) %>%
        DT::formatRound( columns = "Mean", 2)
}

##### Print a list of htmlwidgets
widges.list

##### Add extra lines to make sure that this section doesn't overlap with the next one
cat("\n\n\n")
```

```{r maf_fields, comment = NA, message=FALSE, warning=FALSE}
##### Additionally create an excel spreadsheet listing all fields (columns) in the individual MAF files
##### Create a new workbook
wb <- createWorkbook("MAF_fields.xlsx")

##### Add worksheets, one for each dataset
for ( i in 1:length(mafFiles) ) {
    addWorksheet(wb, substring(datasets.list[i], 0, 31), tabColour = datasets.colour[[1]][i])
    writeData(wb, sheet = i, maftools::getFields(mafInfo[[i]]))
}

saveWorkbook(wb, paste(outDir, "MAF_fields.xlsx", sep="/"), overwrite = TRUE)
```

***

#### Samples summary {.tabset}

##### Samples with non-synonymous variant(s)

Table(s) summarising samples in individual datasets. Each table contains per-sample information (rows) about *number of different types of mutations* (columns), as well as the *total number of mutations* reported in corresponding MAF file. <span style="color:#ff0000">NOTE</span>, only samples with detected **non-synonymous variant(s)** are reported in the table below.

```{r sample_summary, comment = NA, message=FALSE, warning=FALSE}
##### Write samples summary into a file
##### Create a new workbook
wb <- createWorkbook("MAF_sample_summary.xlsx")

##### Add worksheets, one for each dataset
for ( i in 1:length(mafFiles) ) {
    addWorksheet(wb, substring(datasets.list[i], 0, 31), tabColour = datasets.colour[[1]][i])
    writeData(wb, sheet = i, maftools::getSampleSummary(mafInfo[[i]]))
}

saveWorkbook(wb, paste(outDir, "MAF_sample_summary.xlsx", sep="/"), overwrite = TRUE)

##### Present a sample table in the html report
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable( data = maftools::getSampleSummary(mafInfo[[i]]), caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(maftools::getSampleSummary(mafInfo[[i]])), 'text-align' = 'center' )
}

##### Print a list of htmlwidgets
widges.list
```

`r if ( !is.null(MAF_samples.silent.df) ) { c("***") }`

`r if ( !is.null(MAF_samples.silent.df) ) { c("##### Samples with no non-synonymous variants") }`

`r if ( !is.null(MAF_samples.silent.df) ) { c("The table below lists sample(s) in which **no non-synonymous variants** were detected and hence will not be included in the summary tables/plots.") }`

```{r sample_no_nonsynonymous, comment = NA, message=FALSE, warning=FALSE}
##### report samples with no non-synonymous variants according to corresponding MAF file
if ( !is.null(MAF_samples.silent.df) ) {
  DT::datatable( data = MAF_samples.silent.df, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong("Samples with no non-synonymous variants detected")), filter = "top", extensions = 'Buttons', options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(MAF_samples.silent.df), 'text-align' = 'center' )
}

##### Add extra lines to make sure that this section doesn't overlap with the next one
cat("\n\n\n")
```


`r if ( params$samples_blacklist != "none" ) { c("***") }`

`r if ( params$samples_blacklist != "none" ) { c("##### Excluded samples") }`

`r if ( params$samples_blacklist != "none" ) { c("List of samples excluded from the analysis.") }`

```{r excluded_samples_table, comment = NA, message=FALSE, warning=FALSE}
##### Present a samples table in the html report
if ( params$samples_blacklist != "none" ) {
  DT::datatable(data = exclsamples.df, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;'), filter = "top", extensions = c('Buttons','FixedColumns','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, fixedColumns = list(leftColumns = 1), deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
      DT::formatStyle( columns = names(exclsamples.df), 'text-align' = 'center' )
}
```

`r if ( params$clinical_info != "none" ) { c("***") }`

`r if ( params$clinical_info != "none" ) { c("#### Samples annotation") }`

`r if ( params$clinical_info != "none" ) { c("Sample annotations for individual dataset(s).") }`

```{r samples_annot_table, comment = NA, message=FALSE, warning=FALSE}
##### Present a samples table in the html report
if ( params$clinical_info != "none" ) {
  
  ##### Create a list for htmlwidgets
  widges.list <- htmltools::tagList()

  for ( i in 1:length(mafFiles) ) {
    widges.list[[i]] <- DT::datatable( data = mafInfo[[i]]@clinical.data, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
      DT::formatStyle( columns = names(mafInfo[[i]]@clinical.data), 'text-align' = 'center' )
  }
  
  ##### Print a list of htmlwidgets
  widges.list
}
```

***

#### Genes summary {.tabset}

`r if ( params$genes_list != "none" ) { c("##### Genes of interest") }`

`r if ( params$genes_list != "none" ) { c("Table(s) summarising ***genes of interest*** in individual datasets. Each table contains per-gene information (rows) about *number of different types of mutations* (columns), as well as the *total number of mutations* reported in corresponding MAF file. The last two columns contain the *number of samples with mutations/alterations* in the corresponding gene.") }`

`r if ( params$genes_list != "none" ) { c("<span style=\"color:#ff0000\">NOTE</span>: Only genes of interest with **non-synonymous variants** are presented in the table. Expand buttons below to see the full list of genes of interest and those which have no non-synonymous variants detected.") }`

`r if ( params$genes_list != "none" ) { c("<details>") }`
`r if ( params$genes_list != "none" ) { c("<summary>Full list of genes of interest</summary>") }`
`r if ( params$genes_list != "none" ) { c("<font size=\"2\">") }`
`r if ( params$genes_list != "none" ) { c(paste0("*", goi[[i]], "*")) }`
`r if ( params$genes_list != "none" ) { c("</font>") }`
`r if ( params$genes_list != "none" ) { c("</details>") }`

`r if ( params$genes_list != "none" ) { c("<details>") }`
`r if ( params$genes_list != "none" ) { c("<summary>Genes of interest with no non-synonymous variants</summary>") }`
`r if ( params$genes_list != "none" ) { c("<font size=\"2\">") }`
`r if ( params$genes_list != "none" ) { c(paste0("*", goi[[i]][ goi[[i]] %!in% maftools::getGeneSummary(mafInfo[[i]])[ maftools::getGeneSummary(mafInfo[[i]])$Hugo_Symbol %in% goi[[i]], ]$Hugo_Symbol ], "*")) }`
`r if ( params$genes_list != "none" ) { c("</font>") }`
`r if ( params$genes_list != "none" ) { c("</details>") }`

```{r goi_table, comment = NA, message=FALSE, warning=FALSE}
##### Present a gene table in the html report (only if genes of interest were provided)
if ( params$genes_list != "none" ) {
  
  ##### Create a list for htmlwidgets
  widges.list <- htmltools::tagList()
  
  for ( i in 1:length(mafFiles) ) {
    widges.list[[i]] <- DT::datatable(data = maftools::getGeneSummary(mafInfo[[i]])[ maftools::getGeneSummary(mafInfo[[i]])$Hugo_Symbol %in% goi[[i]], ], caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','FixedColumns','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, fixedColumns = list(leftColumns = 2), deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
          DT::formatStyle( columns = names(maftools::getGeneSummary(mafInfo[[i]])), 'text-align' = 'center' )
  }
  
  ##### Print a list of htmlwidgets
  widges.list
}
```

`r if ( params$genes_list != "none" ) { c("***") }`

##### Mutated genes

Table(s) summarising mutated genes in individual datasets. Each table contains per-gene information (rows) about *number of different types of mutations* (columns), as well as the *total number of mutations* reported in corresponding MAF file. The last two columns contain the *number of samples with mutations/alterations* in the corresponding gene.

```{r gene_summary, comment = NA, message=FALSE, warning=FALSE}
##### Write gene summary into a file
##### Create a new workbook
wb <- createWorkbook("MAF_gene_summary.xlsx")

##### Add worksheets, one for each dataset
for ( i in 1:length(mafFiles) ) {
    addWorksheet(wb, substring(datasets.list[i], 0, 31), tabColour = datasets.colour[[1]][i])
    writeData(wb, sheet = i, maftools::getGeneSummary(mafInfo[[i]]))
}

saveWorkbook(wb, paste(outDir, "MAF_gene_summary.xlsx", sep="/"), overwrite = TRUE)

##### Present a gene table in the html report
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable(data = maftools::getGeneSummary(mafInfo[[i]]), caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','FixedColumns','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, fixedColumns = list(leftColumns = 2), deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(maftools::getGeneSummary(mafInfo[[i]])), 'text-align' = 'center' )
}

##### Print a list of htmlwidgets
widges.list
```

***

`r if ( params$genes_blacklist != "none" ) { c("##### Excluded genes") }`

`r if ( params$genes_blacklist != "none" ) { c("List of genes excluded from the analysis.") }`

```{r excluded_genes_table, comment = NA, message=FALSE, warning=FALSE}
##### Present a gene table in the html report
if ( params$genes_blacklist != "none" ) {
  DT::datatable(data = data.frame(Gene = exclgenes), caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;'), filter = "top", extensions = c('Buttons','FixedColumns','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
    DT::formatStyle( columns = "Gene", 'text-align' = 'center' )
}
```

`r if ( params$genes_blacklist != "none" ) { c("***") }`

```{r top_genes_threshold, comment = NA, message=FALSE, warning=FALSE}
##### Check if the user-defined number of genes to present is not higher than the number of genes with non-synonymous mutations reported in individual MAFs
top_genes_no <- vector("list", length(mafFiles))
names(top_genes_no) <- datasets.list

mut_freq <- vector("list", length(mafFiles))
names(mut_freq) <- datasets.list

##### Set height for oncoplots
oncoplot_height <-  vector("list", length(mafFiles))
names(oncoplot_height) <- datasets.list

for ( i in 1:length(mafFiles) ) {
  
  ##### Get mutation frequency for each gene
  mut_freq[[i]] <- round(getGeneSummary(mafInfo[[i]])$MutatedSamples/as.numeric(mafInfo[[i]]@summary[3,"summary"])*100, digits=0)
  names(mut_freq[[i]]) <- getGeneSummary(mafInfo[[i]])$Hugo_Symbol
  
  ##### Get the number of genes that have mutations in at least the defined minimal percentage of samples (set with the genes_min parameter)
  top_genes_no[[i]] <- length( mut_freq[[i]][ mut_freq[[i]] >= as.numeric(genes_min[[i]]) ] )
  
  ##### Set height for oncoplots
  if ( top_genes_no[[i]] > 1000 ) {
    oncoplot_height[[i]] <- 3000
  } else if ( top_genes_no[[i]] > 100 ) {
    oncoplot_height[[i]] <- 100*top_genes_no[[i]]/3
  } else if ( top_genes_no[[i]] > 19 ) {
    oncoplot_height[[i]] <- 100*top_genes_no[[i]]/2
  } else if ( top_genes_no[[i]] > 10 ) {
    oncoplot_height[[i]] <- 100*top_genes_no[[i]]/1.5
  } else {
    oncoplot_height[[i]] <- 700
  }
}

##### Define colours for each Variant_Classification if the default list of non-synonymous variants is not used (from http://bioconductor.org/packages/devel/bioc/vignettes/maftools/inst/doc/oncoplots.html#02_changing_colors_for_variant_classifications and "get_vcColors" function in https://rdrr.io/github/PoisonAlien/maftools/src/R/oncomatrix.R)
#if ( !all(nonSyn_list %in% c( "Frame_Shift_Del","Frame_Shift_Ins","Splice_Site","Translation_Start_Site","Nonsense_Mutation","Nonstop_Mutation","In_Frame_Del","In_Frame_Ins","Missense_Mutation" ) ) ) {
  
  ##### Get the default colours first
  vc_cols = c(RColorBrewer::brewer.pal(11, name = "Paired"), RColorBrewer::brewer.pal(11,name = "Spectral")[1:3],'#000000', '#ee82ee', '#ee82ee', '#4169e1', '#27408b', '#87ceeb', '#9acd32', '#8b008b', '#6e6456', '#535c68')
  #vc_cols = grDevices::adjustcolor(col = vc_cols, alpha.f = 1)
  names(vc_cols) = c('Frame_Shift_Del','Missense_Mutation','Intron','Silent','Frame_Shift_Ins','In_Frame_Ins','Splice_Site','In_Frame_Del','Nonsense_Mutation','Nonstop_Mutation','IGR','ITD','RNA','Translation_Start_Site',"Multi_Hit", 'Amp', 'AMP', 'Del', 'HD', 'LOH', 'Fusion', 'SV', 'Complex_Event', 'pathway')
  
  ##### Correct colour for "In_Frame_Del"
  #vc_cols[ names(vc_cols) == "In_Frame_Del"] <- "#976726"
  
  ##### Correct colour for "In_Frame_Ins"
  #vc_cols[ names(vc_cols) == "In_Frame_Ins"] <- "#7d7d7d"
  
  ##### ... now remove unwanted variant types
  #vc_cols <- vc_cols[ c(names(vc_cols)) %in% c( "Frame_Shift_Del","Frame_Shift_Ins","Splice_Site","Translation_Start_Site","Nonsense_Mutation","Nonstop_Mutation","In_Frame_Del","In_Frame_Ins","Missense_Mutation","Multi_Hit" ) ]
  
  ##### ... add colours for additional variant types
  vc_missing <- nonSyn_list[ nonSyn_list %!in% names(vc_cols) ]
  
  if ( length(vc_missing) > 0 ) {
    vc_missing_cols = RColorBrewer::brewer.pal(11,name = "Spectral")[4:(3+length(vc_missing))]
    names(vc_missing_cols) = vc_missing
    
    ##### Combine colours for default and additional variant types
    vc_cols <- c(vc_cols, vc_missing_cols)
  }
#} else {
#  vc_cols = NULL
#}
 
##### Define colours for samples annotation
clinicalFeatures_cols <- NULL

for ( i in 1:length(mafFiles) ) {
  
  groups <- unlist(unique(as.data.frame(mafInfo[[i]]@clinical.data)[ clinicalFeatures[[i]] ]))

  if ( length(groups) < 3 ) {
    clinicalFeatures_cols[[i]] <- c("#1F78B4","#E31A1C")
    clinicalFeatures_cols[[i]]  <- clinicalFeatures_cols[[i]][ 1:length(groups) ]
    names(clinicalFeatures_cols[[i]]) <- groups
    clinicalFeatures_cols[[i]] <- list(clinicalFeatures_cols[[i]])
    names(clinicalFeatures_cols[[i]]) <- clinicalFeatures[[i]]
  } else {
    clinicalFeatures_cols[[i]] <- c("#1F78B4","#E31A1C", RColorBrewer::brewer.pal(length(groups),name = "Accent"))
    clinicalFeatures_cols[[i]]  <- clinicalFeatures_cols[[i]][ 1:length(groups) ]
    names(clinicalFeatures_cols[[i]]) <- groups
    clinicalFeatures_cols[[i]] <- list(clinicalFeatures_cols[[i]])
    names(clinicalFeatures_cols[[i]]) <- clinicalFeatures[[i]]
  }
}
```
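
The gene threshold and oncoplot height logic can be sketched as follows. The gene counts, the cohort size and the 20% cut-off are made-up values, and `oncoplot_height_for` is a hypothetical helper that mirrors the `if`/`else` chain in the report rather than a function defined in it.

```r
# Made-up number of mutated samples per gene in a cohort of 10 samples
mutated_samples <- c(GENE_A = 9, GENE_B = 7, GENE_C = 2, GENE_D = 1)
n_samples <- 10

# Percentage of samples with a mutation in each gene
mut_freq <- round(mutated_samples / n_samples * 100, digits = 0)

# Number of genes mutated in at least genes_min percent of samples
genes_min <- 20                      # illustrative value, not the report default
top_genes_no <- sum(mut_freq >= genes_min)

# Hypothetical helper reproducing the height rules used for oncoplots
oncoplot_height_for <- function(n_genes) {
  if (n_genes > 1000) {
    3000
  } else if (n_genes > 100) {
    100 * n_genes / 3
  } else if (n_genes > 19) {
    100 * n_genes / 2
  } else if (n_genes > 10) {
    100 * n_genes / 1.5
  } else {
    700
  }
}

plot_height <- oncoplot_height_for(top_genes_no)
```

With few genes the height stays at the fixed minimum, and above 1000 genes it is capped, so the plot height only scales with the gene count in between.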

```{r goi, comment = NA, message=FALSE, warning=FALSE}
##### Deal with additional genes of interest (if specified)
##### Check if the user-defined genes of interest are already present within the top most frequently mutated genes
if ( params$genes_list != "none" ) {
  
  genes_list_goi <- vector("list", length(mafFiles))
  names(genes_list_goi) <- datasets.list
  
  genes_list_goi.silent <- vector("list", length(mafFiles))
  names(genes_list_goi.silent) <- datasets.list
  
  genes_list_goi.nonsyn <- vector("list", length(mafFiles))
  names(genes_list_goi.nonsyn) <- datasets.list
  
  goi_absent <- vector("list", length(mafFiles))
  names(goi_absent) <- datasets.list
  
  top_genes_goi <- vector("list", length(mafFiles))
  names(top_genes_goi) <- datasets.list
  
  top_genes <- vector("list", length(mafFiles))
  names(top_genes) <- datasets.list
  
  ##### Set height for oncoplots
  oncoplot_goi_height <-  vector("list", length(mafFiles))
  names(oncoplot_goi_height) <- datasets.list
  
  for ( i in 1:length(mafFiles) ) {
    
    ##### Extract the user-defined number of top mutated genes
    top_genes[[i]] <- maftools::getGeneSummary(mafInfo[[i]])$Hugo_Symbol[1:top_genes_no[[i]]]
    
    ##### Record genes of interest that are not present in the MAF file
    goi_absent[[i]] <- goi[[i]][ goi[[i]] %!in% maftools::getGeneSummary(mafInfo[[i]])$Hugo_Symbol ]
    goi_absent[[i]] <- goi_absent[[i]][ goi_absent[[i]] %!in% mafInfo[[i]]@maf.silent$Hugo_Symbol ]
    
    genes_list_goi.silent[[i]] <- goi[[i]][ goi[[i]] %in% mafInfo[[i]]@maf.silent$Hugo_Symbol ]
    
    ##### Keep the genes of interest and most frequently mutated genes separately
    top_genes_goi[[i]] <- goi[[i]][ goi[[i]] %in% top_genes[[i]] ]
    
    ##### Remove genes absent in MAF from the genes of interest list
    genes_list_goi[[i]] <- goi[[i]][ goi[[i]] %!in% goi_absent[[i]] ]
    
    ##### Keep genes with non-synonymous mutations
    genes_list_goi.nonsyn[[i]] <- genes_list_goi[[i]][ genes_list_goi[[i]] %in% mafInfo[[i]]@data$Hugo_Symbol ]
    
    ##### Set height for oncoplots
    if ( length(goi[[i]]) > 1000 ) {
      oncoplot_goi_height[[i]] <- 3000
    } else if ( length(goi[[i]]) > 100 ) {
      oncoplot_goi_height[[i]] <- 100*length(goi[[i]])/3
    } else if  ( length(goi[[i]]) > 19 ) {
      oncoplot_goi_height[[i]] <- 100*length(goi[[i]])/2
    } else if  ( length(goi[[i]]) > 10 ) {
      oncoplot_goi_height[[i]] <- 100*length(goi[[i]])/1.5
    } else {
      oncoplot_goi_height[[i]] <- 700
    }
  }
}

##### Add extra lines to make sure that this section doesn't overlap with the next one
cat("\n\n\n")
```

### Plots {.tabset}

#### Summary

A per-MAF file summary including *frequency of various mutation/SNV types/classes* (top panel), the *number of variants in each sample* as a stacked bar-plot (bottom-left) and *variant types* as a box-plot (bottom-middle), as well as the *frequency of different mutation types* for the **`r sum(unlist(top_genes_no))`** most frequently mutated genes (mutated in at least `r gsub(",", "% and ", params$genes_min)`% of patients, respectively) (bottom-right). The horizontal dashed line in the stacked bar-plot represents the median number of variants across the dataset.

```{r maf_summary_plot, comment = NA, message=FALSE, warning=FALSE, fig.width = 9, fig.height = 10, results="asis"}
###### Generate separate plot for each dataset
for ( i in 1:length(mafFiles) ) {

  cat(paste("\n\n <b>", datasets.list[i], "</b>\n\n", sep=" "))
  
  ##### Plotting MAF summary
  par(mar=c(4,4,2,0.5), oma=c(1.5,2,2,1))
  maftools::plotmafSummary(maf = mafInfo[[i]], top = top_genes_no[[i]], rmOutlier = TRUE, addStat = 'median', dashboard = TRUE, titvRaw = FALSE, color = vc_cols)
  mtext("MAF summary", outer=TRUE,  cex=1, line=-0.5)
  
  cat("<br/><br/>")
}
```

***

#### Oncoplot {.tabset}

##### Recurrently mutated genes

Oncoplot(s) illustrating different types of mutations observed across samples for the **most frequently mutated genes** (mutated in at least `r gsub(",", "% and ", params$genes_min)`% of patients, respectively). The side- and top bar-plots present the frequency of mutations in these genes and in individual samples, respectively. The bottom bar-plot illustrates the overall distribution of the six different conversions (*C>A*, *C>G*, *C>T*, *T>C*, *T>A* and *T>G*) across all samples in each dataset. `r if ( params$clinical_features != "none" ) { c("Sample annotations are also provided.") }`
`r if ( params$purple != "none" ) { c("<span style=\"color:#ff0000\">NOTE</span>, copy-number (CN) alterations are derived from *[PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purity-ploidy-estimator){target=\"_blank\"}* programme.") }`

`r if ( params$purple != "none" ) { c("<details>") }`
`r if ( params$purple != "none" ) { c("<summary>CNV definitions</summary>") }`
`r if ( params$purple != "none" ) { c("<font size=\"2\">") }`
`r if ( params$purple != "none" ) { paste0("HD: CN < ", params$purple_hd, "\n") }`
`r if ( params$purple != "none" ) { paste0("LOH: CN >= ", params$purple_hd, " & < ", params$purple_loh, "\n") }`
`r if ( params$purple != "none" ) { paste0("AMP: CN >= ", params$purple_amp) }`
`r if ( params$purple != "none" ) { c("</font>") }`
`r if ( params$purple != "none" ) { c("</details>") }`
`r if ( params$purple != "none" ) { c("<br/>") }`
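
When PURPLE copy-number data are supplied, CN values are labelled as HD, LOH or AMP according to the `purple_hd`, `purple_loh` and `purple_amp` thresholds. A minimal sketch of that mapping in base R, using the default cut-offs (0.5, 1.5 and 6). The `classify_cn` helper is hypothetical and is not part of this report or of PURPLE, and leaving values between the LOH and AMP thresholds unlabelled is an assumption:

```r
##### Thresholds matching the report defaults (purple_hd, purple_loh and purple_amp)
cn_hd <- 0.5
cn_loh <- 1.5
cn_amp <- 6

##### Hypothetical helper; values between the LOH and AMP thresholds are left unlabelled (NA), which is an assumption
classify_cn <- function(cn) {
  ifelse(cn < cn_hd, "HD",
         ifelse(cn < cn_loh, "LOH",
                ifelse(cn >= cn_amp, "AMP", NA_character_)))
}

##### Example CN values falling into each category
classify_cn(c(0.2, 1, 2, 8))
```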

`r if ( params$gistic != "none" ) { c("<span style=\"color:#ff0000\">NOTE</span>, copy-number (CN) alterations are derived from *[GISTIC](http://software.broadinstitute.org/cancer/software/genepattern/modules/docs/GISTIC_2.0){target=\"_blank\"}* programme.") }`

```{r maf_oncoplot, comment = NA, message=FALSE, warning=FALSE, results="asis"}
##### Save oncoplot into the file
##### Generate separate plot for each dataset

for ( i in 1:length(mafFiles) ) {
  
  cat(paste("\n\n <b>", datasets.list[i], "</b> \n\n", sep=" "))
  
  ##### Samples ordering
  if ( params$samples_keep_order ) {
    if ( params$samples_keep_order_annot ) {
      sampleOrder <- mafInfo[[i]]@clinical.data$Tumor_Sample_Barcode
    } else {
      sampleOrder <- unique(mafInfo[[i]]@data$Tumor_Sample_Barcode)
    }
  } else {
    sampleOrder <- NULL
  }
  
  ##### Generate oncoplots for the top mutated genes in each dataset that has > 1 sample with non-synonymous variants
  if ( top_genes_no[[i]] > 1 ) {
    
    cat(paste(top_genes_no[[i]], "genes are mutated in at least", genes_min[[i]], "% of patients\n\n", sep=" "))
    
    ##### Save the plot as PNG
    png( file = paste(outDir, "/MAF_oncoplot_", datasets.list[i], ".png", sep = ""), width = 1800*1.8, height = oncoplot_height[[i]]*1.8, units = "px", res = 300 )
      
    ##### Drawing oncoplots for the top mutated genes in each dataset
    plot.new()
    par(mar=c(1,4,2,0.5), oma=c(1.5,2,2,1))
    maftools::oncoplot(maf = mafInfo[[i]], top = top_genes_no[[i]], fontSize = 0.7, colbar_pathway = FALSE, removeNonMutated = FALSE, draw_titv = params$draw_titv, clinicalFeatures = clinicalFeatures[[i]], annotationColor = clinicalFeatures_cols[[i]], colors = vc_cols, includeColBarCN = FALSE, showTumorSampleBarcodes = params$samples_show, barcode_mar = 5, keepGeneOrder = params$genes_keep_order, sampleOrder = sampleOrder, sortByAnnotation = params$sort_by_annotation)
    while (!is.null(dev.list()))  dev.off()
    
    ##### Read in the oncoplots PNG files
    cat("![](",paste(outDir, "/MAF_oncoplot_", datasets.list[i], ".png", sep = ""),")")
    cat("<br/>")
    
    ##### Save the oncoplot matrix into a file
    genes = getGeneSummary(x = mafInfo[[i]])[1:top_genes_no[[i]], Hugo_Symbol]
    om = createOncoMatrix(m = mafInfo[[i]], g = genes)
    om$oncoMatrix[ om$oncoMatrix == "" ] <- "-"
    
    ##### Samples ordering
    if ( params$samples_keep_order ) {
      om$oncoMatrix <- om$oncoMatrix[ ,as.vector(sampleOrder) ]
    }
    
    ##### Write gene summary into a file
    ##### Create a new workbook
    wb <- createWorkbook(paste("MAF_oncoplot_", datasets.list[i], ".xlsx", sep = ""))

    ##### Add worksheet for oncoMatrix
    addWorksheet(wb, substring("oncoMatrix", 0, 31))
    writeData(wb, rowNames=TRUE, sheet = 1, om$oncoMatrix)
    
    ##### Add worksheet for numericMatrix
    addWorksheet(wb, substring("numericMatrix", 0, 31))
    writeData(wb, rowNames=TRUE, sheet = 2, om$numericMatrix)
    
    saveWorkbook(wb, paste(outDir, "/MAF_oncoplot_", datasets.list[i], ".xlsx", sep = ""), overwrite = TRUE)

  } else {
      cat(paste0("Less than 2 genes are mutated in at least ", genes_min[[i]], "% of patients.\n\n\n"))
  }
}
```

`r if ( goi_status ) { c("***") }`

`r if ( goi_status ) { c("##### Genes of interest") }`

`r if ( goi_status ) { c("Oncoplot(s) illustrating different types of mutations observed across samples for the **genes of interest**. The side and top bar-plots present the frequency of mutations in these genes and in individual samples, respectively.") }`

`r if ( goi_status ) { c("<span style=\"color:#ff0000\">NOTE</span>, the top column bar illustrates the total number of alterations detected only in the genes of interest.") }`

`r if ( params$purple != "none" ) { c("<span style=\"color:#ff0000\">NOTE</span>, copy-number (CN) alterations are derived from *[PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purity-ploidy-estimator){target=\"_blank\"}* programme.") }`

`r if ( params$purple != "none" ) { c("<details>") }`
`r if ( params$purple != "none" ) { c("<summary>CNV definitions</summary>") }`
`r if ( params$purple != "none" ) { c("<font size=\"2\">") }`
`r if ( params$purple != "none" ) { paste0("HD: CN < ", params$purple_hd, "\n") }`
`r if ( params$purple != "none" ) { paste0("LOH: CN >= ", params$purple_hd, " & < ", params$purple_loh, "\n") }`
`r if ( params$purple != "none" ) { paste0("AMP: CN >= ", params$purple_amp) }`
`r if ( params$purple != "none" ) { c("</font>") }`
`r if ( params$purple != "none" ) { c("</details>") }`
`r if ( params$purple != "none" ) { c("<br/>") }`

`r if ( params$gistic != "none" ) { c("<span style=\"color:#ff0000\">NOTE</span>, copy-number (CN) alterations are derived from *[GISTIC](http://software.broadinstitute.org/cancer/software/genepattern/modules/docs/GISTIC_2.0){target=\"_blank\"}* programme.") }`

```{r maf_oncoplot_goi, comment = NA, message=FALSE, warning=FALSE, results="asis", eval=goi_status}
##### Draw additional oncoplot if a list of genes of interest was specified by the user
###### Generate separate plot for each dataset
for ( i in 1:length(mafFiles) ) {
  if ( length(goi[[i]]) > 1 ) {
   
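    cat(paste("\n\n <b>", datasets.list[i], "</b>\n\n", sep=" "))
    
    ##### Samples ordering (set per dataset rather than reusing the value left over from the previous chunk)
    if ( params$samples_keep_order ) {
      if ( params$samples_keep_order_annot ) {
        sampleOrder <- mafInfo[[i]]@clinical.data$Tumor_Sample_Barcode
      } else {
        sampleOrder <- unique(mafInfo[[i]]@data$Tumor_Sample_Barcode)
      }
    } else {
      sampleOrder <- NULL
    }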
      
    ##### Drawing oncoplots for the top mutated genes in each dataset that has > 1 sample with non-synonymous variants
    if ( length(goi[[i]]) > 1 ) {
      
      ##### Report genes missing in individual MAFs, as well as those which are within the list of genes of interest and are also among the most frequently mutated genes
      cat(paste("* The following genes from provided gene list are missing from", datasets.list[i], "MAF:\n\n", sep=" "))
        
      cat(paste("<i>", goi_absent[[i]], "</i>", collapse = ", "))
        
      cat(paste("\n\n* The following genes from provided gene list are also among the top", top_genes_no[[i]], "most frequently mutated genes:\n\n", sep=" "))
        
      cat(paste("<i>", top_genes_goi[[i]], "</i>", collapse = ", "), "\n\n")
        
      ##### Check if the genes of interest have any non-synonymous mutations
      if ( length(top_genes_goi[[i]]) > 1  ) {
          
        ##### Save the plot as PNG
        png( file = paste(outDir, "/MAF_oncoplot_goi_", datasets.list[i], ".png", sep = ""), width = 1800*1.7, height = oncoplot_goi_height[[i]]*1.7, units = "px", res = 300 )
            
        ##### Drawing oncoplots for the genes of interest in each dataset
        plot.new()
        par(mar=c(1,4,2,0.5), oma=c(1.5,2,2,1))
        maftools::oncoplot(maf = mafInfo[[i]], genes = goi[[i]], fontSize = 0.7, colbar_pathway = FALSE, removeNonMutated = FALSE, draw_titv = params$draw_titv, clinicalFeatures = clinicalFeatures[[i]], annotationColor = clinicalFeatures_cols[[i]], colors = vc_cols, includeColBarCN = FALSE, showTumorSampleBarcodes = params$samples_show, barcode_mar = 5, keepGeneOrder = params$genes_keep_order, sampleOrder = sampleOrder, sortByAnnotation = params$sort_by_annotation)
        while (!is.null(dev.list()))  dev.off()
          
        ##### Read in the oncoplots PNG files
        cat("![](",paste(outDir, "/MAF_oncoplot_goi_", datasets.list[i], ".png", sep = ""),")")
        cat("<br/>")
          
        ##### Save the oncoplot matrix into a file
        om = createOncoMatrix(m = mafInfo[[i]], g = goi[[i]])
          
        ##### Add "-" to genes with no mutations
        om_genes.missing <- goi[[i]][ goi[[i]] %!in% rownames(om$oncoMatrix) ]
        om.missing <- data.frame(matrix(vector(), length(om_genes.missing), ncol(om$oncoMatrix), dimnames=list(c(), colnames(om$oncoMatrix))), stringsAsFactors=FALSE)
        rownames(om.missing) <- om_genes.missing
        om.missing[ is.na(om.missing) ] <- "-" 
        colnames(om$oncoMatrix) <- make.names(colnames(om$oncoMatrix))
        om$oncoMatrix <- rbind(om$oncoMatrix, om.missing)
        om$oncoMatrix[ om$oncoMatrix == "" ] <- "-"
          
        ##### Genes ordering
        if ( params$genes_keep_order ) {
          om_genes <- rownames(om$oncoMatrix)
          om_genes <- goi[[i]][ goi[[i]] %in% om_genes ]
          om$oncoMatrix <- om$oncoMatrix[ om_genes, ]
        }
          
        ##### Samples ordering
        if ( params$samples_keep_order ) {
          om$oncoMatrix <- om$oncoMatrix[ , make.names(sampleOrder) ]
        }
          
        ##### Write gene summary into a file
        ##### Create a new workbook
        wb <- createWorkbook(paste("MAF_oncoplot_goi_", datasets.list[i], ".xlsx", sep = ""))
      
        ##### Add worksheet for oncoMatrix
        addWorksheet(wb, substring("oncoMatrix", 0, 31))
        writeData(wb, rowNames=TRUE, sheet = 1, om$oncoMatrix)
          
        ##### Add worksheet for numericMatrix
        addWorksheet(wb, substring("numericMatrix", 0, 31))
        writeData(wb, rowNames=TRUE, sheet = 2, om$numericMatrix)
          
        saveWorkbook(wb, paste(outDir, "/MAF_oncoplot_goi_", datasets.list[i], ".xlsx", sep = ""), overwrite = TRUE)
          
      } else if ( length(genes_list_goi.silent[[i]]) > 0  ) {
        cat(paste0("**Less than 2 genes of interest have non-synonymous variants detected and for *", paste(genes_list_goi.silent[[i]], collapse = ", "), "* only silent variants were detected in ", datasets.list[i], " dataset**.\n\n\n"))
          
      } else {
        cat(paste0("**None of the genes of interest have any variants detected in ", datasets.list[i], " dataset**.\n\n\n"))
      }
    } else {
      cat(paste0("All genes of interest are among the top ", top_genes_no[[i]], " most frequently mutated genes (mutated in at least ", genes_min[[i]], "% of patients) or no non-synonymous variants were detected in ", datasets.list[i], " dataset.\n\n\n"))
    }
  }
}
```

`r if ( pathways_status ) { c("***") }`

`r if ( pathways_status ) { c("##### Pathways") }`

`r if ( pathways_status ) { c("Oncoplot(s) illustrating different types of mutations observed across samples for genes involved in the **pathways of interest**. The side and top bar-plots present the frequency of mutations in these genes and in individual samples, respectively.") }`

`r if ( params$purple != "none" ) { c("<details>") }`
`r if ( params$purple != "none" ) { c("<summary>CNV definitions</summary>") }`
`r if ( params$purple != "none" ) { c("<font size=\"2\">") }`
`r if ( params$purple != "none" ) { paste0("HD: CN < ", params$purple_hd, "\n") }`
`r if ( params$purple != "none" ) { paste0("LOH: CN >= ", params$purple_hd, " & < ", params$purple_loh, "\n") }`
`r if ( params$purple != "none" ) { paste0("AMP: CN >= ", params$purple_amp) }`
`r if ( params$purple != "none" ) { c("</font>") }`
`r if ( params$purple != "none" ) { c("</details>") }`
`r if ( params$purple != "none" ) { c("<br/>") }`

`r if ( params$purple != "none" ) { c("<span style=\"color:#ff0000\">NOTE</span>, copy-number (CN) alterations are derived from *[PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purity-ploidy-estimator){target=\"_blank\"}* programme.") }`

```{r maf_oncoplot_pathways, comment = NA, message=FALSE, warning=FALSE, results="asis", eval=pathways_status}
##### Save oncoplot into the file
##### Generate separate plot for each dataset
##### Get genes involved in the pathways of interest
pathways_genes <- read.table(params$pathways, sep="\t", as.is=TRUE, header=TRUE, row.names=NULL, quote = "")[,1]

##### Create a list to store MAF pathways info for individual datasets
mafInfo.pathways <- vector("list", length(mafFiles))
names(mafInfo.pathways) <- datasets.list

for ( i in 1:length(mafFiles) ) {
  
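  cat(paste("\n\n <b>", datasets.list[i], "</b> \n\n", sep=" "))
  
  ##### Samples ordering (set per dataset rather than reusing the value left over from a previous chunk)
  if ( params$samples_keep_order ) {
    if ( params$samples_keep_order_annot ) {
      sampleOrder <- mafInfo[[i]]@clinical.data$Tumor_Sample_Barcode
    } else {
      sampleOrder <- unique(mafInfo[[i]]@data$Tumor_Sample_Barcode)
    }
  } else {
    sampleOrder <- NULL
  }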
  
  ##### Generate MAF object containing only the genes involved in pathways of interest
  mafInfo.pathways[[i]] <- subsetMaf(maf = mafInfo[[i]], includeSyn = FALSE, genes = pathways_genes, mafObj = TRUE)
  
  ##### Save the plot as PNG
  png( file = paste(outDir, "/MAF_oncoplot_pathways_", datasets.list[i], ".png", sep = ""), width = 1800*1.5, height = oncoplot_height[[i]]*1.5+(100*length(params$pathways)), units = "px", res = 300 )
  
  ##### Drawing oncoplots for the genes involved in the pathways of interest in each dataset
  plot.new()
  par(mar=c(1,4,2,0.5), oma=c(1.5,2,2,1))
  maftools::oncoplot(maf = mafInfo.pathways[[i]], fontSize = 0.7, colbar_pathway = FALSE, removeNonMutated = FALSE, draw_titv = params$draw_titv, clinicalFeatures = clinicalFeatures[[i]], annotationColor = clinicalFeatures_cols[[i]], colors = vc_cols, includeColBarCN = FALSE, showTumorSampleBarcodes = params$samples_show, barcode_mar = 5, sampleOrder = sampleOrder, sortByAnnotation = params$sort_by_annotation, pathways = params$pathways, gene_mar = 10)
  while (!is.null(dev.list()))  dev.off()
    
  ##### Read in the oncoplots PNG files
  cat("![](",paste(outDir, "/MAF_oncoplot_pathways_", datasets.list[i], ".png", sep = ""),")")
  cat("<br/>")
}
```


***

#### Transitions/transversions {.tabset .tabset-fade}

##### Plots {.tabset .tabset-fade}

Plots presenting the distribution of transitions and transversions in individual dataset(s). In each panel, the box-plots show the *overall distribution* of the six different conversions (*C>A*, *C>G*, *C>T*, *T>C*, *T>A* and *T>G*) (top-left), and the *frequency* of transitions and transversions (top-right). The stacked bar-plot (bottom) displays the *fraction* of the six different conversions in each sample.
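
As a minimal sketch of how the six conversion classes map onto transitions (Ti) and transversions (Tv), the base R snippet below uses made-up counts for a single hypothetical sample. It is an illustration only and does not reproduce the internals of `maftools::titv`:

```r
##### Hypothetical counts of the six conversion classes in one sample
conv_counts <- c("C>A" = 12, "C>G" = 5, "C>T" = 40, "T>A" = 6, "T>C" = 25, "T>G" = 12)

##### C>T and T>C are transitions, the remaining four conversions are transversions
conv_class <- ifelse(names(conv_counts) %in% c("C>T", "T>C"), "Ti", "Tv")

##### Fraction (%) of each conversion and of Ti/Tv in the sample
conv_fraction <- 100 * conv_counts / sum(conv_counts)
titv_fraction <- tapply(conv_fraction, conv_class, sum)
```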

```{r maf_TiTv_plot, comment = NA, message=FALSE, warning=FALSE, results="asis"}
###### Generate separate plot for each dataset
###### Create empty list for TiTv info from each dataset
titv.info <- list()

for ( i in 1:length(mafFiles) ) {

  cat(paste("\n\n <b>", datasets.list[i], "</b>\n\n", sep=" "))

  ##### Drawing distribution plots of the transitions and transversions
  titv.info[[datasets.list[i]]] <- maftools::titv(maf = mafInfo[[i]], plot = FALSE, useSyn = TRUE)

  try(maftools::plotTiTv(res = titv.info[[datasets.list[i]]]), silent = TRUE)
  mtext("Transition and transversions distribution", outer=TRUE,  cex=1, line=-1.5)
  
  cat("<br/><br/>")
}
```

***

##### Tables {.tabset .tabset-fade}

###### Overall distribution

Table(s) presenting the *overall distribution* of the six different conversions (*C>A*, *C>G*, *C>T*, *T>C*, *T>A* and *T>G*) across all samples in each dataset.

```{r maf_TiTv_tables_xlsx, comment = NA, message=FALSE, warning=FALSE}
##### Write per-sample transitions and transversions distribution into a file
##### Create a new workbook
wb <- createWorkbook("MAF_summary_titv.xlsx")

##### Add worksheets, three for each dataset
for ( i in 1:length(mafFiles) ) {
  
    addWorksheet(wb, substring(paste0(datasets.list[i], " (fraction)"), 0, 31), tabColour = datasets.colour[[1]][i])
    writeData(wb, sheet = i*3-2, titv.info[[datasets.list[i]]]$fraction.contribution)
    
    addWorksheet(wb, substring(paste0(datasets.list[i], " (count)"), 0, 31), tabColour = datasets.colour[[1]][i])
    writeData(wb, sheet = i*3-1, titv.info[[datasets.list[i]]]$raw.counts)
    
    addWorksheet(wb, substring(paste0(datasets.list[i], " (TiTv fractions)"), 0, 31), tabColour = datasets.colour[[1]][i])
    writeData(wb, sheet = i*3, titv.info[[datasets.list[i]]]$TiTv.fractions)
}

saveWorkbook(wb, paste(outDir, "MAF_summary_titv.xlsx", sep="/"), overwrite = TRUE)
```

```{r maf_TiTv_table_overall_dist, comment = NA, message=FALSE, warning=FALSE}
##### Generate tables with the overall distribution of the six different conversions (C>A, C>G, C>T, T>C, T>A and T>G)
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable(data = titv.info[[datasets.list[i]]]$raw.counts, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(titv.info[[datasets.list[i]]]$raw.counts), 'text-align' = 'center' )
}

##### Print a list of htmlwidgets
widges.list

##### Add extra lines to make sure that this section doesn't overlap with the next one
cat("\n\n\n")
```

***

###### Transitions/transversions frequency

Table(s) presenting the transitions and transversions *frequency* across all samples in each dataset.

```{r maf_TiTv_table_frequency, comment = NA, message=FALSE, warning=FALSE}
##### Generate tables with transitions and transversions frequency in individual datasets
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable(data = titv.info[[datasets.list[i]]]$TiTv.fractions, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(titv.info[[datasets.list[i]]]$TiTv.fractions), 'text-align' = 'center' ) %>%
        DT::formatRound(columns = c("Ti", "Tv"), 1)
}

##### Print a list of htmlwidgets
widges.list

##### Add extra lines to make sure that this section doesn't overlap with the next one
cat("\n\n\n")
```

***

###### Transitions/transversions fraction

Table(s) presenting the *fraction* of the six different conversions in each sample in individual dataset(s).

```{r maf_TiTv_table_fraction, comment = NA, message=FALSE, warning=FALSE}
##### Generate tables with the fraction of the six different conversions in each sample in individual datasets
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable(data = titv.info[[datasets.list[i]]]$fraction.contribution, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(titv.info[[datasets.list[i]]]$fraction.contribution), 'text-align' = 'center' ) %>%
        DT::formatRound(columns = names(titv.info[[datasets.list[i]]]$fraction.contribution)[2:7], 1)
}

##### Print a list of htmlwidgets
widges.list

##### Add extra lines to make sure that this section doesn't overlap with the next one
cat("\n\n\n")
```

***

#### Cross-cancers comparison {.tabset .tabset-fade}

##### Plot

Plot(s) illustrating the mutation load in investigated dataset(s) along with distribution of variants compiled from over 10,000 whole-exome sequencing samples across 33 [TCGA](https://cancergenome.nih.gov/){target="_blank"} landmark cohorts. Every dot represents a sample whereas the red horizontal lines are the median numbers of mutations in the respective cancer types. The vertical axis (log scaled) shows the number of mutations per megabase whereas the different cancer types are ordered on the horizontal axis based on their median numbers of somatic mutations. This plot is similar to the one described in the paper [Signatures of mutational processes in human cancer](https://www.ncbi.nlm.nih.gov/pubmed/23945592){target="_blank"} by Alexandrov *et al*.
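
The mutation load on the vertical axis can be thought of as the number of mutations divided by the size of the sequenced region. A minimal sketch in base R, where the per-sample counts and the 50 Mb capture size are made-up values used for illustration only and are not taken from this report:

```r
##### Hypothetical number of mutations per sample and an assumed capture size (in Mb)
mut_counts <- c(S1 = 50, S2 = 150, S3 = 500)
capture_size_mb <- 50

##### Mutations per megabase and the cohort median
mut_per_mb <- mut_counts / capture_size_mb
median_mut_per_mb <- median(mut_per_mb)

##### The vertical axis of the plot is log-scaled
log10_mut_per_mb <- log10(mut_per_mb)
```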

```{r maf_tcga_cohorts, comment = NA, message=FALSE, warning=FALSE, fig.width = 8, fig.height = 6, results="asis"}
###### Generate separate plot for each dataset
##### Create a list to store the median mutation burden table for each dataset
tcgaCompare.res.all <- vector("list", length(mafFiles))
names(tcgaCompare.res.all) <- datasets.list

for ( i in 1:length(mafFiles) ) {
  
  cat(paste("\n\n <b>", datasets.list[i], "</b>\n\n", sep=" "))
  
  ##### Use the dedicated MAF object (mafInfo_tcgaCompare) if GISTIC, PURPLE or CNVkit data were provided
  if ( gisticInfo[[i]]$status || params$purple != "none" || params$cnvkit != "none" ) {
    maf_tcgaCompare <- mafInfo_tcgaCompare[[i]]
  } else {
    maf_tcgaCompare <- mafInfo[[i]]
  }
  
  ##### Compare mutation load against TCGA cohorts
  par(mar=c(1,1,1,1), oma=c(6,2,2,1))
  tcgaCompare.res <- tcgaCompare(maf = maf_tcgaCompare, cohortName = datasets.list[i], primarySite=TRUE)$median_mutation_burden
  
  ##### Keep the results for each dataset so that the tables below are not all based on the last dataset
  tcgaCompare.res.all[[i]] <- tcgaCompare.res
  
  cat("<br/><br/>")
}
```

***

##### Table

Table(s) with median mutation load across the 33 [TCGA](https://cancergenome.nih.gov/){target="_blank"} landmark cohorts as well as the investigated dataset(s).

```{r maf_tcga_cohorts_table, comment = NA, message=FALSE, warning=FALSE}
##### Present a sample table in the html report
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable( data = tcgaCompare.res.all[[i]], caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
      DT::formatStyle( columns = names(tcgaCompare.res.all[[i]]), 'text-align' = 'center' ) %>%
      DT::formatRound( columns = names(tcgaCompare.res.all[[i]])[length(names(tcgaCompare.res.all[[i]]))], 2)
}
  
##### Print a list of htmlwidgets
widges.list
```

***

#### Oncogenic signaling pathways {.tabset .tabset-fade}

Enrichment of known oncogenic signaling pathways reported in [TCGA](https://cancergenome.nih.gov/){target="_blank"} cohorts (see paper [Oncogenic Signaling Pathways in The Cancer Genome Atlas](https://www.ncbi.nlm.nih.gov/pubmed/29625050){target="_blank"}).

##### Plots {.tabset .tabset-fade}

```{r oncogenic_pathways_plots, echo=FALSE, comment = NA, message=FALSE, warning=FALSE, results="asis"}
###### Generate separate plot for each dataset
OncogenicPathways.res <- list()
  
for ( i in 1:length(mafFiles) ) {
  
  cat(paste("\n\n <b>", datasets.list[i], "</b>\n\n", sep=" "))
  
  OncogenicPathways.res[[i]] <- capture.output(OncogenicPathways(maf = mafInfo[[i]]))
  
  if ( length(OncogenicPathways.res[[i]]) > 0 ) {
    
    cat("\n\n\n\nThe complete oncogenic pathways are presented below. Tumour suppressor genes are in <span style=\"color:#ff0000\">red</span>, and oncogenes are in <span style=\"color:#0000ff\">blue</span> font.\n")
  
    for( j in (length(OncogenicPathways.res[[i]])/2):2 ){
      
      pathway <- unlist(strsplit(as.character(OncogenicPathways.res[[i]][j]), split=' ', fixed=TRUE))
      pathway <- pathway[ pathway != "" ]
      
      if ( pathway[4] != 0  ) {
        
        cat("\n###### ",as.character(pathway[2]), "\n")
        
        maftools::plotPathways(maf = mafInfo[[i]], 
                               pathlist = maftools::pathways(maf = mafInfo[[i]]), 
                               showTumorSampleBarcodes = params$samples_show)
        
        cat("\n\n***\n")
      }
    }
  } else {
    cat("\n\n\n\nNone of the known oncogenic signaling pathways is enriched.\n\n")
  }
}
```

##### Table

```{r oncogenic_pathways_table, comment = NA, message=FALSE, warning=FALSE, eval=FALSE}
##### Present a sample table in the html report
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  
  OncogenicPathways.table <- NULL
  
  if ( length(OncogenicPathways.res[[i]]) > 0 ) {
  
  ##### Prepare table 
  for( j in 2:(length(OncogenicPathways.res[[i]])/2) ){
    
    
    pathway <- unlist(strsplit(as.character(OncogenicPathways.res[[i]][j]), split=' ', fixed=TRUE))
    pathway <- pathway[ pathway != "" ]
    
    OncogenicPathways.table <- rbind( OncogenicPathways.table, pathway[-1]  )
  }
  
  colnames(OncogenicPathways.table) <- c("Pathway", "No of genes", "No of affected genes", "Fraction affected", "Mutated samples")
  OncogenicPathways.table <- OncogenicPathways.table[ order(OncogenicPathways.table[, "Mutated samples"], decreasing = TRUE), ]
  
  widges.list[[i]] <- DT::datatable( data = OncogenicPathways.table, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
      DT::formatStyle( columns = colnames(OncogenicPathways.table), 'text-align' = 'center' ) %>%
      DT::formatRound( columns = "Fraction affected", 2)
  } else {
    cat("\n\n\n\nNone of the known oncogenic signaling pathways is enriched.\n\n")
  }
}
  
##### Print a list of htmlwidgets
widges.list
```

***

`r if ( params$clinical_info != "none" ) { c("#### Clinical enrichment") }`

`r if ( params$clinical_info != "none" ) { c("Plot(s) illustrating enrichment of individual mutations in every category within provided clinical features.") }`

```{r clinical_enrichment, comment = NA, message=FALSE, warning=FALSE, fig.width = 8, fig.height = 4, results="asis"}
##### Generate separate plot for each dataset
##### Create a list to store clinical enrichment analysis results info for individual datasets
clinical.enrichment <- vector("list", length(mafFiles))
names(clinical.enrichment) <- datasets.list

if ( params$clinical_info != "none" ) {
  for ( i in 1:length(mafFiles) ) {
    
    cat(paste("\n\n <b>", datasets.list[i], "</b> \n\n", sep=" "))
    
    ##### First make sure that there are >1 class of tested annotation fields
    ##### Collect all features with a single class (rather than keeping only the result for the last feature)
    clinicalFeatures2rm <- NULL
    
    for ( j in seq_along(clinicalFeatures[[i]]) )  {
      if ( length(unique(as.data.frame(mafInfo[[i]]@clinical.data)[, names(mafInfo[[i]]@clinical.data) %in% clinicalFeatures[[i]][j]])) < 2  ) {
        clinicalFeatures2rm <- c(clinicalFeatures2rm, clinicalFeatures[[i]][j])
      }
    } 
    clinicalFeatures[[i]] <- clinicalFeatures[[i]][ clinicalFeatures[[i]] %!in% clinicalFeatures2rm ]
    
    ##### Make sure that there is at least one clinical feature assigned to > 1 sample
    if ( length(as.data.frame(mafInfo[[i]]@clinical.data)[ ,clinicalFeatures[[i]]]) < nrow(mafInfo[[i]]@clinical.data) ) {
      clinical.enrichment[[i]] = quiet(clinicalEnrichment(maf = mafInfo[[i]], clinicalFeature = clinicalFeatures[[i]]))
    } else {
      clinical.enrichment[[i]] <- NA
    }
    
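    ##### Check if any gene in the given dataset passed the user-defined threshold
    ##### (the enrichment results are a list, whereas NA marks datasets for which the analysis was skipped)
    if ( !is.list(clinical.enrichment[[i]]) ) {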
      cat(paste0("No significant associations found at p-value < ", params$clinical_enrichment_p))
      cat("<br/><br/>")
    } else if ( any(clinical.enrichment[[i]]$groupwise_comparision$p_value < params$clinical_enrichment_p) ) {
      try(plotEnrichmentResults(enrich_res = clinical.enrichment[[i]], pVal = params$clinical_enrichment_p), silent = TRUE)
      cat("<br/><br/>")
    } else {
      cat(paste0("No significant associations found at p-value < ", params$clinical_enrichment_p))
      cat("<br/><br/>")
    }
  }
  cat("\n\n***\n")
}
```

***

## Signature analysis {.tabset .tabset-fade}

Every cancer, as it progresses, leaves a signature characterised by a specific pattern of nucleotide substitutions (see the paper [Signatures of mutational processes in human cancer](https://www.ncbi.nlm.nih.gov/pubmed/23945592){target="_blank"} by Alexandrov *et al*). Such signatures are extracted by decomposing a matrix of nucleotide substitutions, classified into 96 substitution classes based on the bases immediately surrounding the mutated base. The optimal number of detected signatures is determined by running the *[non-negative matrix factorization](https://cran.r-project.org/web/packages/NMF/index.html){target="_blank"}* (NMF) function on a range of values and comparing the individual *cophenetic correlation coefficients*.
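
As a rough illustration of where the 96 classes come from (a sketch in base R that is not part of the analysis itself; the object names `bases`, `substitutions` and `classes` are made up for this example): each of the six pyrimidine-centred substitution types is combined with the four possible bases on either side of the mutated base, giving 6 x 4 x 4 = 96 classes.

```r
# Illustrative only: enumerate the 96 trinucleotide substitution classes
bases <- c("A", "C", "G", "T")
substitutions <- c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")

# Every combination of 5' base, substitution type and 3' base
classes <- expand.grid(five_prime = bases,
                       substitution = substitutions,
                       three_prime = bases,
                       stringsAsFactors = FALSE)

# Label each class in the commonly used "A[C>T]G" notation
classes$label <- paste0(classes$five_prime, "[", classes$substitution, "]", classes$three_prime)

nrow(classes)
head(classes$label)
```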

```{r mutational_sign, comment = NA, message=FALSE, warning=FALSE, results="hide"}
##### Create a list to store signature analysis results info for individual datasets
mutSign <- vector("list", length(mafFiles))
names(mutSign) <- datasets.list

##### Run signature analysis for each dataset
for ( i in 1:length(mafFiles) ) {
  
  ##### Check if "chr" prefix needs to be added
  if ( grepl("chr", mafInfo[[i]]@data$Chromosome[1], fixed = TRUE) ) {
    prefix = NULL
  } else {
    prefix = "chr"
  }
  
  mutSign[[i]] = trinucleotideMatrix(maf = mafInfo[[i]], prefix = prefix, add = TRUE, ref_genome = paste0("BSgenome.Hsapiens.UCSC.hg",params$ucsc_genome_assembly))
}
```

### Mutational signatures {.tabset .tabset-fade}

Extracted signatures were compared against known signatures derived from [Alexandrov *et al*](https://www.ncbi.nlm.nih.gov/pubmed/23945592){target="_blank"}, and cosine similarity was calculated to identify the best match. The signatures were also compared against known and [validated signatures from COSMIC](https://cancer.sanger.ac.uk/cosmic/signatures){target="_blank"} (expand "Comparison to all COSMIC signatures" to see the full comparison).
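
The comparison rests on the cosine similarity between two signature profiles. A minimal sketch in base R, assuming toy 4-element profiles instead of the real 96-element ones; the helper `cosine_similarity()` and the vectors `extracted` and `reference` are hypothetical and this is not the maftools implementation:

```r
# Illustrative only: cosine similarity between two signature profiles
cosine_similarity <- function(a, b) {
  stopifnot(length(a) == length(b))
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Toy profiles (made-up numbers)
extracted <- c(0.10, 0.40, 0.30, 0.20)
reference <- c(0.12, 0.38, 0.28, 0.22)

# Values close to 1 indicate very similar profiles
cosine_similarity(extracted, reference)

# Profiles with no shared non-zero classes give 0
cosine_similarity(c(1, 0), c(0, 1))
```

Because the measure depends only on the direction of the two vectors, profiles that differ only by a constant scaling factor are treated as identical.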

#### Plot

Mutational signatures presented for individual dataset(s). <span style="color:#ff0000">NOTE</span>: the number of extracted signatures is fixed at 6.

```{r signature_analysis, comment = NA, message=FALSE, warning=FALSE, results="asis"}
##### Generate separate plot for each dataset
##### Create a list to store signature analysis results info for individual datasets
mutextractSign <- vector("list", length(mafFiles))
names(mutextractSign) <- datasets.list

##### Run signature analysis for each dataset
for ( i in 1:length(mafFiles) ) {
  
  cat(paste("\n\n <b>", datasets.list[i], "</b>\n\n", sep=" "))
  
  ##### Save the plot as PNG. NOTE: A small positive value (pConstant=0.1) is added to the matrix to avoid "non-conformable arrays" error.
  png( file = paste(outDir, "/MAF_NMF_cophenetic_metric_", datasets.list[i], ".png", sep = ""), width = 1200, height = 800, units = "px", res = 200 )
  invisible(capture.output(mutextractSign[[i]] <- extractSignatures(mat = mutSign[[i]], n = 6, plotBestFitRes = FALSE, pConstant = 0.1)))
  while (!is.null(dev.list()))  dev.off()
  
  ##### Plot detected signatures
  png( file = paste(outDir, "/MAF_mutational_signatures_", datasets.list[i], ".png", sep = ""), width = 1200, height = nrow(mutextractSign[[i]]$contributions)*300, units = "px", res = 200 )
  try(plotSignatures(mutextractSign[[i]], title_size = 0.8), silent = TRUE)
  invisible(dev.off())
  
  cat("![](",paste(outDir, "/MAF_mutational_signatures_", datasets.list[i], ".png", sep = ""),")\n")
  
  ##### Drawing heatmap comparison to all COSMIC signatures
  ##### Save the plot as PNG
  cosm = compareSignatures(nmfRes = mutextractSign[[i]], sig_db = "SBS", verbose = FALSE)

  par(mar=c(2,4,2,0.5), oma=c(1.5,2,2,1))
  invisible(capture.output(pheatmap::pheatmap(mat = cosm$cosine_similarities, cluster_rows = FALSE, main = "Cosine similarity against validated signatures", cellheight = 10, filename = paste(outDir, "/MAF_COSMIC_signatures_comp_", datasets.list[i], ".png", sep = ""))))
  
  cat("\n<details>\n")
  cat("\n<summary>Plot legend</summary>\n")
  cat("<br/>")
  cat("Detected signatures are displayed according to the 96 substitution classification defined by the substitution class and sequence context immediately 3\' and 5\' to the mutated base. The probability bars for the six types of substitutions are displayed in different *colours*. The mutation types are on the *horizontal axes*, whereas *vertical axes* depict the percentage of mutations attributed to a specific mutation type.")
  cat("\n</details>\n")
  cat("<br />")
  cat("\n<details>\n")
  cat("\n<summary>Comparison to all COSMIC signatures</summary>\n")
  cat("<br/>")
  cat("![](",paste(outDir, "/MAF_COSMIC_signatures_comp_", datasets.list[i], ".png", sep = ""),")\n")
  cat("<br />")
  cat("\n</details>\n")
}
```

***

#### Table

Table(s) with detected signature contributions (*columns*) in each sample (*rows*) in individual dataset(s).

```{r signature_analysis_table, comment = NA, message=FALSE, warning=FALSE}
##### Present a sample table in the html report
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable( data = t(round(mutextractSign[[i]]$contributions, digits=2)), caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
      DT::formatStyle( columns = rownames(mutextractSign[[i]]$contributions), 'text-align' = 'center' )
}
  
##### Print a list of htmlwidgets
widges.list
```

***

### Enrichment analysis {.tabset .tabset-fade}

Signatures were assigned to samples and enrichment analysis was performed to detect mutations enriched in each identified signature.

#### Per-signature

Plot(s) illustrating exposures and mutation load of detected mutational signatures in individual dataset(s).

```{r signature_enrichment, comment = NA, message=FALSE, warning=FALSE, fig.width = 8, fig.height = 4, results="asis", eval=FALSE}
##### Generate separate plot for each dataset
##### Create a list to store signature analysis results info for individual datasets
sign.enrichment <- vector("list", length(mafFiles))
names(sign.enrichment) <- datasets.list

for ( i in 1:length(mafFiles) ) {
  cat(paste("\n\n <b>", datasets.list[i], "</b>\n\n", sep=" "))
  
  invisible(capture.output(sign.enrichment[[i]] <- signatureEnrichment(maf = mafInfo[[i]], sig_res = mutextractSign[[i]], minMut = 1)))
  cat("<br/><br/>")
}
```

***

#### Per-genes

Plot(s) illustrating enrichment of individual genes in detected mutational signatures.

```{r signature_enrichment_genes, comment = NA, message=FALSE, warning=FALSE, fig.width = 8, fig.height = 4, results="asis", eval=FALSE}
##### Generate separate plot for each dataset
for ( i in 1:length(mafFiles) ) {
  cat(paste("\n\n <b>", datasets.list[i], "</b>\n\n", sep=" "))
  
  ##### Check any gene in given dataset passed the user-defined threshold
  if ( any(sign.enrichment[[i]]$groupwise_comparision$p_value < params$signature_enrichment_p) ) {
    plotEnrichmentResults(enrich_res = sign.enrichment[[i]], pVal = params$signature_enrichment_p)
    cat("<br/><br/>")
  } else {
    cat(paste0("**No significant associations found at p-value < ", params$signature_enrichment_p, "**"))
    cat("<br/><br/>")
  }
}
```

***

#### Per-sample

Table(s) with signatures (*columns*) assignment to samples (*rows*) in individual dataset(s).

```{r signature_enrichment_samples, comment = NA, message=FALSE, warning=FALSE, eval=FALSE}
##### Present a sample table in the html report
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable( data = sign.enrichment[[i]]$Signature_Assignment, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
      DT::formatStyle( columns = names(sign.enrichment[[i]]$Signature_Assignment), 'text-align' = 'center' )
}
  
##### Print a list of htmlwidgets
widges.list
```

***

### APOBEC signature

APOBEC-induced mutations are more frequent in solid tumors and are mainly associated with *C>T* transition events occurring in the *TCW motif*. APOBEC enrichment scores were estimated using the method described by [Roberts *et al*](https://www.ncbi.nlm.nih.gov/pubmed/23852170){target="_blank"} (see "APOBEC enrichment scores" below). Samples in individual dataset(s) were classified into *APOBEC enriched* and *non-APOBEC enriched* groups, and the differences in mutational patterns between them were analysed to identify differentially altered genes.

<details>
<summary>APOBEC enrichment scores</summary>
<font size="2">

Enrichment of *C>T* mutations occurring within *TCW motif* over all of the *C>T* mutations in a given sample was compared to background *cytosines* and *TCWs* occurring within 20bp of mutated bases.

</font>
</details>
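
The score can be sketched as a simple ratio. The formula below reflects our reading of the enrichment ratio described in the maftools documentation and should be treated as an assumption; the helper name `apobec_enrichment()` and all counts are invented for illustration.

```r
# Illustrative only: APOBEC enrichment ratio for one sample
# n_tcw          - C>T mutations within the TCW motif
# n_c            - all C>T mutations
# background_c   - cytosines within 20bp of mutated bases
# background_tcw - TCW motifs within 20bp of mutated bases
apobec_enrichment <- function(n_tcw, n_c, background_c, background_tcw) {
  (n_tcw * background_c) / (n_c * background_tcw)
}

# Made-up counts for a single sample
score <- apobec_enrichment(n_tcw = 30, n_c = 100, background_c = 1000, background_tcw = 150)
score
```

A ratio above 1 means that mutations in the TCW context are over-represented relative to the local background.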

```{r APOBEC_enrichment, comment = NA, message=FALSE, warning=FALSE, results="asis"}
##### Generate separate plot for each dataset
for ( i in 1:length(mafFiles) ) {
  cat("<br/>")
  cat(paste("\n\n <b>", datasets.list[i], "</b> \n\n", sep=" "))
  
  ##### Check if any of the samples are enriched for APOBEC
  sub.tbl <- mutSign[[i]]$APOBEC_scores
  sub.tbl$APOBEC_Enriched = factor(sub.tbl$APOBEC_Enriched, levels = c("yes", "no"))
  
  if ( nrow(sub.tbl[!is.na(APOBEC_Enriched), mean(fraction_APOBEC_mutations), 
        APOBEC_Enriched][APOBEC_Enriched %in% "yes"]) == 0 ) {
    cat("**None of the samples are enriched for APOBEC.**")
    cat("<br/><br/><br/>")
  } else {
    
    ##### Check if there are any differentially mutated genes
    apobec.maf = subsetMaf(maf = mafInfo[[i]], tsb = as.character(sub.tbl[APOBEC_Enriched %in% 
        "yes", Tumor_Sample_Barcode]), mafObj = TRUE, dropLevels = FALSE)
    nonapobec.maf = subsetMaf(maf = mafInfo[[i]], tsb = as.character(sub.tbl[APOBEC_Enriched %in% "no", Tumor_Sample_Barcode]), mafObj = TRUE, dropLevels = FALSE)
    mc = mafCompare(m1 = apobec.maf, m2 = nonapobec.maf, m1Name = "Enriched", m2Name = "nonEnriched", minMut = 2, useCNV = TRUE)
    
    if ( nrow(mc$results[pval < 0.05] ) == 0 ) {
      cat("**No differentially mutated genes found.**")
      cat("<br/><br/><br/>")
    } else {
      try(plotApobecDiff(tnm = mutSign[[i]], maf = mafInfo[[i]], pVal = 0.05), silent = TRUE)
      cat("<br/><br/><br/>")
    }
  }
}
```

***

## Somatic interactions {.tabset .tabset-fade}

Many disease-causing genes in cancer co-occur or show strong exclusiveness in their mutation pattern. These **mutually exclusive** or **co-occurring sets of genes** were detected using a *pair-wise Fisher’s Exact test*. <span style="color:#ff0000">NOTE</span>: only the 25 most frequently mutated genes were considered in the analysis.
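
For a single gene pair, the test can be sketched with base R's `fisher.test()` on a 2x2 table of mutation status. The sample vectors `gene_a` and `gene_b` are made-up data; this shows the general idea rather than the exact procedure used by `maftools::somaticInteractions()`.

```r
# Illustrative only: pair-wise Fisher's exact test for two genes
# 1 = mutated, 0 = wild type, one entry per sample (made-up data)
gene_a <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
gene_b <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 1)

# 2x2 contingency table; fixed factor levels keep it 2x2 even if a category is empty
tab <- table(gene_a = factor(gene_a, levels = c(0, 1)),
             gene_b = factor(gene_b, levels = c(0, 1)))
tab

ft <- fisher.test(tab)
ft$p.value    # significance of the association
ft$estimate   # odds ratio: > 1 suggests co-occurrence, < 1 suggests mutual exclusivity
```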

### Pair-wise plot {.tabset}

Heatmap of observed pairwise mutation patterns. <span style="color:#48d47d">Green</span> colours denote preferential co-mutation, while <span style="color:#b37240">brown</span> colours indicate mutual exclusivity. 

#### Recurrently mutated genes

```{r som_interactions_plot, comment = NA, message=FALSE, warning=FALSE, results="asis"}
##### Run Somatic Interactions analysis for each dataset
##### Create a list to store somatic interactions results for individual datasets
somInter <- vector("list", length(mafFiles))
names(somInter) <- datasets.list

for ( i in 1:length(mafFiles) ) {
  
  cat(paste("\n\n <b>", datasets.list[i], "</b> \n\n", sep=" "))
  
  ##### Exclusive/co-occurrence event analysis on top 25 mutated genes
  if ( nrow(mafInfo[[i]]@data) > 1 ) {
    somInter[[i]] <- maftools::somaticInteractions(maf = mafInfo[[i]], top = 25, pvalue = c(0.05, 0.1), returnAll = TRUE)
    
  } else {
    cat(paste0("**Fewer than 2 non-synonymous variants were detected in ", datasets.list[i], " dataset**.\n\n\n"))
  }
  cat("<br/><br/>")
}
```

`r if ( goi_status ) { c("***") }`

`r if ( goi_status ) { c("#### Genes of interest") }`

```{r som_interactions_plot_goi, comment = NA, message=FALSE, warning=FALSE, results="asis", eval=goi_status}
##### Run Somatic Interactions analysis for each dataset
##### Create a list to store somatic interactions results (genes of interest) for individual datasets
somInter_goi <- vector("list", length(mafFiles))
names(somInter_goi) <- datasets.list

for ( i in 1:length(mafFiles) ) {
  
  cat(paste("\n\n <b>", datasets.list[i], "</b> \n\n", sep=" "))
  
  ##### Exclusive/co-occurrence event analysis on genes of interest (at most 25 genes)
  if ( length(genes_list_goi.nonsyn[[i]]) > 1 && length(genes_list_goi.nonsyn[[i]]) < 26 ) {
    somInter_goi[[i]] <- maftools::somaticInteractions(maf = mafInfo[[i]], genes = genes_list_goi.nonsyn[[i]], pvalue = c(0.05, 0.1), returnAll = TRUE)
    cat("<br/><br/>")
    
  } else if ( length(genes_list_goi.nonsyn[[i]]) > 25 ) {
    
    cat(paste0("<span style=\"color:#ff0000\">NOTE</span>: ", length(genes_list_goi.nonsyn[[i]]), " genes were provided for ", datasets.list[i], " dataset but only the first **25 genes** were analysed.\n\n"))
    
    somInter_goi[[i]] <- maftools::somaticInteractions(maf = mafInfo[[i]], genes = genes_list_goi.nonsyn[[i]][c(1:25)], pvalue = c(0.05, 0.1), returnAll = TRUE)
    
  } else {
    cat(paste0("**Less than 2 genes of interest have non-synonymous variants detected in ", datasets.list[i], " dataset**.\n\n\n"))
  }
}
```

***

### Pair-wise table {.tabset}

Table with *pair-wise Fisher’s Exact test* results indicating **mutual exclusivity** and **co-occurrence** of genes in individual dataset(s).

#### Recurrently mutated genes

```{r som_interactions_table, comment = NA, message=FALSE, warning=FALSE, results="asis"}
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  
  somInter.table <- as.data.frame(somInter[[i]])[, colnames(somInter[[i]]) %!in% c("00","11","01","10") ]
  
  widges.list[[i]] <- DT::datatable( data = somInter.table, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
      DT::formatStyle( columns = names(somInter.table), 'text-align' = 'center' ) %>%
      DT::formatRound(columns = c("pValue", "oddsRatio"), 1)
}
  
##### Print a list of htmlwidgets
widges.list
```

`r if ( goi_status ) { c("***") }`

`r if ( goi_status ) { c("#### Genes of interest") }`

```{r som_interactions_table_goi, comment = NA, message=FALSE, warning=FALSE, results="asis", eval=goi_status}
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
    
  cat(paste("\n\n <b>", datasets.list[i], "</b> \n\n", sep=" "))
    
  somInter.table <- as.data.frame(somInter_goi[[i]])[, colnames(somInter_goi[[i]]) %!in% c("00","11","01","10") ]
    
  if ( length(genes_list_goi.nonsyn[[i]]) > 1 ) {
    if ( length(genes_list_goi.nonsyn[[i]]) > 25 ) {
      cat(paste0("<span style=\"color:#ff0000\">NOTE</span>: ", length(genes_list_goi.nonsyn[[i]]), " genes were provided for ", datasets.list[i], " dataset but only the first **25 genes** were analysed.\n\n"))
    }
      
    widges.list[[i]] <- DT::datatable( data = somInter.table, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(somInter.table), 'text-align' = 'center' ) %>%
        DT::formatRound(columns = c("pValue", "oddsRatio"), 1)
      
  } else {
      cat(paste0("**Less than 2 genes of interest have non-synonymous variants detected in ", datasets.list[i], " dataset**.\n\n\n"))
  }
}
##### Print a list of htmlwidgets
widges.list
```

***

### Oncostrip {.tabset}

[Oncoplot]s illustrating **mutually exclusive** or **co-occurring** sets of **genes** (*P-value* < 0.05) in individual dataset(s).

#### Recurrently mutated genes

```{r som_interactions_oncostrip, comment = NA, message=FALSE, warning=FALSE, results="asis"}
##### Generate oncostrip plot for each dataset
for ( i in 1:length(mafFiles) ) {
  
  cat(paste("\n\n <b>", datasets.list[i], "</b> \n\n", sep=" "))
  
  genes <- unique( unlist(c(as.data.frame(somInter[[i]])[ somInter[[i]]$pValue < 0.05, "gene1" ], as.data.frame(somInter[[i]])[ somInter[[i]]$pValue < 0.05, "gene2" ])) )
  
  if ( length(genes) > 0 ) {
    maftools::oncostrip(maf = mafInfo[[i]], genes = genes, top = NULL, fontSize = 0.7, removeNonMutated = FALSE, draw_titv = params$draw_titv, clinicalFeatures = clinicalFeatures[[i]], sortByAnnotation = params$sort_by_annotation, colors = vc_cols, showTumorSampleBarcodes = params$samples_show, barcode_mar = 5, gene_mar = 6)
  } else {
    cat("**None of the detected gene pairs has a P-value < 0.05**.\n\n\n")
  }
  cat("<br/><br/>")
}
```

`r if ( goi_status ) { c("***") }`

`r if ( goi_status ) { c("#### Genes of interest") }`

```{r som_interactions_oncostrip_goi, comment = NA, message=FALSE, warning=FALSE, results="asis", eval=goi_status}
##### Generate oncostrip plot for each dataset
for ( i in 1:length(mafFiles) ) {
    
  cat(paste("\n\n <b>", datasets.list[i], "</b> \n\n", sep=" "))
    
  if ( length(genes_list_goi.nonsyn[[i]]) > 1 ) {
    if ( length(genes_list_goi.nonsyn[[i]]) > 25 ) {
        
      cat(paste0("<span style=\"color:#ff0000\">NOTE</span>: ", length(genes_list_goi.nonsyn[[i]]), " genes were provided for ", datasets.list[i], " dataset but only the first **25 genes** were analysed.\n\n"))
        
      genes <- unique( unlist(c(as.data.frame(somInter_goi[[i]])[ somInter_goi[[i]]$pValue < 0.05, "gene1" ], as.data.frame(somInter_goi[[i]])[ somInter_goi[[i]]$pValue < 0.05, "gene2" ])) )
        
      if ( length(genes) > 0 ) {
        maftools::oncostrip(maf = mafInfo[[i]], genes = genes, top = NULL, fontSize = 0.7, removeNonMutated = FALSE, draw_titv = params$draw_titv, clinicalFeatures = clinicalFeatures[[i]], sortByAnnotation = params$sort_by_annotation, colors = vc_cols, showTumorSampleBarcodes = params$samples_show, barcode_mar = 5, gene_mar = 6)
      } else {
        cat("**None of the detected gene pairs has a P-value < 0.05**.\n\n\n")
      }
      cat("<br/><br/>")
    } else {
        
      genes <- unique( unlist(c(as.data.frame(somInter_goi[[i]])[ , "gene1" ], as.data.frame(somInter_goi[[i]])[ , "gene2" ])) )
      if ( length(genes) > 0 ) {
        maftools::oncostrip(maf = mafInfo[[i]], genes = genes, top = NULL, fontSize = 0.7, removeNonMutated = FALSE, draw_titv = params$draw_titv, clinicalFeatures = clinicalFeatures[[i]], sortByAnnotation = params$sort_by_annotation)
      } else {
        cat(paste0("**None of detected gene pairs have P-value < 0.05**.\n\n\n"))
      }
      cat("<br/><br/>")
    }
  } else {
      cat(paste0("**Less than 2 genes of interest have non-synonymous variants detected in ", datasets.list[i], " dataset**.\n\n\n"))
  }
}
```

`r if ( runGistic ) { c("\n***\n## CN alterations {.tabset .tabset-fade}") }`

`r if ( runGistic ) { c("Summary of copy-number (CN) alterations data from  *[GISTIC](http://software.broadinstitute.org/cancer/software/genepattern/modules/docs/GISTIC_2.0){target=\"_blank\"}* programme.") }`

`r if ( runGistic ) { c("### Plots {.tabset .tabset-fade}") }`

`r if ( runGistic ) { c("#### Genome plot") }`

`r if ( runGistic ) { c("Genome plot(s) with segments highlighting significant (*q-value* < 0.1) amplification (<span style=\"color:#ff0000\">red</span> bars) and deletion (<span style=\"color:#0000ff\">blue</span> bars) regions. *Y-axis* shows the *G-scores* that consider the amplitude of the aberration as well as the frequency of its occurrence across samples.") }`

```{r gistic_genome_plot, comment = NA, message=FALSE, warning=FALSE, results="asis", eval = runGistic }
for ( i in 1:length(gisticFiles) ) {
  if ( gisticInfo[[i]]$status ) {
    
    ##### Save the genome plot as PNG for each dataset
    png( file = paste(outDir, "/MAF_gisticGenomePlot_", datasets.list[i], ".png", sep = ""), width = 1800, height = 1200, units = "px", res = 200 )
      
    ##### Drawing genome plot for each dataset
    plot.new()
    par(mar=c(4,4,2,0.5), oma=c(1.5,2,2,1))
    maftools::gisticChromPlot(gistic = gisticInfo[[i]]$summary, markBands = "all")
    while (!is.null(dev.list()))  dev.off()
    
    ##### Read in the genome plots PNG files
    cat("![](",paste(outDir, "/MAF_gisticGenomePlot_", datasets.list[i], ".png", sep = ""),")")
    cat("<br/><br/><br/>")
  } else {
   cat(paste("Information about copy-number alterations for", datasets.list[i], "is not available\n\n", sep=" "))
  }
}
```

`r if ( runGistic ) { c("\n***\n#### Bubble plot") }`

`r if ( runGistic ) { c("Plot(s) presenting significantly altered cytobands as a function of the number of samples in which each cytoband is altered (*x-axis*) and the number of genes it contains (*y-axis*). The size of each bubble reflects the *-log10* transformed *q-value* (FDR). Significantly amplified and deleted cytobands are presented as <span style=\"color:#ff0000\">red</span> and <span style=\"color:#0000ff\">blue</span> bubbles, respectively.") }`

```{r gistic_bubble_plot, comment = NA, message=FALSE, warning=FALSE, results="asis", eval = runGistic }
for ( i in 1:length(gisticFiles) ) {
  if ( gisticInfo[[i]]$status ) {
    
    ##### Save the bubble plot as PNG for each dataset
    png( file = paste(outDir, "/MAF_gisticBubblePlot_", datasets.list[i], ".png", sep = ""), width = 1800, height = 1200, units = "px", res = 200 )
    
    ##### Drawing bubble plot for each dataset
    plot.new()
    maftools::gisticBubblePlot(gistic = gisticInfo[[i]]$summary)
    while (!is.null(dev.list()))  dev.off()

    ##### Read in the bubble plot PNG files
    cat("![](",paste(outDir, "/MAF_gisticBubblePlot_", datasets.list[i], ".png", sep = ""),")")
    cat("<br/><br/><br/>")
  } else {
   cat(paste("Information about copy-number alterations for", datasets.list[i], "is not available\n\n", sep=" "))
  }
}
```

`r if ( runGistic ) { c("\n***\n#### Oncoplot") }`

`r if ( runGistic ) { c("Oncoplot(s) illustrating cytobands (labelled on the left side) altered across samples (columns). The frequency of alterations in the corresponding cytobands is indicated on the right side. Significantly amplified and deleted cytobands are presented as <span style=\"color:#ff0000\">red</span> and <span style=\"color:#02d653\">green</span> cells, respectively.") }`

```{r gistic_oncoplot, comment = NA, message=FALSE, warning=FALSE, results="asis", eval = runGistic }
for ( i in 1:length(gisticFiles) ) {
  if ( gisticInfo[[i]]$status ) {
    
    ##### Save the plot as PNG
    png( file = paste(outDir, "/MAF_gisticOncoplot_", datasets.list[i], ".png", sep = ""), width = 1800, height = 1200, units = "px", res = 200 )
      
    ##### Drawing oncoplot for each dataset
    plot.new()
    par(mar=c(4,4,2,0.5), oma=c(1.5,2,2,1))
    maftools::gisticOncoPlot(gistic = gisticInfo[[i]]$summary, showTumorSampleBarcodes = params$samples_show, barcode_mar = 5)
    while (!is.null(dev.list()))  dev.off()
    
    ##### Read in the oncoplot PNG files
    cat("![](",paste(outDir, "/MAF_gisticOncoplot_", datasets.list[i], ".png", sep = ""),")")
    cat("<br/><br/><br/>")
  } else {
   cat(paste("Information about copy-number alterations for", datasets.list[i], "is not available\n\n", sep=" "))
  }
}
```

`r if ( runGistic ) { c("\n***\n### Tables {.tabset .tabset-fade}") }`

`r if ( runGistic ) { c("#### Samples summary") }`

`r if ( runGistic ) { c("Table(s) summarising samples in individual datasets. Each table contains per-sample information (rows) about *number of amplified* and *deleted* regions (columns).") }`

```{r gistic_sample_summary, comment = NA, message=FALSE, warning=FALSE, eval = runGistic}
##### Present a sample table in the html report
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable( data = maftools::getSampleSummary(gisticInfo[[i]]$summary), caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(maftools::getSampleSummary(gisticInfo[[i]]$summary)), 'text-align' = 'center' )
}

##### Print a list of htmlwidgets
widges.list

##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()
```

`r if ( runGistic ) { c("\n***\n#### Genes summary") }`

`r if ( runGistic ) { c("Table(s) summarising genes within detected significantly amplified and deleted regions in individual datasets. Each table contains per-gene information (rows) about *number of amplified* and *deleted* regions (columns). The last column contains the *number of samples with alterations* in the corresponding gene.") }`

```{r gistic_gene_summary, comment = NA, message=FALSE, warning=FALSE, eval = runGistic}
##### Present a gene table in the html report
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- DT::datatable(data = maftools::getGeneSummary(gisticInfo[[i]]$summary), caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','FixedColumns','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, fixedColumns = list(leftColumns = 2), deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(maftools::getGeneSummary(gisticInfo[[i]]$summary)), 'text-align' = 'center' )
}

##### Print a list of htmlwidgets
widges.list
```

***

## Drug-gene interactions {.tabset .tabset-fade}

Drug–gene interactions and gene druggability information compiled from [Drug Gene Interaction database](http://www.dgidb.org/){target="_blank"} (see paper [DGIdb: mining the druggable genome](https://www.ncbi.nlm.nih.gov/pubmed/24122041){target="_blank"} by Griffith *et al*).

### Plot

Plot(s) illustrating potentially druggable gene categories, along with up to 5 top genes involved in each of them.
 
```{r drug_gene_interactions_plot, comment = NA, message=FALSE, warning=FALSE, results = 'asis'}
##### Check for drug–gene interactions and gene druggability information compiled from Drug Gene Interaction database (http://www.dgidb.org/).
##### Create a list to store drug-gene interaction info
dgi <- vector("list", length(mafFiles))
names(dgi) <- datasets.list

for ( i in 1:length(mafFiles) ) {
  
  cat(paste("\n\n <b>", datasets.list[i], "</b> \n\n", sep=" "))
  
  dgi[[i]] = drugInteractions(maf = mafInfo[[i]], top = top_genes_no[[i]], fontSize = 0.75)
  cat("<br/><br/>")
}
```

***

### Table

Table(s) listing drugs known or reported to interact with potentially druggable genes.
 
```{r drug_gene_interactions_table, comment = NA, message=FALSE, warning=FALSE}
##### Extract results for known/reported drugs to interact with specified genes
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  dgi.table <- as.data.frame(dgi[[i]])[, colnames(dgi[[i]]) %!in% "gene_long_name" ]
  
  widges.list[[i]] <- DT::datatable(data = dgi.table, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;', htmltools::strong(datasets.list[i])), filter = "top", extensions = c('Buttons','FixedColumns','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, fixedColumns = list(leftColumns = 2), deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(dgi.table), 'text-align' = 'center' )
}

##### Print a list of htmlwidgets
widges.list

```

`r if ( runMafCompare ) { c("\n***\n## Cohorts comparison {.tabset .tabset-fade}") }`

`r if ( runMafCompare ) { c(paste0("The two cohorts, **", gsub(",", ", ", params$datasets) , "**, were compared to evaluate differences in their mutation patterns. *Fisher’s exact test* was performed to detect differentially mutated genes.")) }`

```{r runMafCompare, comment = NA, message=FALSE, warning=FALSE, eval = runMafCompare }
##### Consider only genes mutated in at least 5 samples in one of the cohorts, to avoid bias due to genes mutated in a single sample
mafCompare.res <- maftools::mafCompare(m1 = mafInfo[[1]], m2 = mafInfo[[2]], m1Name = datasets.list[1], m2Name = datasets.list[2], minMut = 5)
```

`r if ( runMafCompare ) { c("### Forest plot") }`

`r if ( runMafCompare ) { c(paste0("*[Forest plot](https://en.wikipedia.org/wiki/Forest_plot){target=\"_blank\"}* presenting *Fisher’s exact test* results for significantly (*P-value* < ", params$maf_comp_p, " and *FDR* < ", params$maf_comp_fdr, ") differentially mutated genes. *Y-axis* shows differentially mutated genes and *X-axis* shows corresponding log10 converted *odds ratio* values. \\* *p* < 0.05;  \\*\\* *p* < 0.01;  \\*\\*\\* *p* < 0.001")) }`

```{r forest_plot, comment = NA, message=FALSE, warning=FALSE, eval = runMafCompare }
##### Draw a forest plot for genes passing the user-defined P-value threshold
try(maftools::forestPlot(mafCompareRes = mafCompare.res, pVal = params$maf_comp_p, color = c('royalblue', 'maroon'), genefontSize = 0.7), silent = TRUE)
```

`r if ( runMafCompare ) { c("\n***\n### Co-onco plots") }`

`r if ( runMafCompare ) { c(paste0("Side by side [Oncoplot]s illustrating different types of mutations observed in significantly (*P-value* < ", params$maf_comp_p, " and *FDR* < ", params$maf_comp_fdr, ") differentially mutated genes across samples in individual datasets.")) }`

```{r Coonco_plots, comment = NA, message=FALSE, warning=FALSE, eval = runMafCompare }
##### Select differentially mutated genes passing the user-defined P-value threshold
genes = mafCompare.res$results$Hugo_Symbol[ mafCompare.res$results$pval < params$maf_comp_p ]

maftools::coOncoplot(m1 = mafInfo[[1]], m2 = mafInfo[[2]], m1Name = datasets.list[1], m2Name = datasets.list[2], genes = genes, removeNonMutated = TRUE, clinicalFeatures1 = clinicalFeatures[[1]], clinicalFeatures2 = clinicalFeatures[[2]], sortByAnnotation1 = params$sort_by_annotation, sortByAnnotation2 = params$sort_by_annotation, titleFontSize = 1.2, colors = vc_cols, showSampleNames = params$samples_show)
```

`r if ( runMafCompare ) { c("\n***\n### Table") }`

`r if ( runMafCompare ) { c("Table with *Fisher’s exact test* results including *P-values*, *odds ratio* (*OR*), *lower/upper confidence interval* (*CI*) and *false discovery rate* (*FDR*) values.") }`

```{r maf_compare_table, comment = NA, message=FALSE, warning=FALSE, eval = runMafCompare }
DT::datatable(data = mafCompare.res$results, caption = htmltools::tags$caption(style = 'caption-side: top; text-align: left;'), filter = "top", extensions = c('Buttons','FixedColumns','Scroller'), options = list(pageLength = 10, dom = 'Bfrtip', buttons = c('excel', 'csv', 'pdf','copy','colvis'), scrollX = TRUE, fixedColumns = list(leftColumns = 2), deferRender = TRUE, scrollY = 200, scroller = TRUE), width = 800,  escape = FALSE ) %>%
        DT::formatStyle( columns = names(mafCompare.res$results), 'text-align' = 'center' ) %>%
        DT::formatRound(columns = c("or", "ci.up", "ci.low"), 1) %>%
        DT::formatRound(columns = c("pval", "adjPval"), 6)
```

```{r maf_compare_table_legend, comment = NA, message=FALSE, warning=FALSE, eval = runMafCompare, results = 'asis'}
cat("\n<details>\n")
cat("\n<summary>Table legend</summary>\n")
cat("*pval* - *P-value*\n")
cat("*or* - *odds ratio* (*OR*)\n")
cat("*ci.up* - *upper confidence interval* (*CI*)\n")
cat("*ci.low* - *lower CI*\n")
cat("*adjPval* - *false discovery rate* (*FDR*)")
cat("\n</details>\n")
```
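
As a rough illustration of how the columns in the table above arise, the sketch below runs *Fisher's exact test* on a single hypothetical 2x2 table (mutated vs wild-type samples in two cohorts) using base R. The counts, gene names and additional *P-values* are made up for illustration, and the Benjamini-Hochberg adjustment is an assumption about how *adjPval* is derived; `maftools::mafCompare` may differ in detail.

```r
# Toy 2x2 contingency table for one gene (hypothetical counts, not from this report)
#            mutated  wildtype
# cohort1       12        8
# cohort2        3       17
gene.counts <- matrix(c(12, 3, 8, 17), nrow = 2,
                      dimnames = list(c("cohort1", "cohort2"), c("mutated", "wildtype")))

# Fisher's exact test returns the P-value, odds ratio estimate and its confidence interval
ft <- stats::fisher.test(gene.counts, conf.int = TRUE, conf.level = 0.95)

res <- data.frame(
  Hugo_Symbol = "GENE_A",            # hypothetical gene name
  pval   = ft$p.value,
  or     = unname(ft$estimate),
  ci.low = ft$conf.int[1],
  ci.up  = ft$conf.int[2]
)

# Adjust P-values across all tested genes (assumed Benjamini-Hochberg FDR);
# the P-values for GENE_B and GENE_C are invented to make the adjustment visible
pvals   <- c(GENE_A = ft$p.value, GENE_B = 0.20, GENE_C = 0.65)
adjPval <- stats::p.adjust(pvals, method = "fdr")
```

Because the adjustment depends on the number of genes tested, the same raw *P-value* can yield a different *FDR* when the set of compared genes changes.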

***

## Mutation maps {.tabset .tabset-fade}

```{r pchange_field_check, comment = NA, message=FALSE, warning=FALSE}
##### Check if the protein change field is present in any of the MAFs
pchangeStatus <- FALSE

for ( i in 1:length(mafFiles) ) {
  pchange = c('HGVSp_Short', 'Protein_Change', 'AAChange')
        
  ##### Define the column with protein change info
  pchange = pchange[pchange %in% colnames(mafInfo[[i]]@data)]
  
  ##### Check if the protein change field is not empty
  if ( any(!is.na(as.data.frame(mafInfo[[i]]@data)[ , pchange  ]))  ) {
    pchangeStatus <- TRUE
    
    ##### Create directory for pdf files
    mutationMapsDir <- paste0(normalizePath(outDir), "/", "MAF_mutation_maps")
    
    if ( !file.exists(mutationMapsDir) ){
      dir.create(mutationMapsDir, recursive=TRUE)
    }
  }
}
```

```{r top_mutated_genes, comment = NA, message=FALSE, warning=FALSE}
##### Get the most frequently mutated genes across all datasets
mutRate.top.datasets <- NULL

for ( i in 1:length(mafFiles) ) {
  genes.top <- maftools::getGeneSummary(mafInfo[[i]])[c(1:top_genes_no[[i]]), Hugo_Symbol]
  
  ##### Add genes of interest to the list (if specified)
  if ( params$genes_list != "none" ){
    genes.top <- unique(c(genes.top, genes_list_goi.nonsyn[[i]]))
  }
    
  sampleSize <- as.numeric(mafInfo[[i]]@summary[ID %in% "Samples", summary])
  
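  ##### Take gene names from the filtered summary so that names and values stay aligned
  geneSummary.top <- maftools::getGeneSummary(x = mafInfo[[i]])[ Hugo_Symbol %in% genes.top ]
  mutRate.top <- round(geneSummary.top$MutatedSamples/sampleSize * 100, digits = 2)
  names(mutRate.top) <- geneSummary.top$Hugo_Symbol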
  
  mutRate.top.datasets <- c(mutRate.top.datasets, mutRate.top)
}

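##### Sort genes by mutation rate and keep the highest rate recorded for each gene
mutRate.top.datasets <- sort(mutRate.top.datasets, decreasing = TRUE)
mutRate.top.datasets <- mutRate.top.datasets[!duplicated(names(mutRate.top.datasets))]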
```

```{r prot_structre, comment = NA, message=FALSE, warning=FALSE, eval=pchangeStatus}
##### Get list of proteins for which structure is available within maftools
gff = system.file('extdata', 'protein_domains.RDs', package = 'maftools')
gff = readRDS(file = gff)
```

Many oncogenes have preferential sites that are mutated more often than any other locus. These sites are considered mutational hot-spots, and lollipop plots can be used to display them along with the rest of the mutations. The lollipop plot(s) presented here show mutation sites on the protein structure for the **`r sum(unlist(top_genes_no))` most frequently mutated genes** (mutated in at least `r gsub(",", "% and ", params$genes_min)`% of patients, respectively) across all datasets. <span style="color:#ff0000">NOTE</span>: plots are available only for MAF file(s) containing a field with amino acid change details. The longest transcript is used if multiple transcripts are available.
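
The ranking of the most frequently mutated genes across datasets can be sketched as follows. This is a minimal base R illustration, assuming the mutation rate is the percentage of samples in a dataset carrying a mutation in the gene; the cohort names, gene counts and sample sizes are hypothetical and the exact logic used in this report may differ.

```r
# Hypothetical per-dataset gene summaries: number of mutated samples per gene
gene.summary <- list(
  cohort1 = data.frame(Hugo_Symbol = c("KRAS", "TP53", "SMAD4"), MutatedSamples = c(9, 7, 2)),
  cohort2 = data.frame(Hugo_Symbol = c("TP53", "KRAS", "CDKN2A"), MutatedSamples = c(15, 12, 5))
)

# Hypothetical number of samples in each dataset
sample.size <- c(cohort1 = 10, cohort2 = 20)

# Percentage of mutated samples per gene, collected across datasets
mut.rate <- NULL
for ( d in names(gene.summary) ) {
  rate <- round(gene.summary[[d]]$MutatedSamples / sample.size[[d]] * 100, digits = 2)
  names(rate) <- gene.summary[[d]]$Hugo_Symbol
  mut.rate <- c(mut.rate, rate)
}

# Sort by mutation rate and keep the highest rate recorded for each gene
mut.rate <- sort(mut.rate, decreasing = TRUE)
mut.rate <- mut.rate[!duplicated(names(mut.rate))]
```

Keeping one entry per gene means a gene reported in several datasets is plotted once, ranked by its highest observed mutation rate.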

```{r lollipop_plots, echo=FALSE, comment = NA, message=FALSE, warning=FALSE, results="asis", eval=FALSE}
##### Generate lollipop plot for each dataset for top mutated genes
if ( pchangeStatus ) {
  
  output_plot <- list()

  for( i in 1:length(mutRate.top.datasets) ){
    cat("\n### ", names(mutRate.top.datasets)[i], "\n")
    if ( nrow(gff[HGNC %in% names(mutRate.top.datasets)[i]]) != 0 ) {
      try(lollipops.datasets(mafInfo = mafInfo, datasets  = datasets.list, gene  = names(mutRate.top.datasets)[i]), silent = TRUE)
      cat("\n\n***\n")
    } else {
      cat(paste("The protein structure for protein encoded by", names(mutRate.top.datasets)[i], "is not available\n\n", sep=" "))
      cat("\n***\n")
    }
  }
  
  ##### Clean the space
  rm(list = ls(pattern='^output*'))

} else {
  cat("\n***\n")
  cat("\nThis section was skipped since the field with **amino acid changes details** in provided MAF(s) is **NOT AVAILABLE**!\n")
  cat("\n***\n")
}
```

## Mutation details {.tabset .tabset-fade}

Tables with detailed information, as provided in the corresponding MAF file(s), for the **`r sum(unlist(top_genes_no))` most frequently mutated genes** (mutated in at least `r gsub(",", "% and ", params$genes_min)`% of patients, respectively) across all datasets, as well as for the **genes of interest** (if specified) in individual dataset(s).

<details>
<summary>Variants consequence definitions</summary>
<font size="2">

* **Non-synonymous variants** are defined as variants with the following consequences: *`r paste(params$nonSyn_list, collapse = ", ")`*.
* **Silent variants** are defined as variants with the following consequences: *`r paste(silent_categories, collapse = ", ")`*.

</font> 
</details>
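
The split into non-synonymous and silent variants described above can be sketched in base R as follows. This is an illustrative example under stated assumptions: the variants are hypothetical, `Variant_Classification` is the standard MAF column holding the consequence, and any consequence not on the non-synonymous list is treated as silent.

```r
# Consequences treated as non-synonymous (mirrors the default 'nonSyn_list' parameter)
nonsyn.list <- c("Frame_Shift_Del", "Frame_Shift_Ins", "Splice_Site", "Translation_Start_Site",
                 "Nonsense_Mutation", "Nonstop_Mutation", "In_Frame_Del", "In_Frame_Ins",
                 "Missense_Mutation")

# Hypothetical variants, for illustration only
variants <- data.frame(
  Hugo_Symbol = c("KRAS", "TP53", "TP53", "SMAD4"),
  Variant_Classification = c("Missense_Mutation", "Silent", "Nonsense_Mutation", "Intron"),
  stringsAsFactors = FALSE
)

# Anything not listed as non-synonymous is considered silent
variants$type <- ifelse(variants$Variant_Classification %in% nonsyn.list, "nonsynonymous", "silent")

# Number of variants in each group
table(variants$type)
```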

### Recurrently mutated genes {.tabset .tabset-fade}

#### Non-synonymous

```{r details_mut_nonsynon, comment = NA, message=FALSE, warning=FALSE}
##### Provide detailed information for each dataset for the user-defined number of top mutated genes
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- mut.details.datasets(mafInfo[i], datasets.list[i], maftools::getGeneSummary(mafInfo[[i]])$Hugo_Symbol[1:top_genes_no[[i]]], type = "nonsynonymous")
}

##### Print a list of htmlwidgets
widges.list

##### Add extra lines to make sure that this section doesn't overlap with the next one
cat("\n\n\n")
```

***

#### Silent

```{r details_mut_silent, comment = NA, message=FALSE, warning=FALSE}
##### Provide detailed information for each dataset for the user-defined number of top mutated genes
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- mut.details.datasets(mafInfo[i], datasets.list[i], maftools::getGeneSummary(mafInfo[[i]])$Hugo_Symbol[1:top_genes_no[[i]]], type = "silent")
}

##### Print a list of htmlwidgets
widges.list

##### Add extra lines to make sure that this section doesn't overlap with the next one
cat("\n\n\n")
```

`r if ( goi_status ) { c("***") }`

`r if ( goi_status ) { c("### Genes of interest {.tabset .tabset-fade}") }`

`r if ( goi_status ) { c("#### Non-synonymous") }`

```{r details_mut_goi_nonsynon, comment = NA, message=FALSE, warning=FALSE, eval=goi_status}
##### Provide detailed information for each dataset for the genes of interest
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- mut.details.datasets(mafInfo[i], datasets.list[i], goi[[i]], type = "nonsynonymous")
}

##### Print a list of htmlwidgets
widges.list

##### Add extra lines to make sure that this section doesn't overlap with the next one
cat("\n\n\n")
```

`r if ( goi_status ) { c("***") }`

`r if ( goi_status ) { c("#### Silent") }`

```{r details_mut_goi_silent, comment = NA, message=FALSE, warning=FALSE, eval=goi_status }
##### Provide detailed information for each dataset for the genes of interest
##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

for ( i in 1:length(mafFiles) ) {
  widges.list[[i]] <- mut.details.datasets(mafInfo[i], datasets.list[i], goi[[i]], type = "silent")
}

##### Print a list of htmlwidgets
widges.list

##### Add extra lines to make sure that this section doesn't overlap with the next one
cat("\n\n\n")
```

***

## Heatmaps {.tabset .tabset-fade}

### Samples

Interactive heatmap(s) with colours indicating low (blue) and high (yellow) numbers of *various mutation types* (columns) detected in the corresponding *samples* (rows). Samples are ordered by the number of mutations to facilitate identification of individuals with extreme mutation burden.
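
The heatmap input can be thought of as a samples-by-mutation-type count matrix. The sketch below shows one way to build it in base R from a per-sample summary table; the sample IDs and counts are hypothetical, and the column names `Tumor_Sample_Barcode` and `total` are assumed to match the sample summary used in this report.

```r
# Hypothetical per-sample counts of mutation types (stand-in for a sample summary table)
sample.summary <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S2", "S3"),
  Missense_Mutation = c(10, 42, 3),
  Nonsense_Mutation = c(1, 5, 0),
  total = c(11, 47, 3),
  stringsAsFactors = FALSE
)

# Order samples by the total number of mutations (highest first)
sample.summary <- sample.summary[order(sample.summary$total, decreasing = TRUE), ]

# Use sample IDs as row names and keep only the mutation type columns
rownames(sample.summary) <- sample.summary$Tumor_Sample_Barcode
heatmap.input <- as.matrix(subset(sample.summary, select = -c(Tumor_Sample_Barcode, total)))
```

Dropping the `total` column keeps the colour scale focused on individual mutation types, since totals would otherwise dominate the range.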

```{r sample_summary_heatmap, comment = NA, message=FALSE, warning=FALSE, fig.width = 6, fig.height = 9}
suppressMessages(library(plotly))
suppressMessages(library(heatmaply))

##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

##### Display samples summary in a form of interactive heatmap
for ( i in 1:length(mafFiles) ) {

  sampleSummary <- data.frame(maftools::getSampleSummary(mafInfo[[i]]))
  rownames(sampleSummary) <- sampleSummary[,"Tumor_Sample_Barcode"]
  sampleSummary <- subset(sampleSummary, select=-c(Tumor_Sample_Barcode, total))

  ##### Generate interactive heatmap
  widges.list[[i]] <- heatmaply(sampleSummary, main = datasets.list[i], Rowv=NULL, Colv=NULL, scale="none", dendrogram="none", trace="none", hide_colorbar = FALSE, fontsize_row = 8, label_names=c("Sample","Mutation_type","Count")) %>%
  layout(width  = 900, height = 400, margin = list(l=150, r=10, b=80, t=50, pad=2), font = list(size=16), xaxis = list(tickfont=list(size=10)), yaxis = list(tickfont=list(size=10)))

  ##### Save the heatmap as html (PLOTLY)
  saveWidgetFix(as_widget(widges.list[[i]]), paste0(outDir, "/MAF_sample_summary_heatmap_", datasets.list[i], ".html"), selfcontained = TRUE)
}

##### Detach plotly package. Otherwise it clashes with other graphics devices
detach("package:heatmaply", unload=FALSE)
detach("package:plotly", unload=FALSE)

##### Print a list of htmlwidgets
widges.list
```

***

### Genes

Interactive heatmap(s) with colours indicating low (blue) and high (yellow) numbers of *various mutation types* (columns) detected in the corresponding *genes* (rows). Genes are ordered by the number of reported mutations. The *total number of mutations* in individual genes, as well as the *number of samples with mutations*, are also presented in the last three columns. <span style="color:#ff0000">NOTE</span>: for readability, only the **`r sum(unlist(top_genes_no))` most frequently mutated genes** (mutated in at least `r gsub(",", "% and ", params$genes_min)`% of patients, respectively) are presented.

```{r gene_summary_heatmap, comment = NA, message=FALSE, warning=FALSE, fig.width = 6, fig.height = 8}
suppressMessages(library(plotly))
suppressMessages(library(heatmaply))

##### Create a list for htmlwidgets
widges.list <- htmltools::tagList()

##### Display genes summary in a form of interactive heatmap
for ( i in 1:length(mafFiles) ) {

  geneSummary <- data.frame(maftools::getGeneSummary(mafInfo[[i]])[1:top_genes_no[[i]],])
  rownames(geneSummary) <- geneSummary[,"Hugo_Symbol"]
  geneSummary <- subset(geneSummary, select=-c(Hugo_Symbol))

  ##### Generate interactive heatmap
  widges.list[[i]]<- heatmaply(geneSummary, main = datasets.list[i], Rowv=NULL, Colv=NULL, scale="none", dendrogram="none", trace="none", hide_colorbar = FALSE, fontsize_row = 8, label_names=c("Gene","Mutation_type","Count")) %>%
  layout(width  = 900, height = 400, margin = list(l=150, r=10, b=80, t=50, pad=2), font = list(size=16), xaxis = list(tickfont=list(size=10)), yaxis = list(tickfont=list(size=10)))

  ##### Save the heatmap as html (PLOTLY)
  saveWidgetFix(as_widget(widges.list[[i]]), paste0(outDir, "/MAF_gene_summary_heatmap_", datasets.list[i], ".html"), selfcontained = TRUE)
}

##### Detach plotly package. Otherwise it clashes with other graphics devices
detach("package:heatmaply", unload=FALSE)
detach("package:plotly", unload=FALSE)

##### Print a list of htmlwidgets
widges.list
```

<!--This step is skipped for the moment as it is too computationally intensive.

***

### Pair-wise comparisons

Interactive heatmap of *adjusted p-values* from *Fisher’s exact test* performed for all possible pair-wise comparisons to find differentially mutated genes between queried datasets and to aid identification of global differences in mutation patterns in the corresponding MAF files. The rows and columns represent genes and dataset comparisons, respectively, and heatmap colours indicate the *Fisher’s exact test* *adjusted p-values*. The colour key is on the left-hand side. **Low *p-values*** (yellow) **indicate differentially mutated genes between corresponding datasets**. Genes are clustered to facilitate the identification of possible differences between individual datasets. Note, only the overlap of mutated genes reported across all MAF files is presented. This section is available for multiple MAFs and will be skipped if only one MAF file is provided.

--> 

```{r compare_mafs, message = FALSE, warning=FALSE, comment = NA, fig.width = 6, fig.height = 12, echo = FALSE, eval = FALSE}
##### Skip this section if only one MAF file is provided
if ( length(mafFiles) != 1 ) {
  
  suppressMessages(library(plotly))
  suppressMessages(library(heatmaply))
  
  #####  Create matrix of possible comparisons
  comb <- combn(levels(factor(datasets.list)), 2)
  
  #####  Get number of possible comparisons using the following formula:
  #
  # n!/((n-r)!(r!))
  #
  # n = the number of classes to compare
  # r = the number of elements for single comparison
  #
  ################################################################################
  
  combNo <- ncol(comb) # equals n!/((n-r)!(r!)) with r = 2 for n unique dataset names
  combNames <- NULL
  mafCompare.res <- vector("list", combNo)
  
  ##### Perform pair-wise datasets MAFs comparisons
  for (i in 1:combNo) {
  
    combNames[i] <- paste(comb[1,i], comb[2,i], sep=" vs ")
  
    mafCompare.res[[i]] <- maftools::mafCompare(m1 = mafInfo[[comb[1,i]]], m2 = mafInfo[[comb[2,i]]], m1Name = comb[1,i], m2Name = comb[2,i], minMut = 0)$results
  
    ##### Sort the data-frame by gene symbol
    mafCompare.res[[i]] <- mafCompare.res[[i]][order(mafCompare.res[[i]]$Hugo_Symbol)]
  
    rownames(mafCompare.res[[i]]) <- mafCompare.res[[i]]$Hugo_Symbol
  }
  
  ##### Extract the intersection of genes for the heatmap
  common_genes <-  Reduce(intersect, lapply(mafCompare.res, row.names))
  mafCompare.res.common <-  lapply(mafCompare.res, function(x) { x[row.names(x) %in% common_genes,] })
  
  ##### Extract adjusted p-values for each comparison, use gene names for row names
  mafCompare.res.common.p <- do.call(cbind, lapply(mafCompare.res.common, function(x) x[, c("Hugo_Symbol", "adjPval")]))
  common_genes <- mafCompare.res.common.p$Hugo_Symbol
  mafCompare.res.common.p <- mafCompare.res.common.p[,-c("Hugo_Symbol")]
  mafCompare.res.common.p <- as.matrix(mafCompare.res.common.p)
  colnames(mafCompare.res.common.p) <- combNames
  rownames(mafCompare.res.common.p) <- common_genes
  
  ##### Cluster genes
  hr <- hclust(as.dist(dist(mafCompare.res.common.p, method="euclidean")), method="ward.D")
  
  ##### Generate interactive heatmap
  p <- heatmaply(mafCompare.res.common.p, Rowv=as.dendrogram(hr), Colv=NULL, viridis(n=256, alpha = 1, begin = 1, end = 0, option = "viridis"), scale="none", dendrogram="row", trace="none", hide_colorbar = FALSE, fontsize_row = 8, label_names=c("Gene","Comparison","P-value")) %>%
    layout(width  = 900, height = 900, margin = list(l=150, r=10, b=350, t=50, pad=2), font = list(size=16), xaxis = list(tickfont=list(size=10)), yaxis = list(tickfont=list(size=10)))
  p
  
  ##### Save the heatmap as html (PLOTLY)
  saveWidgetFix(as_widget(p), paste0(outDir, "/MAF_pair-wise_comparisons_heatmap.html"), selfcontained = TRUE)

  ##### Detach plotly package. Otherwise it clashes with other graphics devices
  detach("package:heatmaply", unload=FALSE)
  detach("package:plotly", unload=FALSE)
  
} else {
  cat(paste("This section was skipped since MAF for only one dataset (", datasets.list[1], ") was provided.\n\n", sep=" "))
}
```

```{r maf_visualisation_pdf, comment = NA, message=FALSE, warning=FALSE}
###### Save plots into PDF files
###### Generate separate file with plots for each dataset
for ( i in 1:length(mafFiles) ) {

  pdf( file = paste(outDir, "/MAF_summary_", datasets.list[i], ".pdf", sep = "") )

  ##### Plotting MAF summary
  par(mar=c(4,4,2,0.5), oma=c(1.5,2,2,1))
  plotmafSummary(maf = mafInfo[[i]], top = top_genes_no[[i]], rmOutlier = TRUE, addStat = 'median', dashboard = TRUE, titvRaw = FALSE)
  mtext("MAF summary", outer=TRUE,  cex=1, line=-0.5)

  ##### Drawing oncoplots for the top mutated genes in each dataset (requires > 1 gene with non-synonymous variants)
  if ( top_genes_no[[i]] > 1 ) {

    plot.new()
    par(mar=c(1,4,2,0.5), oma=c(1.5,2,2,1))
    oncoplot(maf = mafInfo[[i]], top = top_genes_no[[i]], fontSize = 0.7, colbar_pathway = FALSE, removeNonMutated = FALSE, draw_titv = params$draw_titv, clinicalFeatures = clinicalFeatures[[i]], colors = vc_cols, includeColBarCN = FALSE, showTumorSampleBarcodes = params$samples_show, sampleOrder = sampleOrder, sortByAnnotation = params$sort_by_annotation)
  }
  
  ##### Drawing oncoplots for additional genes of interest (if specified)
  ##### Require more than one gene of interest with non-synonymous mutations
  if ( goi_status && length(goi[[i]]) > 1 && length(top_genes_goi[[i]]) > 1 ) {

    plot.new()
    par(mar=c(1,4,2,0.5), oma=c(1.5,2,2,1))
    maftools::oncoplot(maf = mafInfo[[i]], genes = goi[[i]], fontSize = 0.7, colbar_pathway = FALSE, removeNonMutated = FALSE, draw_titv = params$draw_titv, clinicalFeatures = clinicalFeatures[[i]], colors = vc_cols, includeColBarCN = FALSE, showTumorSampleBarcodes = params$samples_show, barcode_mar = 5, keepGeneOrder = params$genes_keep_order, sampleOrder = sampleOrder, sortByAnnotation = params$sort_by_annotation)
  }
  
  ##### Drawing oncoplots for genes involved in the pathways of interest (if specified)
  if ( pathways_status ) {
    plot.new()
    par(mar=c(1,4,2,0.5), oma=c(1.5,2,2,1))
    maftools::oncoplot(maf = mafInfo.pathways[[i]], fontSize = 0.7, colbar_pathway = FALSE, removeNonMutated = FALSE, draw_titv = params$draw_titv, clinicalFeatures = clinicalFeatures[[i]], colors = vc_cols, includeColBarCN = FALSE, showTumorSampleBarcodes = params$samples_show, barcode_mar = 5, sampleOrder = sampleOrder, sortByAnnotation = params$sort_by_annotation, pathways = params$pathways, gene_mar = 6)
  }
  
  ##### Drawing distribution plots of the transitions and transversions
  titv.info = titv(maf = mafInfo[[i]], plot = FALSE, useSyn = TRUE)

  try(plotTiTv(res = titv.info), silent = TRUE)
  mtext("Transition and transversions distribution", outer=TRUE,  cex=1, line=-1.5)

  ##### Compare mutation load against TCGA cohorts
  if ( gisticInfo[[i]]$status ) {
    tcgaCompare(maf = mafInfo_tcgaCompare[[i]], cohortName = datasets.list[i], primarySite=TRUE)
  } else {
    tcgaCompare(maf = mafInfo[[i]], cohortName = datasets.list[i], primarySite=TRUE)
  }
  mtext("Mutation load in TCGA cohorts", outer=TRUE,  cex=1, line=-0.5)

  ##### Close the PDF device for this dataset
  invisible(dev.off())
}

```

***

## Addendum

<details>
<summary>Parameters</summary>
<font size="2">

```{r params_info, comment = NA}
for ( i in 1:length(params) ) {
  cat(paste("Parameter: ", names(params)[i], "\nValue: ", paste(unlist(params[i]), collapse = ","), "\n\n", sep=""))
}
```

</font>
</details>

<details>
<summary>Session info</summary>
<font size="2">

```{r sessioninfo, comment = NA}
devtools::session_info()
```

</font>
</details>