diff --git a/docs/search.json b/docs/search.json index 7b5c676..131c395 100644 --- a/docs/search.json +++ b/docs/search.json @@ -46,14 +46,42 @@ "href": "tutorials/forestplot.html#prepare-setup", "title": "Tutorial: The prettiest forestplot", "section": "Prepare setup", - "text": "Prepare setup\nWe will first import the necessary packages:\n\n# Load packages\nlibrary(GAMBLR.data)\nlibrary(GAMBLR.viz)\nlibrary(dplyr)\n\nNext, we will get some data to display. The metadata is expected to be a data frame with one required column: sample_id and another column that will contain sample annotations according to the comparison group. In this example, we will use as example the data set and variant calls from the study that identified genetic subgroup of Burkitt lymphoma (BL).\n\nmetadata <- get_gambl_metadata() %>%\n filter(cohort == \"BL_Thomas\")\n\nNext, we will obtain the coding mutations that will be used in the plotting. The data is a data frame in a standartized maf format.\n\nmaf <- get_ssm_by_samples(\n these_samples_metadata = metadata,\n tool_name = \"publication\",\n projection = \"hg38\"\n)\n\n# How does it look like?\ndim(maf)\n\n[1] 47043 45\n\nhead(maf) %>%\n select(\n Tumor_Sample_Barcode,\n Hugo_Symbol,\n Variant_Classification\n )\n\n Tumor_Sample_Barcode Hugo_Symbol Variant_Classification\n1: Akata CPTP Missense_Mutation\n2: Akata FNDC10 Missense_Mutation\n3: Akata MORN1 Missense_Mutation\n4: Akata MEGF6 Missense_Mutation\n5: Akata NPHP4 Silent\n6: Akata GPR157 Missense_Mutation\n\n\nFor the purpose of this tutorial, we will focus on a small subset of genes known to be significantly mutated in BL.\n\ngenes <- lymphoma_genes_bl_v_latest$Gene\nhead(genes)\n\n[1] \"ALPK2\" \"ARHGEF1\" \"ARID1A\" \"B2M\" \"BACH2\" \"BCL10\" \n\n\nNow we have our metadata and mutations we want to explore, so we are ready to start visualizing the data." 
+ "text": "Prepare setup\nWe will first import the necessary packages:\n\n# Load packages\nlibrary(GAMBLR.data)\nlibrary(GAMBLR.utils)\nlibrary(GAMBLR.viz)\nlibrary(tibble)\nlibrary(dplyr)\n\nNext, we will get some data to display. The metadata is expected to be a data frame with one required column: sample_id and another column that will contain sample annotations according to the comparison group. In this example, we will use as example the data set and variant calls from the study that identified genetic subgroup of Burkitt lymphoma (BL).\n\nmetadata <- get_gambl_metadata() %>%\n filter(cohort == \"BL_Thomas\")\n\nNext, we will obtain the coding mutations that will be used in the plotting. The data is a data frame in a standartized maf format.\n\nmaf <- get_ssm_by_samples(\n these_samples_metadata = metadata,\n tool_name = \"publication\",\n projection = \"hg38\"\n)\n\n# How does it look like?\ndim(maf)\n\n[1] 47043 45\n\nhead(maf) %>%\n select(\n Tumor_Sample_Barcode,\n Hugo_Symbol,\n Variant_Classification\n )\n\n Tumor_Sample_Barcode Hugo_Symbol Variant_Classification\n1: Akata CPTP Missense_Mutation\n2: Akata FNDC10 Missense_Mutation\n3: Akata MORN1 Missense_Mutation\n4: Akata MEGF6 Missense_Mutation\n5: Akata NPHP4 Silent\n6: Akata GPR157 Missense_Mutation\n\n\nFor the purpose of this tutorial, we will focus on a small subset of genes known to be significantly mutated in BL.\n\ngenes <- lymphoma_genes_bl_v_latest$Gene\nhead(genes)\n\n[1] \"ALPK2\" \"ARHGEF1\" \"ARID1A\" \"B2M\" \"BACH2\" \"BCL10\" \n\n\nNow we have our metadata and mutations we want to explore, so we are ready to start visualizing the data." 
}, { "objectID": "tutorials/forestplot.html#the-default-forest-plot", "href": "tutorials/forestplot.html#the-default-forest-plot", "title": "Tutorial: The prettiest forestplot", "section": "The default forest plot", - "text": "The default forest plot\nThe forest plot is ready to be called with the default parameters after just providing the metadata and data frame with mutations in standard maf format. Here is an example of the output with all default parameters:\n\ncomparison_column <- \"EBV_status_inf\" # character of column name for comparison\nfp <- prettyForestPlot(\n metadata = metadata,\n maf = maf,\n genes = genes,\n comparison_column = comparison_column\n)" + "text": "The default forest plot\nThe forest plot is ready to be called with the default parameters after just providing the metadata and data frame with mutations in standard maf format. Here is an example of the output with all default parameters:\n\ncomparison_column <- \"EBV_status_inf\" # character of column name for comparison\nfp <- prettyForestPlot(\n metadata = metadata,\n maf = maf,\n genes = genes,\n comparison_column = comparison_column\n)\n\nThe output of the function is a list containing the following objects: - fisher: a data frame with detailed statistics of the Fisher’s test for each gene - mutmat: a binary matrix used for the Fisher’s test - forest: a ggplot2 object with the forest plot of the ORs from the Fisher’s test for each gene - bar: a ggplot2 object wiht mutation frequencies for each Gene - arranged: a display item where both the forest and bar plots are nicely arranged side-by-side\n\nnames(fp)\n\n[1] \"fisher\" \"forest\" \"bar\" \"arranged\" \"mutmat\"" + }, + { + "objectID": "tutorials/forestplot.html#report-only-significant-differences", + "href": "tutorials/forestplot.html#report-only-significant-differences", + "title": "Tutorial: The prettiest forestplot", + "section": "Report only significant differences", + "text": "Report only significant differences\nBy default, 
all of the genes of interest are reported in the output. After the Fisher’s test is performed, the prettyForestPlot also calculates FDR and we can use it to only report significant differences by providing a significance cutoff with the parameter max_q:\n\nmax_q <- 0.1 # only those qith Q value <= 0.1 will be reported\nfp <- prettyForestPlot(\n metadata = metadata,\n maf = maf,\n genes = genes,\n comparison_column = comparison_column,\n max_q = max_q\n)\n\nWe now can take a look at what genes are passing the significance cutoff:\n\nfp$arranged" + }, + { + "objectID": "tutorials/forestplot.html#comparing-categories-with-more-than-two-groups", + "href": "tutorials/forestplot.html#comparing-categories-with-more-than-two-groups", + "title": "Tutorial: The prettiest forestplot", + "section": "Comparing categories with more than two groups", + "text": "Comparing categories with more than two groups\nAs the prettyForestPlot construcst the 2x2 contingency tables to run Fisher’s test to find significant differences, it can only operate on comparing 2 groups between themselves - but what if you have more than that and want to see the difference between some of them? To handle this scenario, we can take advantage of the comparison_values parameter, which will be used to subset the metadata to only requested groups and only perform testing and plotting on this subset. Let’s see it in action:\n\ncomparison_column <- \"genetic_subgroup\" # change the comparison column\ncomparison_values <- c(\"IC-BL\", \"Q53-BL\")\nfp <- prettyForestPlot(\n metadata = metadata,\n maf = maf,\n genes = genes,\n comparison_column = comparison_column,\n comparison_values = comparison_values,\n max_q = max_q\n)\n\nfp$arranged\n\n\n\n\nThis plot is exactly reproducing the Supplemmental Figure 12D from the Thomas et al study!" 
+ }, + { + "objectID": "tutorials/forestplot.html#separating-genes-with-hotspots", + "href": "tutorials/forestplot.html#separating-genes-with-hotspots", + "title": "Tutorial: The prettiest forestplot", + "section": "Separating genes with hotspots", + "text": "Separating genes with hotspots\nWe can additionally separate hotspots from the other mutations and compare those separately. First, we need to annotate the maf data, for which we will use the annotate_hotspots from GAMBLR family. This function will add a new column to the maf named hot_spot indicating whether or not the specific mutation is in the hotspot region.\n\n# Annotate hotspots\nmaf <- annotate_hotspots(maf)\n\n# What are the hotspots?\nmaf %>%\n filter(hot_spot) %>%\n select(Hugo_Symbol, hot_spot) %>%\n table()\n\n< table of extent 0 x 0 >\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe GAMBLR.data version of the annotate_hotspots only handles very specific genes and does not have functionality to annotate all hotspots.\n\n\nOh no! Looks like there is no hotspots in this maf data. This does not make sense, so what happened? Aha, the hotspot annotation in GAMBLR.data works only on the data in grch37 projection. But our maf is in hg38, so what should we do? One way is to lift the maf data to another projection using the UCSC’s liftOver, and GAMBLR family has exactly the function that serves this purpose:\n\nmaf_grch37 <- liftover(\n maf,\n mode = \"maf\",\n target_build = \"grch37\"\n) %>%\nmutate(Chromosome = gsub(\"chr\", \"\", Chromosome)) %>%\nselect(-hot_spot) # since it is empty we can just drop it\n\nCan we annotate the hotspots now?\n\nmaf_grch37 <- annotate_hotspots(maf_grch37)\n\n# What are the hotspots?\nmaf_grch37 %>%\n filter(hot_spot) %>%\n select(Hugo_Symbol, hot_spot) %>%\n table()\n\n hot_spot\nHugo_Symbol TRUE\n CREBBP 1\n EZH2 1\n FOXO1 60\n MYD88 2\n STAT6 4\n\n\nIndeed, the hotspots are properly annotated once we have maf in correct projection. 
Now, we can simply toggle the separate_hotspots parameter to perform separate comparisons within hotspots:\n\ncomparison_column <- \"EBV_status_inf\"\nfp <- prettyForestPlot(\n metadata = metadata,\n maf = maf_grch37,\n genes = genes,\n comparison_column = comparison_column,\n max_q = max_q,\n separate_hotspots = TRUE\n)\n\nfp$arranged" + }, + { + "objectID": "tutorials/forestplot.html#using-binary-matrix-as-input", + "href": "tutorials/forestplot.html#using-binary-matrix-as-input", + "title": "Tutorial: The prettiest forestplot", + "section": "Using binary matrix as input", + "text": "Using binary matrix as input\nSometimes it might be useful to have different input format instead of maf - for example, binary matrix of features. Can we use the prettyForestPlot in this case? Yes, sure we can!\nFirst, let’s construct the binary matrix. We will supplement our maf with the non-coding mutations to look at the aSHM regions in addition to coding mutations, and this will already give us the data in correct projection:\n\nmaf <- get_ssm_by_samples(\n these_samples_metadata = metadata\n)\nmaf$Variant_Classification %>% table\n\n.\n 3'Flank 3'UTR 5'Flank \n 1457 513 2957 \n 5'UTR Frame_Shift_Del Frame_Shift_Ins \n 1102 124 97 \n IGR In_Frame_Del In_Frame_Ins \n 457 40 14 \n Intron Missense_Mutation Nonsense_Mutation \n 44397 1859 286 \n Nonstop_Mutation RNA Silent \n 6 74 481 \n Splice_Region Splice_Site Translation_Start_Site \n 148 114 22 \n\n\nNow we convert this maf into binary matrix:\n\n# Generate binary matrix\ncoding_matrix <- get_coding_ssm_status(\n these_samples_metadata = metadata,\n maf_data = maf,\n gene_symbols = genes,\n include_hotspots = TRUE,\n review_hotspots = TRUE\n)\n\nNext, supplement this with the matrix of non-coding mutation across aSHM regions\n\n# Use aSHM regions from GAMBLR.data\nregions_bed <- somatic_hypermutation_locations_GRCh37_v0.2\n\n# Add convenient name column\nregions_bed <- regions_bed %>%\n mutate(\n name = paste(gene, region, sep = 
\"-\")\n )\n\n# Generate matrix of mutations per each site\nashm_matrix <- get_ashm_count_matrix(\n regions_bed = regions_bed,\n maf_data = maf,\n these_samples_metadata = metadata\n)\n\n# Binarize matrix using arbitrary 3 muts/region cutoff\nashm_matrix[ashm_matrix <= 3] = 0\nashm_matrix[ashm_matrix > 3] = 1\nashm_matrix <- ashm_matrix %>%\n rownames_to_column(\"sample_id\")\n\nWe can now combine both coding and non-coding features into single matrix:\n\nfeature_matrix <- left_join(\n coding_matrix,\n ashm_matrix\n)\n\n# Drop any fearures absent across at least 10 samples to clean any noise\nfeature_matrix <- feature_matrix %>%\n select_if(is.numeric) %>%\n select(where(~ sum(. > 0, na.rm = TRUE) >= 10)) %>%\n bind_cols(\n feature_matrix %>% select(sample_id),\n .\n )\n\nNow we can provide the binary matrix to the prettyForestPlot and regenerate the Supplemmental Figure 12C from the Thomas et al study!\n\ncomparison_column <- \"genetic_subgroup\"\ncomparison_values <- c(\"DGG-BL\", \"Q53-BL\")\nfp <- prettyForestPlot(\n metadata = metadata,\n mutmat = feature_matrix,\n genes = genes,\n comparison_column = comparison_column,\n comparison_values = comparison_values,\n max_q = max_q\n)\n\nfp$arranged" }, { "objectID": "install.html", @@ -270,7 +298,7 @@ "href": "tutorials/oncoplot.html#tallying-mutation-burden", "title": "Tutorial: The prettiest oncoplot", "section": "Tallying mutation burden", - "text": "Tallying mutation burden\nPreviously, we noted that the maf data we were supplying to the prettyOncoplot was not subset to contain only coding mutations, and also discouraged from pre-filtering maf to a subset of genes if we are insterested only looking at some of them. Here is why this is important: if we want to layer on additional information like total mutation burden per sample, any subsetting or filtering of the maf would generate inaccurate and misleading results. Therefore, prettyOncoplot handles all of this for you! 
So if we were to go ahead with tallying the total mutation burden, we could just add some additional parameters to the function call:\n\nhideTopBarplot <- FALSE # will display TMB annotations at the top\ntally_all_mutations <- TRUE # will tally all mutations per sample\n\nprettyOncoplot(\n these_samples_metadata = metadata,\n maf_df = maf,\n metadataColumns = metadataColumns,\n metadataBarHeight = metadataBarHeight,\n metadataBarFontsize = metadataBarFontsize,\n fontSizeGene = fontSizeGene,\n legendFontSize = legendFontSize,\n sortByColumns = metadataColumns,\n genes = genes,\n splitGeneGroups = gene_groups,\n splitColumnName = \"pathology\",\n groupNames = c(\"Follicular lymphoma\", \"DLBCL\", \"COMFL\"),\n hideTopBarplot = hideTopBarplot,\n tally_all_mutations = tally_all_mutations\n)\n\n\n\n\n\n\n\n\n\n\nDid you know?\n\n\n\nIf the dynamic range of total mutation burden is too big and there are some extreme outliers, the bar chart at the top of the oncoplot can be capped of at any numeric value by providing tally_all_mutations_max parameter.\n\n\nWhat if we want to additionally force the ordering based on the total number of mutations, so they are nicely arranged in the decreasing order? 
We can do so by adding the mutation counts as one of the annotation tracks and using it to sort the samples:\n\n# Count all muts to define the order of samples\ntotal_mut_burden <- maf %>%\n count(Tumor_Sample_Barcode)\n\nhead(total_mut_burden)\n\n Tumor_Sample_Barcode n\n1: 01-20260T 71\n2: 02-13135T 98\n3: 02-20170T 67\n4: 02-22991T 53\n5: 03-34157T 26\n6: 04-24937T 146\n\n# Add this info to metadata\nmetadata <- left_join(\n metadata,\n total_mut_burden\n) \n\nprettyOncoplot(\n these_samples_metadata = metadata,\n maf_df = maf,\n metadataColumns = metadataColumns,\n metadataBarHeight = metadataBarHeight,\n metadataBarFontsize = metadataBarFontsize,\n fontSizeGene = fontSizeGene,\n legendFontSize = legendFontSize,\n sortByColumns = c(\"n\", metadataColumns),\n genes = genes,\n splitGeneGroups = gene_groups,\n splitColumnName = \"pathology\",\n groupNames = c(\"Follicular lymphoma\", \"DLBCL\", \"COMFL\"),\n hideTopBarplot = hideTopBarplot,\n tally_all_mutations = tally_all_mutations,\n numericMetadataColumns = \"n\",\n arrange_descending = TRUE\n)\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nWe have modified here the sortByColumns parameter, and provided two additional parameters numericMetadataColumns and arrange_descending.\n\n\n\n\n\n\n\n\nDid you know?\n\n\n\nThe top annotation and n annotation at the bottom are the same thing? Remove n from the legend by adding hide_annotations = \"n\" and remove display of annotation track while keeping the ordering by adding hide_annotations_tracks = TRUE." + "text": "Tallying mutation burden\nPreviously, we noted that the maf data we were supplying to the prettyOncoplot was not subset to contain only coding mutations, and also discouraged from pre-filtering maf to a subset of genes if we are insterested only looking at some of them. Here is why this is important: if we want to layer on additional information like total mutation burden per sample, any subsetting or filtering of the maf would generate inaccurate and misleading results. 
Therefore, prettyOncoplot handles all of this for you! So if we were to go ahead with tallying the total mutation burden, we could just add some additional parameters to the function call:\n\nhideTopBarplot <- FALSE # will display TMB annotations at the top\ntally_all_mutations <- TRUE # will tally all mutations per sample\n\nprettyOncoplot(\n these_samples_metadata = metadata,\n maf_df = maf,\n metadataColumns = metadataColumns,\n metadataBarHeight = metadataBarHeight,\n metadataBarFontsize = metadataBarFontsize,\n fontSizeGene = fontSizeGene,\n legendFontSize = legendFontSize,\n sortByColumns = metadataColumns,\n genes = genes,\n splitGeneGroups = gene_groups,\n splitColumnName = \"pathology\",\n groupNames = c(\"Follicular lymphoma\", \"DLBCL\", \"COMFL\"),\n hideTopBarplot = hideTopBarplot,\n tally_all_mutations = tally_all_mutations\n)\n\n\n\n\n\n\n\n\n\n\nDid you know?\n\n\n\nIf the dynamic range of total mutation burden is too big and there are some extreme outliers, the bar chart at the top of the oncoplot can be capped of at any numeric value by providing tally_all_mutations_max parameter.\n\n\nWhat if we want to additionally force the ordering based on the total number of mutations, so they are nicely arranged in the decreasing order? 
We can do so by adding the mutation counts as one of the annotation tracks and using it to sort the samples:\n\n# Count all muts to define the order of samples\ntotal_mut_burden <- maf %>%\n count(Tumor_Sample_Barcode)\n\nhead(total_mut_burden)\n\n Tumor_Sample_Barcode n\n1: 01-20260T 71\n2: 02-13135T 98\n3: 02-20170T 67\n4: 02-22991T 53\n5: 03-34157T 26\n6: 04-24937T 146\n\n# Add this info to metadata\nmetadata <- left_join(\n metadata,\n total_mut_burden\n)\n\nprettyOncoplot(\n these_samples_metadata = metadata,\n maf_df = maf,\n metadataColumns = metadataColumns,\n metadataBarHeight = metadataBarHeight,\n metadataBarFontsize = metadataBarFontsize,\n fontSizeGene = fontSizeGene,\n legendFontSize = legendFontSize,\n sortByColumns = c(\"n\", metadataColumns),\n genes = genes,\n splitGeneGroups = gene_groups,\n splitColumnName = \"pathology\",\n groupNames = c(\"Follicular lymphoma\", \"DLBCL\", \"COMFL\"),\n hideTopBarplot = hideTopBarplot,\n tally_all_mutations = tally_all_mutations,\n numericMetadataColumns = \"n\",\n arrange_descending = TRUE\n)\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nWe have modified here the sortByColumns parameter, and provided two additional parameters numericMetadataColumns and arrange_descending.\n\n\n\n\n\n\n\n\nDid you know?\n\n\n\nThe top annotation and n annotation at the bottom are the same thing? Remove n from the legend by adding hide_annotations = \"n\" and remove display of annotation track while keeping the ordering by adding hide_annotations_tracks = TRUE." }, { "objectID": "tutorials/oncoplot.html#annotating-significance-of-mutation-frequencies-in-sample-groups", diff --git a/docs/tutorials/forestplot.html b/docs/tutorials/forestplot.html index 2c6249a..c1ce5ce 100644 --- a/docs/tutorials/forestplot.html +++ b/docs/tutorials/forestplot.html @@ -273,6 +273,10 @@

On this page

@@ -303,8 +307,10 @@

Prepare setup

# Load packages
 library(GAMBLR.data)
-library(GAMBLR.viz)
-library(dplyr)
+library(GAMBLR.utils) +library(GAMBLR.viz) +library(tibble) +library(dplyr)

Next, we will get some data to display. The metadata is expected to be a data frame with one required column, sample_id, and another column containing the sample annotations that define the comparison groups. In this example, we will use the data set and variant calls from the study that identified genetic subgroups of Burkitt lymphoma (BL).

@@ -362,6 +368,227 @@

The default forest comparison_column = comparison_column )

+

The output of the function is a list containing the following objects:
- fisher: a data frame with detailed statistics of the Fisher’s test for each gene
- mutmat: the binary matrix used for the Fisher’s test
- forest: a ggplot2 object with the forest plot of the odds ratios (ORs) from the Fisher’s test for each gene
- bar: a ggplot2 object with the mutation frequencies for each gene
- arranged: a display item where the forest and bar plots are arranged side by side

+
+
names(fp)
+
+
[1] "fisher"   "forest"   "bar"      "arranged" "mutmat"  
+
+
+ +
+

Report only significant differences

+

By default, all of the genes of interest are reported in the output. After the Fisher’s test is performed, the prettyForestPlot also calculates FDR and we can use it to only report significant differences by providing a significance cutoff with the parameter max_q:

+
+
max_q <- 0.1 # only genes with a Q value <= 0.1 will be reported
+fp <- prettyForestPlot(
+    metadata = metadata,
+    maf = maf,
+    genes = genes,
+    comparison_column = comparison_column,
+    max_q = max_q
+)
+
+
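The filtering idea behind max_q can be sketched in base R. This is only an illustration under stated assumptions, not the prettyForestPlot implementation: the toy mutation matrix, the gene names, and the choice of the Benjamini-Hochberg method for the FDR correction are all hypothetical.

```r
# Toy data (hypothetical): 40 samples in two groups, three genes
group <- rep(c("A", "B"), each = 20)
mutmat <- data.frame(
    GENE1 = c(rep(1, 15), rep(0, 5), rep(1, 2), rep(0, 18)), # enriched in group A
    GENE2 = rep(c(1, 0, 0, 0), 10),                          # same frequency in both groups
    GENE3 = rep(c(1, 0), 20)                                 # same frequency in both groups
)

# One Fisher's exact test per gene on the 2x2 table (mutation status x group)
p_values <- sapply(mutmat, function(status) {
    fisher.test(table(factor(status, levels = c(0, 1)), group))$p.value
})

# Convert P values to Q values (Benjamini-Hochberg is an assumption here)
q_values <- p.adjust(p_values, method = "BH")

# Keep only the genes passing the significance cutoff
max_q <- 0.1
significant <- names(q_values)[q_values <= max_q]
significant
```

In this toy example only GENE1 differs between the groups, so it is the only gene that passes the cutoff.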

We can now take a look at which genes pass the significance cutoff:

+
+
fp$arranged
+
+

+
+
+
+
+

Comparing categories with more than two groups

+

Because prettyForestPlot constructs 2x2 contingency tables to run Fisher’s test, it can only compare two groups at a time. But what if you have more than two groups and want to see the difference between some of them? To handle this scenario, we can use the comparison_values parameter, which subsets the metadata to the requested groups so that testing and plotting are performed only on this subset. Let’s see it in action:

+
+
comparison_column <- "genetic_subgroup" # change the comparison column
+comparison_values <- c("IC-BL", "Q53-BL")
+fp <- prettyForestPlot(
+    metadata = metadata,
+    maf = maf,
+    genes = genes,
+    comparison_column = comparison_column,
+    comparison_values = comparison_values,
+    max_q = max_q
+)
+
+fp$arranged
+
+

+
+
+

This plot exactly reproduces Supplemental Figure 12D from the Thomas et al. study!
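To make the role of comparison_values more concrete, here is a rough base R sketch of subsetting to two groups before building a 2x2 table. The toy metadata and the mutation status vector are hypothetical, and the real function may perform this step differently.

```r
# Toy metadata with three subgroups (hypothetical samples)
metadata_toy <- data.frame(
    sample_id = paste0("S", 1:9),
    genetic_subgroup = rep(c("IC-BL", "Q53-BL", "DGG-BL"), each = 3)
)
# Mutation status of one hypothetical gene, in the same sample order
mutated <- c(1, 1, 0, 0, 0, 0, 1, 0, 1)

# Restrict everything to the two requested groups
comparison_values <- c("IC-BL", "Q53-BL")
keep <- metadata_toy$genetic_subgroup %in% comparison_values
sub_meta <- metadata_toy[keep, ]

# With only two groups left, the table is guaranteed to be 2x2
contingency <- table(
    mutated = factor(mutated[keep], levels = c(0, 1)),
    group = factor(sub_meta$genetic_subgroup, levels = comparison_values)
)
contingency
```

Fixing the factor levels keeps the table 2x2 even when one of the cells is empty, which is what a Fisher's test on two groups requires.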

+
+
+

Separating genes with hotspots

+

We can additionally separate hotspots from the other mutations and compare those separately. First, we need to annotate the maf data, for which we will use the annotate_hotspots function from the GAMBLR family. This function adds a new column named hot_spot to the maf, indicating whether or not each mutation falls in a hotspot region.

+
+
# Annotate hotspots
+maf <- annotate_hotspots(maf)
+
+# What are the hotspots?
+maf %>%
+    filter(hot_spot) %>%
+    select(Hugo_Symbol, hot_spot) %>%
+    table()
+
+
< table of extent 0 x 0 >
+
+
+
+
+
+ +
+
+Note +
+
+
+

The GAMBLR.data version of the annotate_hotspots only handles very specific genes and does not have functionality to annotate all hotspots.

+
+
+

Oh no! It looks like there are no hotspots in this maf data. This does not make sense, so what happened? Aha, the hotspot annotation in GAMBLR.data only works on data in the grch37 projection. But our maf is in hg38, so what should we do? One way is to lift the maf data over to another projection using UCSC’s liftOver, and the GAMBLR family has a function that serves exactly this purpose:

+
+
maf_grch37 <- liftover(
+    maf,
+    mode = "maf",
+    target_build = "grch37"
+) %>%
+mutate(Chromosome = gsub("chr", "", Chromosome)) %>%
+select(-hot_spot) # since it is empty we can just drop it
+
+

Can we annotate the hotspots now?

+
+
maf_grch37 <- annotate_hotspots(maf_grch37)
+
+# What are the hotspots?
+maf_grch37 %>%
+    filter(hot_spot) %>%
+    select(Hugo_Symbol, hot_spot) %>%
+    table()
+
+
           hot_spot
+Hugo_Symbol TRUE
+     CREBBP    1
+     EZH2      1
+     FOXO1    60
+     MYD88     2
+     STAT6     4
+
+
+

Indeed, the hotspots are properly annotated once the maf is in the correct projection. Now, we can simply toggle the separate_hotspots parameter to perform separate comparisons within hotspots:

+
+
comparison_column <- "EBV_status_inf"
+fp <- prettyForestPlot(
+    metadata = metadata,
+    maf = maf_grch37,
+    genes = genes,
+    comparison_column = comparison_column,
+    max_q = max_q,
+    separate_hotspots = TRUE
+)
+
+fp$arranged
+
+

+
+
+
+
+

Using binary matrix as input

+

Sometimes it might be useful to provide a different input format instead of a maf, for example a binary matrix of features. Can we use prettyForestPlot in this case? Yes, sure we can!

+

First, let’s construct the binary matrix. We will supplement our maf with the non-coding mutations to look at the aSHM regions in addition to coding mutations, and this will already give us the data in the correct projection:

+
+
maf <- get_ssm_by_samples(
+    these_samples_metadata = metadata
+)
+maf$Variant_Classification %>% table()
+
+
.
+               3'Flank                  3'UTR                5'Flank 
+                  1457                    513                   2957 
+                 5'UTR        Frame_Shift_Del        Frame_Shift_Ins 
+                  1102                    124                     97 
+                   IGR           In_Frame_Del           In_Frame_Ins 
+                   457                     40                     14 
+                Intron      Missense_Mutation      Nonsense_Mutation 
+                 44397                   1859                    286 
+      Nonstop_Mutation                    RNA                 Silent 
+                     6                     74                    481 
+         Splice_Region            Splice_Site Translation_Start_Site 
+                   148                    114                     22 
+
+
+

Now we convert this maf into a binary matrix:

+
+
# Generate binary matrix
+coding_matrix <- get_coding_ssm_status(
+    these_samples_metadata = metadata,
+    maf_data = maf,
+    gene_symbols = genes,
+    include_hotspots = TRUE,
+    review_hotspots = TRUE
+)
+
+

Next, we supplement this with the matrix of non-coding mutations across the aSHM regions:

+
+
# Use aSHM regions from GAMBLR.data
+regions_bed <- somatic_hypermutation_locations_GRCh37_v0.2
+
+# Add convenient name column
+regions_bed <- regions_bed %>%
+    mutate(
+        name = paste(gene, region, sep = "-")
+    )
+
+# Generate matrix of mutations per each site
+ashm_matrix <- get_ashm_count_matrix(
+    regions_bed = regions_bed,
+    maf_data = maf,
+    these_samples_metadata = metadata
+)
+
+# Binarize matrix using arbitrary 3 muts/region cutoff
+ashm_matrix[ashm_matrix <= 3] <- 0
+ashm_matrix[ashm_matrix > 3] <- 1
+ashm_matrix <- ashm_matrix %>%
+    rownames_to_column("sample_id")
+
+
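The binarization step can be tried out on a small toy count matrix. In this sketch the region names and counts are hypothetical, and base R cbind stands in for tibble's rownames_to_column. Both masks are computed from the untouched counts, so the result does not depend on the order of the two assignments.

```r
# Toy aSHM count matrix (hypothetical regions and counts), samples as row names
ashm_counts <- data.frame(
    `MYC-TSS` = c(0, 3, 4, 12),
    `BCL6-intron1` = c(5, 1, 3, 0),
    row.names = c("S1", "S2", "S3", "S4"),
    check.names = FALSE
)

# Binarize with the same cutoff as in the tutorial: more than 3 mutations counts as mutated
cutoff <- 3
binary <- ashm_counts
binary[ashm_counts > cutoff] <- 1
binary[ashm_counts <= cutoff] <- 0

# Move the row names into a sample_id column so the matrix can be joined later
binary <- cbind(sample_id = rownames(binary), binary)
binary
```

Note that a count equal to the cutoff (3 in the toy data) is treated as unmutated.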

We can now combine both coding and non-coding features into a single matrix:

+
+
feature_matrix <- left_join(
+    coding_matrix,
+    ashm_matrix
+)
+
+# Keep only features mutated in at least 10 samples to reduce noise
+feature_matrix <- feature_matrix %>%
+    select_if(is.numeric) %>%
+    select(where(~ sum(. > 0, na.rm = TRUE) >= 10)) %>%
+    bind_cols(
+        feature_matrix %>% select(sample_id),
+        .
+    )
+
+
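The effect of the frequency filter is easier to see on a toy matrix. The following base R sketch uses hypothetical feature names and is meant as an equivalent of the dplyr pipeline above, not a replacement for it.

```r
# Toy feature matrix (hypothetical): 12 samples, three binary features
feature_toy <- data.frame(
    sample_id = paste0("S", 1:12),
    GENE_A = c(rep(1, 10), 0, 0),    # mutated in 10 samples: kept
    GENE_B = c(rep(1, 9), 0, 0, 0),  # mutated in 9 samples: dropped
    GENE_C = c(rep(1, 11), NA)       # mutated in 11 samples, one missing value: kept
)

# Count the samples carrying each numeric feature, ignoring missing values
min_samples <- 10
is_feature <- sapply(feature_toy, is.numeric)
n_mutated <- colSums(feature_toy[, is_feature, drop = FALSE] > 0, na.rm = TRUE)

# Keep sample_id plus the features that are frequent enough
keep <- names(n_mutated)[n_mutated >= min_samples]
filtered <- feature_toy[, c("sample_id", keep), drop = FALSE]
names(filtered)
```

Missing values, such as those a left join introduces for samples absent from one of the matrices, are ignored when counting rather than treated as mutated.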

Now we can provide the binary matrix to prettyForestPlot and regenerate Supplemental Figure 12C from the Thomas et al. study!

+
+
comparison_column <- "genetic_subgroup"
+comparison_values <- c("DGG-BL", "Q53-BL")
+fp <- prettyForestPlot(
+    metadata = metadata,
+    mutmat = feature_matrix,
+    genes = genes,
+    comparison_column = comparison_column,
+    comparison_values = comparison_values,
+    max_q = max_q
+)
+
+fp$arranged
+
+

+
+
diff --git a/docs/tutorials/forestplot.qmd b/docs/tutorials/forestplot.qmd index 8f4c663..55f7069 100644 --- a/docs/tutorials/forestplot.qmd +++ b/docs/tutorials/forestplot.qmd @@ -2,8 +2,8 @@ title: "Tutorial: The prettiest forestplot" warning: false message: false -fig.width: 8 -fig.height: 5 +fig.width: 10 +fig.height: 8 fig.align: "center" --- @@ -28,7 +28,9 @@ We will first import the necessary packages: ```{r load_packages} # Load packages library(GAMBLR.data) +library(GAMBLR.utils) library(GAMBLR.viz) +library(tibble) library(dplyr) ``` @@ -83,8 +85,6 @@ providing the metadata and data frame with mutations in standard maf format. Here is an example of the output with all default parameters: ```{r default} -#| fig-width: 10 -#| fig-height: 15 comparison_column <- "EBV_status_inf" # character of column name for comparison fp <- prettyForestPlot( metadata = metadata, @@ -107,3 +107,215 @@ arranged side-by-side ```{r} names(fp) ``` + +## Report only significant differences + +By default, all of the genes of interest are reported in the output. After the +Fisher's test is performed, the `prettyForestPlot` also calculates FDR and we +can use it to only report significant differences by providing a significance +cutoff with the parameter `max_q`: + +```{r fdr} +max_q <- 0.1 # only those qith Q value <= 0.1 will be reported +fp <- prettyForestPlot( + metadata = metadata, + maf = maf, + genes = genes, + comparison_column = comparison_column, + max_q = max_q +) +``` + +We now can take a look at what genes are passing the significance cutoff: +```{r fdr_plot} +fp$arranged +``` + +## Comparing categories with more than two groups + +As the `prettyForestPlot` construcst the 2x2 contingency tables to run Fisher's +test to find significant differences, it can only operate on comparing 2 groups +between themselves - but what if you have more than that and want to see the +difference between some of them? 
+To handle this scenario, we can take advantage of the `comparison_values`
+parameter, which subsets the metadata to only the requested groups so that
+testing and plotting are performed on this subset alone. Let's see it in action:
+
+```{r comp_groups}
+comparison_column <- "genetic_subgroup" # change the comparison column
+comparison_values <- c("IC-BL", "Q53-BL")
+fp <- prettyForestPlot(
+    metadata = metadata,
+    maf = maf,
+    genes = genes,
+    comparison_column = comparison_column,
+    comparison_values = comparison_values,
+    max_q = max_q
+)
+
+fp$arranged
+```
+
+This plot exactly reproduces Supplemental Figure 12D from the
+[Thomas et al](https://doi.org/10.1182/blood.2022016534) study!
+
+## Separating genes with hotspots
+
+We can additionally separate hotspots from the other mutations and compare those
+separately. First, we need to annotate the maf data, for which we will use the
+`annotate_hotspots` function from the GAMBLR family. This function adds a new
+column to the maf named `hot_spot`, indicating whether or not the specific
+mutation is in the hotspot region.
+
+```{r annotate_maf}
+# Annotate hotspots
+maf <- annotate_hotspots(maf)
+
+# What are the hotspots?
+maf %>%
+    filter(hot_spot) %>%
+    select(Hugo_Symbol, hot_spot) %>%
+    table()
+```
+::: {.callout-note}
+The GAMBLR.data version of `annotate_hotspots` only handles very specific
+genes and does not have functionality to annotate all hotspots.
+:::
+
+Oh no! Looks like there are no hotspots in this maf data. This does not make
+sense, so what happened? Aha, the hotspot annotation in GAMBLR.data works only
+on data in the grch37 projection. But our maf is in hg38, so what should we do?
+One way is to lift the maf data to another projection using UCSC's liftOver,
+and the GAMBLR family has a function that serves exactly this purpose:
+
+```{r lift_maf}
+maf_grch37 <- liftover(
+    maf,
+    mode = "maf",
+    target_build = "grch37"
+) %>%
+mutate(Chromosome = gsub("chr", "", Chromosome)) %>%
+select(-hot_spot) # since it is empty we can just drop it
+
+```
+
+Can we annotate the hotspots now?
+
+```{r}
+maf_grch37 <- annotate_hotspots(maf_grch37)
+
+# What are the hotspots?
+maf_grch37 %>%
+    filter(hot_spot) %>%
+    select(Hugo_Symbol, hot_spot) %>%
+    table()
+```
+
+Indeed, the hotspots are properly annotated once the maf is in the correct
+projection. Now, we can simply toggle the `separate_hotspots` parameter to
+perform separate comparisons within hotspots:
+
+```{r comp_hotspots}
+comparison_column <- "EBV_status_inf"
+fp <- prettyForestPlot(
+    metadata = metadata,
+    maf = maf_grch37,
+    genes = genes,
+    comparison_column = comparison_column,
+    max_q = max_q,
+    separate_hotspots = TRUE
+)
+
+fp$arranged
+```
+
+## Using binary matrix as input
+
+Sometimes it is useful to provide a different input format instead of a maf,
+for example a binary matrix of features. Can we use `prettyForestPlot` in this
+case? Yes, sure we can!
+
+First, let's construct the binary matrix. We will supplement our maf with the
+non-coding mutations to look at the aSHM regions in addition to coding
+mutations, and this will already give us the data in the correct projection:
+```{r bin_mat}
+maf <- get_ssm_by_samples(
+    these_samples_metadata = metadata
+)
+maf$Variant_Classification %>% table
+```
+
+Now we convert this maf into a binary matrix:
+```{r cod_mat}
+# Generate binary matrix
+coding_matrix <- get_coding_ssm_status(
+    these_samples_metadata = metadata,
+    maf_data = maf,
+    gene_symbols = genes,
+    include_hotspots = TRUE,
+    review_hotspots = TRUE
+)
+
+```
+
+Next, supplement this with the matrix of non-coding mutations across aSHM regions:
+
+```{r ashm_mat}
+# Use aSHM regions from GAMBLR.data
+regions_bed <- somatic_hypermutation_locations_GRCh37_v0.2
+
+# Add convenient name column
+regions_bed <- regions_bed %>%
+    mutate(
+        name = paste(gene, region, sep = "-")
+    )
+
+# Generate matrix of mutations per each site
+ashm_matrix <- get_ashm_count_matrix(
+    regions_bed = regions_bed,
+    maf_data = maf,
+    these_samples_metadata = metadata
+)
+
+# Binarize matrix using arbitrary 3 muts/region cutoff
+ashm_matrix[ashm_matrix <= 3] = 0
+ashm_matrix[ashm_matrix > 3] = 1
+ashm_matrix <- ashm_matrix %>%
+    rownames_to_column("sample_id")
+```
+
+
+We can now combine both coding and non-coding features into a single matrix:
+```{r mat}
+feature_matrix <- left_join(
+    coding_matrix,
+    ashm_matrix
+)
+
+# Keep only features present in at least 10 samples to reduce noise
+feature_matrix <- feature_matrix %>%
+    select_if(is.numeric) %>%
+    select(where(~ sum(. > 0, na.rm = TRUE) >= 10)) %>%
+    bind_cols(
+        feature_matrix %>% select(sample_id),
+        .
+    )
+```
+
+Now we can provide the binary matrix to `prettyForestPlot` and regenerate
+Supplemental Figure 12C from the
+[Thomas et al](https://doi.org/10.1182/blood.2022016534) study!
+
+```{r comp_mat}
+comparison_column <- "genetic_subgroup"
+comparison_values <- c("DGG-BL", "Q53-BL")
+fp <- prettyForestPlot(
+    metadata = metadata,
+    mutmat = feature_matrix,
+    genes = genes,
+    comparison_column = comparison_column,
+    comparison_values = comparison_values,
+    max_q = max_q
+)
+
+fp$arranged
+```
diff --git a/docs/tutorials/forestplot_files/figure-html/comp_groups-1.png b/docs/tutorials/forestplot_files/figure-html/comp_groups-1.png
new file mode 100644
index 0000000..34539de
Binary files /dev/null and b/docs/tutorials/forestplot_files/figure-html/comp_groups-1.png differ
diff --git a/docs/tutorials/forestplot_files/figure-html/comp_hotspots-1.png b/docs/tutorials/forestplot_files/figure-html/comp_hotspots-1.png
new file mode 100644
index 0000000..5b2ee37
Binary files /dev/null and b/docs/tutorials/forestplot_files/figure-html/comp_hotspots-1.png differ
diff --git a/docs/tutorials/forestplot_files/figure-html/comp_mat-1.png b/docs/tutorials/forestplot_files/figure-html/comp_mat-1.png
new file mode 100644
index 0000000..961ec12
Binary files /dev/null and b/docs/tutorials/forestplot_files/figure-html/comp_mat-1.png differ
diff --git a/docs/tutorials/forestplot_files/figure-html/fdr_plot-1.png b/docs/tutorials/forestplot_files/figure-html/fdr_plot-1.png
new file mode 100644
index 0000000..40fd300
Binary files /dev/null and b/docs/tutorials/forestplot_files/figure-html/fdr_plot-1.png differ
diff --git a/docs/tutorials/onco_matrix.txt b/docs/tutorials/onco_matrix.txt
deleted file mode 100644
index 7439763..0000000
--- a/docs/tutorials/onco_matrix.txt
+++ /dev/null
@@ -1,11 +0,0 @@
-FL1001T1 FL1002T1 FL1004T1 FL1005T1 FL1006T1 FL1007T1 FL1008T1 FL1009T1 FL1010T1 FL1011T1 FL1013T1 FL1014T1 FL1015T1 FL1016T1 FL1017T1 FL1018T1 FL1020T1 14-24907T 14-30670T 14-37865T 15-29305T 16-19402T POG707T FL1001T2 FL1002T2 FL1004T2 FL1005T2 FL1006T2 FL1007T2 FL1008T2 FL1009T2 FL1010T2 FL1011T2 FL1013T2 FL1014T2 FL1015T2 FL1016T2 FL1017T2 FL1018T2 FL1019T2 FL1020T2
15-16852T 15-37466T SP59292 SP59420 SP59316 SP116604 SP116622 SP59352 SP116679 SP116608 SP59340 SP59308 SP59416 SP59320 SP116645 SP59436 SP116674 SP59380 SP116649 SP59464 SP116638 SP116718 SP59356 SP116683 SP116723 SP116616 SP116703 SP59432 SP116720 FL2001T1 FL2002T1 FL2003T1 FL2004T1 FL2005T1 FL2006T1 FL2007T1 FL2008T1 FL3001T1 FL3002T1 FL3003T1 FL3004T1 FL3005T1 FL3006T1 FL3007T1 FL3008T1 FL3009T1 FL3010T1 FL3011T1 FL3012T1 FL3013T1 FL3014T1 FL3015T1 FL3016T1 FL3017T1 FL3018T1 FL3019T1 FL3020T1 13-19570T 13-26597T 13-29091T 13-27960T 13-34919T 99-13520T 13-40593T 04-38964T 13-40370T 13-43956T 14-11777T 14-11009T 14-13480T 14-11427T 14-13213T 01-20260T 14-15505T 14-25416T 14-26632T 14-28286T 14-29644T 14-29140T 14-32185T 14-32899T 14-32922T 14-34508T 14-34800T 12-32967T 14-35030T 14-34590T 14-34708T 14-38639T 07-34776T 15-10675T 15-14583T 15-15253T 15-14453T 15-14813T 15-16885T 14-27524T 15-18916T 15-17849T 14-41250T 96-11779T 11-28845T 15-30123T 15-30563T 15-37079T 15-36416T 15-36675T 15-39657T 15-33862T 15-41277T 15-39521T 15-40296T 15-42543T 11-34915T 16-10805T 16-13504T 16-20119T 16-27229T 16-30371T 16-32417T 16-37777T SP193910 SP193364 SP193808 SP193093 SP194212 SP194205 SP124984 SP193828 SP193965 SP194065 SP193954 SP194134 SP194083 SP194043 SP193992 SP194173 SP192882 SP193801 SP192988 SP194238 SP194077 SP193650 SP193655 SP193186 SP193057 SP192811 SP192863 SP193040 SP193354 SP193816 SP193950 SP193777 SP193570 SP193017 SP193120 SP193925 SP193326 SP193205 SP193543 SP59424 SP193855 SP124967 SP192804 SP124963 SP193993 SP194108 SP193720 SP116654 SP116706 13-22818T 13-26601T 13-31210T 14-16707T 14-13959T 14-20962T 14-16281T 14-25466T 14-27873T 14-32442T 96-31596T 14-33262T 14-37722T 14-36022T 14-41461T 15-11617T 09-31233T 15-13365T 15-13383T 15-15757T 14-35026T 15-21654T 15-38154T 09-37629T 15-43891T 04-24937T 10-31625T 16-18029T 15-43657T 16-29329T 10-40676T 10-10826T 07-30628T 09-12737T 05-12939T 05-23110T 05-24904T 05-25439T 05-32947T 06-10398T 06-19919T 
06-22057T 06-23907T 06-24915T 06-30025T 06-33777T 95-32814T 98-22532T 09-11467T 05-32150T 09-21480T 08-15460T 02-13135T 10-27154T 05-17793T 05-22052T 08-19764T 09-12864T 07-31833T 11-21727T 99-13280T 05-21634T SP116630 SP116618 SP59412 SP59448 SP116670 SP124971 SP59312 SP116663 SP116627 SP116610 SP116657 SP59400 SP116624 SP124981 SP116701 SP116668 SP116726 SP116709 15-26538T 16-31791T 16-23208T 02-20170T 05-24006T 05-25674T 06-34043T 05-18426T SP124957 SP193258 SP192997 SP193005 SP193766 SP192856 SP194080 SP124975 SP194228 SP193375 SP193976 SP192798 SP193725 SP193945 SP193546 SP192940 SP192850 SP193025 SP194234 SP194216 SP193512 SP192765 SP192993 SP192815 SP192970 SP192800 SP193914 SP124979 SP59452 SP59460 SP116659 15-18723T SP116620 SP59364 SP116694 SP116715 SP116614 SP193312 SP193656 SP59408 SP194035 SP193794 SP192839 SP193957 SP193764 SP116642 SP116697 16-16192T 07-35482T SP59304 09-16981T SP124969 16-11636T 13-30451T 16-16723T SP116676 15-31924T 99-27137T SP59368 06-14634T SP192767 SP116648 17-36275T 15-34472T FL1019T1 SP193934 SP116690 14-35632T SP193967 SP193337 FL1003T2 11-35935T 14-11247T FL1003T1 09-41082T SP192870 SP193229 15-10535T 16-13732T 15-24306T SP194143 17-23504T SP124973 09-33003T 14-23891T 15-24058T 16-27413T 89-62169T 14-33436T 07-25012T SP59348 SP59456 SP124959 SP124977 06-30145T SP192833 15-12532T SP194195 SP116606 06-15256T FL1012T2 16-17861T 17-12136T 05-24395T SP116686 SP193420 16-18623T 81-52884T SP116635 SP59300 06-16716T 15-29858T SP59280 SP193701 SP116688 06-25674T FL1012T1 02-22991T SP59360 06-11535T 12-29259T SP193347 SP59324 SP193467 SP193744 SP193300 04-29264T SP193528 SP194053 SP116672 SP193684 03-34157T SP193170 10-39333T SP59376 SP59372 08-25894T SP124983 04-28140T 05-24561T 14-13938T SP193450 -EZH2 Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation 
Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -CREBBP In_Frame_Del Missense_Mutation Missense_Mutation Missense_Mutation In_Frame_Del Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation In_Frame_Del Missense_Mutation Missense_Mutation Missense_Mutation In_Frame_Del Multi_Hit Missense_Mutation Multi_Hit Missense_Mutation 
Missense_Mutation Missense_Mutation Multi_Hit Missense_Mutation Nonsense_Mutation Multi_Hit Missense_Mutation Frame_Shift_Ins Missense_Mutation Missense_Mutation Missense_Mutation Frame_Shift_Del Missense_Mutation Frame_Shift_Del Missense_Mutation Multi_Hit Splice_Site Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Nonsense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Splice_Site Missense_Mutation Missense_Mutation Nonsense_Mutation Splice_Site Missense_Mutation Missense_Mutation Splice_Site Splice_Site Splice_Site Missense_Mutation Missense_Mutation Frame_Shift_Del Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Frame_Shift_Ins Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Multi_Hit Multi_Hit Missense_Mutation Multi_Hit Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Nonsense_Mutation Frame_Shift_Ins Missense_Mutation Missense_Mutation Multi_Hit Nonsense_Mutation In_Frame_Del Splice_Site Multi_Hit Missense_Mutation In_Frame_Del Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Nonsense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Nonsense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit In_Frame_Del In_Frame_Del Missense_Mutation Missense_Mutation Nonsense_Mutation Missense_Mutation Missense_Mutation In_Frame_Del Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Multi_Hit Missense_Mutation In_Frame_Del Missense_Mutation Nonsense_Mutation Missense_Mutation Missense_Mutation Nonsense_Mutation Missense_Mutation In_Frame_Del Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Nonsense_Mutation Splice_Site Missense_Mutation 
Multi_Hit Missense_Mutation Splice_Site Nonsense_Mutation Frame_Shift_Del Nonsense_Mutation Missense_Mutation Nonsense_Mutation Multi_Hit Nonsense_Mutation Frame_Shift_Del Missense_Mutation Multi_Hit Missense_Mutation Multi_Hit Missense_Mutation Multi_Hit In_Frame_Del Splice_Site Frame_Shift_Ins In_Frame_Del Multi_Hit Missense_Mutation Frame_Shift_Del Multi_Hit Nonsense_Mutation Splice_Site Missense_Mutation Missense_Mutation Missense_Mutation Nonsense_Mutation 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -KMT2D Nonsense_Mutation Missense_Mutation Multi_Hit Multi_Hit Missense_Mutation Missense_Mutation Nonsense_Mutation Multi_Hit Multi_Hit Frame_Shift_Ins Frame_Shift_Del Multi_Hit Multi_Hit Multi_Hit Multi_Hit Missense_Mutation Multi_Hit Multi_Hit Missense_Mutation Multi_Hit Nonsense_Mutation Multi_Hit Multi_Hit Frame_Shift_Ins Multi_Hit Multi_Hit Frame_Shift_Ins Missense_Mutation Frame_Shift_Del Frame_Shift_Del Multi_Hit Frame_Shift_Del Frame_Shift_Del Multi_Hit Splice_Site Multi_Hit Nonsense_Mutation Splice_Site Multi_Hit Nonsense_Mutation Splice_Site Multi_Hit Multi_Hit Multi_Hit Multi_Hit Nonsense_Mutation Nonsense_Mutation Frame_Shift_Del Nonsense_Mutation Nonsense_Mutation Frame_Shift_Del Multi_Hit Multi_Hit Multi_Hit Nonsense_Mutation Multi_Hit Nonsense_Mutation Nonsense_Mutation Frame_Shift_Ins Nonsense_Mutation Multi_Hit Frame_Shift_Ins Multi_Hit Missense_Mutation Multi_Hit Nonsense_Mutation Frame_Shift_Del Frame_Shift_Ins Nonsense_Mutation Nonsense_Mutation Nonsense_Mutation Multi_Hit Frame_Shift_Del Nonsense_Mutation Multi_Hit Nonsense_Mutation Multi_Hit Multi_Hit Multi_Hit Multi_Hit Multi_Hit Frame_Shift_Ins Multi_Hit Frame_Shift_Del Multi_Hit Multi_Hit Multi_Hit Nonsense_Mutation Multi_Hit Multi_Hit Multi_Hit Nonsense_Mutation Frame_Shift_Del Nonsense_Mutation Multi_Hit Multi_Hit Nonsense_Mutation 
Missense_Mutation Nonsense_Mutation Multi_Hit Multi_Hit Multi_Hit Multi_Hit Multi_Hit Multi_Hit Multi_Hit Multi_Hit Multi_Hit Multi_Hit Frame_Shift_Ins Multi_Hit Nonsense_Mutation Frame_Shift_Del Nonsense_Mutation Nonsense_Mutation Frame_Shift_Del Nonsense_Mutation Frame_Shift_Ins Frame_Shift_Ins Frame_Shift_Del Multi_Hit Multi_Hit Splice_Site Splice_Site Multi_Hit Multi_Hit Frame_Shift_Del Nonsense_Mutation Multi_Hit Nonsense_Mutation Multi_Hit Nonsense_Mutation Multi_Hit Frame_Shift_Del Frame_Shift_Del Multi_Hit Frame_Shift_Del Missense_Mutation Missense_Mutation Frame_Shift_Ins Missense_Mutation Frame_Shift_Del Nonsense_Mutation Multi_Hit Nonsense_Mutation Frame_Shift_Ins Frame_Shift_Del Nonsense_Mutation Nonsense_Mutation Splice_Site Multi_Hit Multi_Hit Multi_Hit Frame_Shift_Del Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Nonsense_Mutation Multi_Hit Nonsense_Mutation Multi_Hit Frame_Shift_Ins Multi_Hit Nonsense_Mutation Frame_Shift_Del Nonsense_Mutation Multi_Hit Splice_Site Multi_Hit Frame_Shift_Ins Missense_Mutation Nonsense_Mutation Frame_Shift_Del Multi_Hit Nonsense_Mutation Frame_Shift_Del Nonsense_Mutation Nonsense_Mutation Missense_Mutation Multi_Hit Multi_Hit Frame_Shift_Ins Nonsense_Mutation Nonsense_Mutation Multi_Hit Frame_Shift_Del Nonsense_Mutation Missense_Mutation Multi_Hit Frame_Shift_Del Nonsense_Mutation Nonsense_Mutation Frame_Shift_Del Multi_Hit Multi_Hit Nonsense_Mutation Multi_Hit Nonsense_Mutation Missense_Mutation Nonsense_Mutation Frame_Shift_Del Missense_Mutation Multi_Hit Splice_Site 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -TP53 Missense_Mutation Frame_Shift_Del Missense_Mutation Frame_Shift_Del Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Splice_Site Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation 
Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Splice_Site Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Missense_Mutation Splice_Site Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Frame_Shift_Del Missense_Mutation Missense_Mutation 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -ATP6V1B2 Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -RRAGC Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation In_Frame_Del Missense_Mutation Missense_Mutation 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -MEF2B Missense_Mutation Translation_Start_Site Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Frame_Shift_Del Missense_Mutation Missense_Mutation Multi_Hit Multi_Hit Splice_Site Splice_Site Missense_Mutation Missense_Mutation Missense_Mutation Nonsense_Mutation Frame_Shift_Ins Splice_Site Nonsense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Multi_Hit 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -MYD88 Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -CD79B Missense_Mutation Multi_Hit Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Splice_Site Splice_Site 
Splice_Site Multi_Hit Splice_Site Missense_Mutation Splice_Site Multi_Hit Missense_Mutation Splice_Site Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Missense_Mutation Multi_Hit Missense_Mutation 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -VMA21 Nonsense_Mutation Missense_Mutation Nonsense_Mutation Nonsense_Mutation Nonsense_Mutation Missense_Mutation Missense_Mutation Nonsense_Mutation Nonsense_Mutation Nonsense_Mutation Nonsense_Mutation 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 diff --git a/docs/tutorials/oncoplot.html b/docs/tutorials/oncoplot.html index 78cb138..9388f63 100644 --- a/docs/tutorials/oncoplot.html +++ b/docs/tutorials/oncoplot.html @@ -660,7 +660,7 @@

Tallying mutation
 metadata <- left_join(
     metadata,
     total_mut_burden
-)
+)

 prettyOncoplot(
     these_samples_metadata = metadata,
diff --git a/docs/tutorials/oncoplot.qmd b/docs/tutorials/oncoplot.qmd
index c8644f9..4095b69 100644
--- a/docs/tutorials/oncoplot.qmd
+++ b/docs/tutorials/oncoplot.qmd
@@ -328,7 +328,7 @@ prettyOncoplot(
 ## Did you know?
 If the dynamic range of total mutation burden is too big and there are some
 extreme outliers, the bar chart at the top of the oncoplot can be capped off at
-any numeric value by providing `tally_all_mutations_max` parameter.
+any numeric value by providing `tally_all_mutations_max` parameter.
 :::

 What if we want to additionally force the ordering based on the total number of
@@ -348,7 +348,7 @@ head(total_mut_burden)
 metadata <- left_join(
     metadata,
     total_mut_burden
-)
+)

 prettyOncoplot(
     these_samples_metadata = metadata,
@@ -612,7 +612,7 @@ my_oncoplot = grid.grabExpr(
 Now, it is ready for us to arrange in a multi-panel figure. We can use the
 forest plot we already looked at as an example, and put it to the right of the
-oncoplot:
+oncoplot:
 ```{r multi_panel}
 #| fig-height: 8
 #| fig-width: 13
@@ -689,3 +689,8 @@ multipanel_figure <- ggarrange(

 multipanel_figure
 ```
+
+```{r cleaup_om}
+#| echo: false
+unlink("onco_matrix.txt")
+```
diff --git a/docs/tutorials/oncoplot_files/figure-html/tmb_order_by_meta-1.png b/docs/tutorials/oncoplot_files/figure-html/tmb_order_by_meta-1.png
index a501a0f..3913a87 100644
Binary files a/docs/tutorials/oncoplot_files/figure-html/tmb_order_by_meta-1.png and b/docs/tutorials/oncoplot_files/figure-html/tmb_order_by_meta-1.png differ