Skip to content

Commit

Permalink
Merge pull request #46 from CostaLab/devel
Browse files Browse the repository at this point in the history
Updated vignettes
  • Loading branch information
grasshoffm authored Nov 12, 2024
2 parents e880916 + 4802126 commit 032dd81
Show file tree
Hide file tree
Showing 247 changed files with 684 additions and 1,396 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
100755 → 100644
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Package: sigurd
Type: Package
Title: Single cell Genotyping Using RNA Data
Version: 0.3.6
Version: 0.3.8
Authors@R: c(
person(given = "Martin",
family = "Grasshoff",
Expand Down
Empty file modified NAMESPACE
100755 → 100644
Empty file.
Empty file modified R/AllelFrequencyFoldChange.R
100755 → 100644
Empty file.
Empty file modified R/AmpliconSupplementing.R
100755 → 100644
Empty file.
Empty file modified R/CalculateAlleleFrequency.R
100755 → 100644
Empty file.
Empty file modified R/CalculateAltReads.R
100755 → 100644
Empty file.
Empty file modified R/CalculateConsensus.R
100755 → 100644
Empty file.
20 changes: 14 additions & 6 deletions R/CalculateCorrelationPValue.R
100755 → 100644
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,13 @@
#'@description
#'We perform the correlation of SNVs and calculate the P values.
#'@importFrom stats cor.test
#'@param variant_values The fraction values you are analysing. A vector.
#'@param variant_values The fraction values you are analyzing. A vector.
#'@param other_mutation All other variants you have. A vector of variant names.
#'@param all_variants_list List of fraction values for all the variants you want to compare your variant with.
#'@param min_intersecting_cells Minimum number of intersecting cells. Correlations with less than this will not be performed.
#'@param value_type Are we using consensus or other information?
#'@export
CalculateCorrelationPValue <- function(variant_values, other_mutation, all_variants_list, min_intersecting_cells = 5){
CalculateCorrelationPValue <- function(variant_values, other_mutation, all_variants_list, value_type = "consensus", min_intersecting_cells = 5){
other_variant_values <- all_variants_list[[other_mutation]]
if(sum(names(variant_values) %in% names(other_variant_values)) == 0){
#print("No intersection")
Expand All @@ -28,10 +29,17 @@ CalculateCorrelationPValue <- function(variant_values, other_mutation, all_varia
return(result)
} else if(length(variant_values) > 2){
result <- stats::cor.test(variant_values, other_variant_values)
cells_som_alt <- sum(variant_values == 1)
cells_som_ref <- sum(variant_values == 0)
cells_MT_alt <- sum(other_variant_values == 1)
cells_MT_ref <- sum(other_variant_values == 0)
if(value_type == "consensus"){
cells_som_alt <- sum(variant_values == 1)
cells_som_ref <- sum(variant_values == 0)
cells_MT_alt <- sum(other_variant_values == 1)
cells_MT_ref <- sum(other_variant_values == 0)
} else{
cells_som_alt <- sum(variant_values > 0)
cells_som_ref <- sum(variant_values == 0)
cells_MT_alt <- sum(other_variant_values > 0)
cells_MT_ref <- sum(other_variant_values == 0)
}
result <- c(result$p.value, result$estimate, cells_som_alt, cells_som_ref, cells_MT_alt, cells_MT_ref)
} else{
#print("We do not have more than 2 cells for the somatic variant.")
Expand Down
Empty file modified R/CalculateCoverage.R
100755 → 100644
Empty file.
Empty file modified R/CalculateFisherTestPValue.R
100755 → 100644
Empty file.
Empty file modified R/CalculateQuality.R
100755 → 100644
Empty file.
Empty file modified R/CalculateRefReads.R
100755 → 100644
Empty file.
Empty file modified R/CalculateStrandCorrelation.R
100755 → 100644
Empty file.
Empty file modified R/CallSupport.R
100755 → 100644
Empty file.
Empty file modified R/ClonalDefinition.R
100755 → 100644
Empty file.
Empty file modified R/ClonalDiversity.R
100755 → 100644
Empty file.
Empty file modified R/CombineSEobjects.R
100755 → 100644
Empty file.
Empty file modified R/Enrichment_Fisher.R
100755 → 100644
Empty file.
Empty file modified R/Filtering.R
100755 → 100644
Empty file.
Empty file modified R/GetCellInfoPerVariant.R
100755 → 100644
Empty file.
Empty file modified R/GetVariantInfo.R
100755 → 100644
Empty file.
Empty file modified R/HeatmapVOI.R
100755 → 100644
Empty file.
Empty file modified R/LoadingMAEGATK_typewise.R
100755 → 100644
Empty file.
Empty file modified R/LoadingRawMatrix_typewise.R
100755 → 100644
Empty file.
Empty file modified R/LoadingVCF_typewise.R
100755 → 100644
Empty file.
Empty file modified R/LoadingVarTrix_typewise.R
100755 → 100644
Empty file.
Empty file modified R/Merging_SE_list.R
100755 → 100644
Empty file.
Empty file modified R/RowWiseSplit.R
100755 → 100644
Empty file.
Empty file modified R/SeparatingMatrixToList.R
100755 → 100644
Empty file.
Empty file modified R/SetVariantInfo.R
100755 → 100644
Empty file.
Empty file modified R/UMIQC.R
100755 → 100644
Empty file.
Empty file modified R/UMI_seq.R
100755 → 100644
Empty file.
Empty file modified R/VariantBurden.R
100755 → 100644
Empty file.
Empty file modified R/VariantCloneSizeThresholding.R
100755 → 100644
Empty file.
Empty file modified R/VariantCorrelationHeatmap.R
100755 → 100644
Empty file.
Empty file modified R/VariantFisherTestHeatmap.R
100755 → 100644
Empty file.
Empty file modified R/VariantQuantileThresholding_Combined.R
100755 → 100644
Empty file.
Empty file modified R/VariantSelection_Group.R
100755 → 100644
Empty file.
12 changes: 6 additions & 6 deletions R/VariantSelection_Quantile.R
100755 → 100644
Original file line number Diff line number Diff line change
Expand Up @@ -22,24 +22,24 @@ VariantSelection_Quantile <- function(SE, min_coverage = 2, quantiles = c(0.1, 0
SummarizedExperiment::assays(SE)[["fraction"]][nocall_check] <- NA
SummarizedExperiment::assays(SE)[["coverage"]][nocall_check] <- NA
}

if(verbose) print("Get the mean allele frequency and coverage.")
mean_af <- Matrix::rowMeans(SummarizedExperiment::assays(SE)[["fraction"]], na.rm = TRUE)
mean_cov <- Matrix::rowMeans(SummarizedExperiment::assays(SE)[["coverage"]], na.rm = TRUE)

if(verbose) print("Get the quantiles of the VAFs of each variant.")
quantiles <- lapply(quantiles, function(x) apply(SummarizedExperiment::assays(SE)[["fraction"]], 1, quantile, x, na.rm = TRUE))

# Apply min_quality filtering
if(!is.null(min_quality)){
vars <- data.frame(Mean_AF = mean_af, Mean_Cov = mean_cov, VariantQuality = SummarizedExperiment::rowData(SE)$VariantQuality, Quantile1 = quantiles[[1]], Quantile2 = quantiles[[2]])
vars <- subset(vars, !is.na(vars$VariantQuality) & vars$VariantQuality > min_quality)
} else{
vars <- data.frame(Mean_AF = mean_af, Mean_Cov = mean_cov, Quantile1 = quantiles[[1]], Quantile2 = quantiles[[2]])
}

if(verbose) print("Thresholding using the quantile approach.")
vois <- subset(vars, vars$Mean_AF > mean_allele_frequency & vars$Mean_Cov > min_coverage & vars$Quantile1 < thresholds[1] & vars$Quantile2 > thresholds[2])
vois <- subset(vars, vars$Mean_AF > mean_allele_frequency & vars$Mean_Cov > min_coverage & vars$Quantile1 <= thresholds[1] & vars$Quantile2 >= thresholds[2])

return(rownames(vois))
}
Empty file modified R/VariantSelection_TopCells.R
100755 → 100644
Empty file.
21 changes: 16 additions & 5 deletions R/VariantSelection_VMR.R
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,20 @@ VariantSelection_VMR <- function(SE, stabilize_variance = TRUE, low_coverage_thr
fraction[is.na(fraction)] <- 0
mean_af <- Matrix::rowSums(SummarizedExperiment::assays(SE)[["alts"]]) / Matrix::rowSums(SummarizedExperiment::assays(SE)[["coverage"]])
mean_af[is.na(mean_af)] <- 0


if(verbose) print("We calculate the concordance.")
reads_forward <- SummarizedExperiment::assays(SE)[["forward"]]
reads_reverse <- SummarizedExperiment::assays(SE)[["reverse"]]
dt <- merge(data.table::data.table(summary(reads_forward)),
data.table::data.table(summary(reads_reverse)),
by.x = c("i", "j"), by.y = c("i", "j"),
all = TRUE)
dt <- dt[dt[,x.x] > 0 | dt[,x.y] > 0,]
dt <- data.table::data.table(variant = rownames(SE)[dt[[1]]],
cell_id = dt[[2]],
fw = dt[[3]], rev = dt[[4]])
concordance <- dt[, .(cor = suppressWarnings(stats::cor(c(fw), c(rev), method = "pearson", use = "pairwise.complete"))), by = list(variant)]

if(stabilize_variance){
if(verbose) print("We stabilize the variance.")
coverage <- SummarizedExperiment::assays(SE)[["coverage"]]
Expand All @@ -42,19 +55,17 @@ VariantSelection_VMR <- function(SE, stabilize_variance = TRUE, low_coverage_thr

if(verbose) print(paste0("We check if a cells has more than ", minimum_fw_rev_reads, " reads."))
detected <- (SummarizedExperiment::assays(SE)[["forward"]] >= minimum_fw_rev_reads) + (assays(SE)[["reverse"]] >= minimum_fw_rev_reads)

voi <- data.frame(
variant = rownames(detected),
vmr = variants_variance / (mean_af + 0.00000000001),
vmr_log10 = log10(variants_variance / (mean_af + 0.00000000001)),
mean = round(mean_af, 7),
variance = round(variants_variance, 7),
cells_detected = Matrix::rowSums(detected == 2), # We only select cells that have enough reads for both directions.
strand_correlation = SummarizedExperiment::rowData(SE)$Concordance,
strand_correlation = concordance$cor, #SummarizedExperiment::rowData(SE)$Concordance,
mean_coverage = Matrix::rowMeans(SummarizedExperiment::assays(SE)[["coverage"]]),
stringsAsFactors = FALSE, row.names = rownames(detected)
)
return(voi)
}


9 changes: 7 additions & 2 deletions R/VariantWiseCorrelation.R
100755 → 100644
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,9 @@
#'@param n_cores Number of cores you want to use. Numeric.
#'@param p_value_adjustment Method for P value adjustment. See p.adjust for details.
#'@param verbose Should the function be verbose? Default = TRUE
#'@param value_type Are we using consensus or other information?
#'@export
VariantWiseCorrelation <- function(variants_list, n_cores = 1, p_value_adjustment = "fdr", verbose = TRUE){
VariantWiseCorrelation <- function(variants_list, n_cores = 1, p_value_adjustment = "fdr", value_type = "consensus", verbose = TRUE){
# We correlate the somatic variants with each other and the MT variants.
# Since we have tens of thousands of MT variants, we do not correlate them with each other.
variants <- names(variants_list)
Expand All @@ -22,7 +23,7 @@ VariantWiseCorrelation <- function(variants_list, n_cores = 1, p_value_adjustmen
variants_values_use <- variants_list[[variant_use]]
variants_list_use <- variants_list[names(variants_list) != variant_use]
all_variants <- names(variants_list_use)
results <- parallel::mclapply(X = all_variants, CalculateCorrelationPValue, variant_values = variants_values_use, all_variants_list = variants_list_use, mc.cores = n_cores)
results <- parallel::mclapply(X = all_variants, CalculateCorrelationPValue, variant_values = variants_values_use, all_variants_list = variants_list_use, mc.cores = n_cores, value_type = value_type)
results <- do.call("rbind", results)
results <- data.frame(Variant1 = variant_use, Variant2 = all_variants, P = results[,1], Corr = results[,2],
Cells_1_Alt = results[,3], Cells_1_Ref = results[,4], Cells_2_Alt = results[,5], Cells_2_Ref = results[,6])
Expand All @@ -35,6 +36,10 @@ VariantWiseCorrelation <- function(variants_list, n_cores = 1, p_value_adjustmen
if(verbose) print("We remove the negative corrlated SNPs.")
results_total <- subset(results_total, Corr > 0)

if(verbose) print("Remove duplicative results.")
results_check <- apply(results_total[, c("Variant1", "Variant2")], 1, function(x) paste(sort(x), collapse = "_"))
results_total <- results_total[!duplicated(results_check), , drop = FALSE]

if(verbose) print(paste0("Adjusting P values using ", p_value_adjustment, "."))
results_total$P_adj <- stats::p.adjust(results_total$P, method = p_value_adjustment)
rownames(results_total) <- NULL
Expand Down
Empty file modified R/VariantWiseFisherTest.R
100755 → 100644
Empty file.
Empty file modified R/char_to_numeric.R
100755 → 100644
Empty file.
Empty file modified R/combine_NAMES.R
100755 → 100644
Empty file.
Empty file modified R/combine_SparseMatrix.R
100755 → 100644
Empty file.
Empty file modified R/computeAFMutMatrix.R
100755 → 100644
Empty file.
Empty file modified R/getAltMatrix.R
100755 → 100644
Empty file.
Empty file modified R/getMutMatrix.R
100755 → 100644
Empty file.
Empty file modified R/getReadMatrix.R
100755 → 100644
Empty file.
Empty file modified R/getRefMatrix.R
100755 → 100644
Empty file.
Empty file modified R/get_consensus.R
100755 → 100644
Empty file.
Empty file modified R/load_object.R
100755 → 100644
Empty file.
Empty file modified R/save_object.R
100755 → 100644
Empty file.
Empty file modified R/sdiv.R
100755 → 100644
Empty file.
2 changes: 1 addition & 1 deletion README.md
100755 → 100644
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ The mutation data was obtained from the Sanger Institute Catalogue Of Somatic Mu
```

# Current Features v0.3.6
# Current Features v0.3.8

- Loading data from VarTrix and MAEGATK.
- Transforming the data to be compatible for joint analysis.
Expand Down
2 changes: 1 addition & 1 deletion docs/404.html
100755 → 100644

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Empty file modified docs/ReadMe.html
100755 → 100644
Empty file.
Empty file modified docs/_config.yml
100755 → 100644
Empty file.
2 changes: 1 addition & 1 deletion docs/articles/BoneMarrow_SIGURD.html
100755 → 100644

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Empty file.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading

0 comments on commit 032dd81

Please sign in to comment.